Publication:
Evaluating the retrieval performance model of Addaall stemmer for arabic news of al-Jazeera

dc.contributor.authorSaoudi, Ouahibaen_US
dc.date.accessioned2024-10-08T07:38:58Z
dc.date.available2024-10-08T07:38:58Z
dc.date.issued2012
dc.description.abstractAlthough stemming improves the effectiveness of information retrieval, it has limitations and shortcomings. Chief among them, it can reduce unrelated words to the same stem and can fail to reduce related words to a common stem. In addition, most stemmers, such as light stemmers, are heuristic and fall short of a full understanding of the morphology of the language. This lays the ground for further research on search engines that use stemming, such as Addaall, in order to develop the most effective one for Arabic IR. This research investigated the retrieval performance of the Addaall stemmer. Addaall is a web-based Arabic search engine that uses a morphological analyzer and generator to construct different indices based on both the root and the stem of a word. The research evaluated Addaall's prefix and suffix removal search (PSRS) and root search (RS) for retrieving Arabic news documents in a semi-laboratory setting. The theoretical assumption is that semantic linguistic search can improve recall and precision for both root and stem searches in a semi-laboratory setting. The research compared PSRS, RS and exact search (ES) and explored the main obstacles to indexing and retrieving Arabic information using different index strategies, stem and root. Queries were constructed from Al-Jazeera news from 2002-2007 and submitted to the two search engines, Addaall and the Al-Jazeera search engine. Retrieved documents were judged relevant if they contained the correct and meaningful search term with no ambiguity. Stratified and random sampling were carried out in order to run statistical significance tests. The findings demonstrated that the PSRS precision rate was significantly higher than those of ES and RS. The RS recall rate was significantly higher than the ES and PSRS recall rates.
Additionally, this research indicated that the Addaall stemmer improved both recall and precision compared with non-stemming. This is due to Addaall's use of linguistic semantic search in its morphological analysis at different levels. On the other hand, the causes of failure were related to root, stemming, Arabic diacritics and indexing. Hence, the significance of this research lies in underscoring the need for continued research on these factors in order to propose proper strategies and solutions. Overall, the findings indicated that, for a web collection in a semi-laboratory environment, removing prefixes and suffixes without trying to remove infixes or find the root enhanced recall and precision values.en_US
dc.description.callnumbert ZA 4230 S239E 2012en_US
dc.description.degreelevelDoctoral
dc.description.identifierThesis : Evaluating the retrieval performance model of Addaall stemmer for arabic news of al-Jazeera / by Ouahiba Saoudien_US
dc.description.identityt00011273704OuahibaSaoudien_US
dc.description.kulliyahKulliyyah of Information and Communication Technologyen_US
dc.description.notesThesis (Ph.D)--International Islamic University Malaysia, 2012.en_US
dc.description.physicaldescriptionxvi, 271 leaves :illustrations ;30cm.en_US
dc.description.programmeDoctor of Philosophy in Library and Information Scienceen_US
dc.identifier.urihttps://studentrepo.iium.edu.my/handle/123456789/9412
dc.identifier.urlhttps://lib.iium.edu.my/mom/services/mom/document/getFile/8gKVkVRVpFkfxYxiRb6hIZstQAOnEoJF20180806121824003
dc.language.isoenen_US
dc.publisherKuala Lumpur :International Islamic University Malaysia,2012en_US
dc.rightsCopyright International Islamic University Malaysia
dc.subject.lcshWeb search enginesen_US
dc.subject.lcshInternet searchingen_US
dc.subject.lcshWorld Wide Web -- Subject accessen_US
dc.subject.lcshInformation retrievalen_US
dc.titleEvaluating the retrieval performance model of Addaall stemmer for arabic news of al-Jazeeraen_US
dc.typeDoctoral Thesisen_US
dspace.entity.typePublication

Files

Original bundle

Name:
t00011273704OuahibaSaoudi_SEC_24.pdf
Size:
753.68 KB
Format:
Adobe Portable Document Format
Description:
24 pages file
Name:
t00011273704OuahibaSaoudi_SEC.pdf
Size:
12.65 MB
Format:
Adobe Portable Document Format
Description:
Full text secured file
