Browsing by Author "Alzeber, Mogahed"
Publication
Comparative study between regular expression and google similarity index for instance based schema matching
(Gombak, Selangor: International Islamic University Malaysia, 2016)
Alzeber, Mogahed
Schema matching is considered one of the essential phases of database integration. The aim of the schema matching process is to identify the correlations between schemas, which later helps in the data integration process. The main concern during schema matching is how to support the merging decision by providing the correspondences between attributes despite syntactic and semantic heterogeneity in the data sources. There have been many attempts in the literature to utilize database instances to detect the correspondences between attributes during the schema matching process. Many instance-based schema matching approaches have been proposed with the aim of improving the accuracy of the matching process. We observed that no single technique managed to provide accurate matching for different types of data. In other words, some techniques treat numeric values as strings, which negatively influences the process of discovering matches and, in turn, the quality of the match results. Similarly, other techniques treat textual instances as numeric, which also degrades the quality of the match results. Thus, a comparative study between syntactic and semantic techniques is needed. The study should analyze these techniques in depth in order to determine the strengths and weaknesses of each. This thesis aims at developing two schema matching techniques, namely (i) regular expression and (ii) Google similarity, to identify the matches between attributes for numeric, alphabetic and mixed instances. Furthermore, it compares these techniques and evaluates their performance empirically.
Several analyses have been conducted on real and synthetic datasets to evaluate the performance of the schema matching techniques considered in this thesis with respect to Precision (P), Recall (R) and F-Measure.
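
The three measures named above have standard definitions, and a minimal sketch may help show how they could be computed for a schema matcher. It is not taken from the thesis. The function name `precision_recall_f_measure`, the attribute names, and the two sets of attribute pairs are all hypothetical, and the F-Measure is assumed to be the balanced harmonic mean of Precision and Recall.

```python
def precision_recall_f_measure(found, reference):
    """Score a set of proposed attribute matches against a reference set.

    Both arguments are iterables of (source_attribute, target_attribute)
    pairs. The reference set is assumed to hold the correct matches.
    """
    found, reference = set(found), set(reference)
    true_positives = len(found & reference)

    # Precision: share of proposed matches that are correct.
    precision = true_positives / len(found) if found else 0.0
    # Recall: share of correct matches that were proposed.
    recall = true_positives / len(reference) if reference else 0.0

    # F-Measure: harmonic mean of precision and recall (assumed balanced).
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example data, invented for illustration only.
reference = {
    ("emp_id", "staff_no"),
    ("name", "full_name"),
    ("dob", "birth_date"),
    ("phone", "tel"),
}
found = {
    ("emp_id", "staff_no"),
    ("name", "full_name"),
    ("dob", "tel"),  # an incorrect match proposed by the matcher
}

p, r, f = precision_recall_f_measure(found, reference)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

In this invented case, two of the three proposed matches are correct and two of the four reference matches are found, so Precision is 2/3 and Recall is 1/2.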