Publication: Online feature selection based on input significance analysis (ISA) for evolving connectionist systems (ECos)
Abstract
In today's world, data is processed continuously, offline or online, and accumulates every day, which makes it difficult for existing data processing tasks such as classification to keep up. The more data there is, the more processing time it requires and the greater the risk of overfitting, which conflicts with today's demand for fast and accurate results. Many researchers in this area focus on applying Feature Selection (FS) techniques that reduce the number of features. However, based on the literature reviewed, none combines FS with Input Significance Analysis (ISA) techniques, which can give meaning to each feature in the dataset before it is processed by the classifier. Additionally, ISA can offer some insight into the “black box” nature of classifiers, which hides the details of the classification process and of how results are derived, and which can later raise doubts and questions about how the classification results were produced. The methodology of this research comprises six groups of experiments, or stages. In the first three stages, feature ranking is performed as part of the ISA implementation. In the last three stages, feature selection is performed as part of the FS implementation. The preliminary results, obtained from the first three stages, showed that the error rate decreases when the ranked dataset is used. The final results, obtained from the last three stages, showed that the ranked dataset with feature selection produces improved results compared with the original, complete dataset. In summary, once the original, complete dataset has been interpreted by ISA, and FS has reduced the number of features according to the weights obtained and ordered by ISA, training becomes faster, the network becomes smaller, and the results become more accurate.
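The two-phase idea described in the abstract (rank features by significance, then keep only the highest-ranked ones) can be illustrated with a minimal sketch. The abstract does not specify which ISA technique produces the weights, so the weights below are hypothetical values, and the function names `rank_features` and `select_top_k` are illustrative only, not taken from the publication.

```python
from typing import List, Sequence


def rank_features(weights: Sequence[float]) -> List[int]:
    """Return feature indices ordered from most to least significant.

    Significance is taken as the absolute value of each weight
    (an assumption; the publication's ISA method may differ).
    """
    return sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)


def select_top_k(
    rows: Sequence[Sequence[float]], ranking: Sequence[int], k: int
) -> List[List[float]]:
    """Keep only the k highest-ranked features of every row, in ranked order."""
    keep = list(ranking[:k])
    return [[row[i] for i in keep] for row in rows]


# Hypothetical significance weights for a four-feature dataset.
weights = [0.05, 0.80, 0.30, 0.60]
# Two hypothetical samples with four features each.
rows = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]

ranking = rank_features(weights)            # ranking phase (ISA-style)
reduced = select_top_k(rows, ranking, k=2)  # selection phase (FS-style)
print(ranking)
print(reduced)
```

The reduced rows would then be passed to the classifier in place of the complete dataset; the choice of `k` is left open here, since the abstract does not state how many features were retained.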