Publication:
Speaker identification based on hybrid feature extraction techniques

Date

2020

Publisher

Kuala Lumpur : Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, 2020

Subject LCSH

Speech processing systems
System identification
Wavelets (Mathematics)

Call Number

t TK7882 S65 A165S 2020

Abstract

Speech is a natural form of communication between humans, and speech processing is one of the most exciting areas of signal processing. Speech contains many features that can be used to determine the gender and identity of a speaker, and the human voice is considered an important biometric characteristic for person identification. The proposed speaker identification system (SIS) consists of four phases: a pre-processing phase (resizing each sample to 40000 samples and normalizing it so that the sound volume is brought to a standard level), a feature extraction phase (extracting a set of fundamental voice features that can represent or identify the entire speech signal), a feature selection phase (selecting the features that best describe the speaker, because handling hundreds of features increases the recognition workload) and a recognition phase (a Backpropagation (BP) neural network in this research). This work studies the effects of voice features extracted from various levels of the discrete wavelet transform (DWT) and from the concatenation of DWT and curvelet transform features (DWT+Curvelet hereinafter). The effect on speaker identification of reducing the number of features via principal component analysis (PCA) is also investigated, with the BP neural network used as the classifier. The classifier is trained with a different set of features extracted from each of three DWT levels, one level at a time, and its recognition capabilities across the levels are compared to determine the best level. This research also explores any positive or negative effects of DWT+Curvelet on the classification capability of the proposed system.
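The pre-processing and multi-level DWT steps described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the abstract does not name the wavelet family, the padding strategy or the normalization rule, so the Haar wavelet, zero-padding and peak normalization used here are assumptions, and all function names are hypothetical.

```python
import math

TARGET_LEN = 40000  # sample count stated in the abstract


def resize(signal, target_len=TARGET_LEN):
    """Truncate or zero-pad to target_len samples (zero-padding is an assumption)."""
    clipped = list(signal[:target_len])
    return clipped + [0.0] * (target_len - len(clipped))


def peak_normalize(signal):
    """Scale so the largest absolute sample is 1.0 (assumed normalization rule)."""
    peak = max((abs(s) for s in signal), default=0.0)
    if peak == 0.0:
        return list(signal)
    return [s / peak for s in signal]


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail).

    A trailing unpaired sample (odd-length input) is dropped.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def dwt_levels(signal, levels=3):
    """Return the approximation coefficients at levels 1..levels."""
    out = []
    current = signal
    for _ in range(levels):
        current, _detail = haar_dwt_step(current)
        out.append(current)
    return out


if __name__ == "__main__":
    # Synthetic 440 Hz tone at an assumed 8 kHz sampling rate, 3 seconds long.
    tone = [0.3 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(24000)]
    prepared = peak_normalize(resize(tone))
    for level, coeffs in enumerate(dwt_levels(prepared), start=1):
        print(level, len(coeffs))
```

Each level halves the coefficient count, so a 40000-sample input yields 20000, 10000 and 5000 approximation coefficients at levels 1, 2 and 3. That reduction is one reason the recognition workload differs between levels.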
In addition, this work investigates the effects of reducing the number of features via PCA with DWT and with DWT+Curvelet. Three different datasets were used to train and test the feed-forward Backpropagation (BP) network. In this approach, introducing PCA with BP networks improved the accuracy and proved to be an effective method for speaker identification, as it keeps the effective information and reduces the redundancy of the characteristic parameters. Four experiments were performed using the three datasets. Experiment 1: only the DWT features, extracted from each DWT level independently, are used to train and test the neural network. Experiment 2: the features extracted from each level of DWT+Curvelet are used to train and test the neural network. Experiment 3: the DWT features, after applying PCA, are used to train and test the neural network. Experiment 4: the DWT+Curvelet features, after applying PCA, are used to train and test the neural network. The practical results showed that, when applying DWT+Curvelet, the accuracy improved at levels 1 and 2 with database 1, by approximately 5% and 4% respectively, and at all three levels with databases 2 and 3, by approximately 11%, 4% and 2% for database 2 and 9%, 11% and 5% for database 3. The system was trained and tested using cross-validation.
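The PCA reduction applied in Experiments 3 and 4 can be illustrated with a short NumPy sketch. The abstract does not state how many components were retained or how the feature matrix was laid out, so the matrix shape, the component count and the function names below are hypothetical; the feature values are random stand-ins, not data from the thesis.

```python
import numpy as np


def pca_fit(features, n_components):
    """Fit PCA on an (n_samples, n_features) matrix; returns (mean, components)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _u, _s, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project features onto the retained principal directions."""
    return (features - mean) @ components.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-in for a feature matrix: 60 utterances x 200 coefficients.
    x = rng.normal(size=(60, 200))
    mean, comps = pca_fit(x, n_components=10)
    reduced = pca_transform(x, mean, comps)
    print(reduced.shape)  # (60, 10)
```

When PCA is combined with cross-validation, the projection is normally fitted on the training folds only and then applied to the held-out fold, so that no information from the test utterances reaches the classifier.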
