Abstract:
Automatic recognition systems are commonly used in speech processing to classify
observed utterances by speaker identity, dialect, and language. Much research has been
devoted to recognizing the speech, dialects, and languages of different regions throughout
the world, but to our knowledge little work has addressed the dialects of Bangladesh. These
dialects differ considerably from one another. In this paper, we propose a method for
identifying Bangladeshi dialects that uses Mel-Frequency Cepstral Coefficients
(MFCC), together with their deltas and delta-deltas, as the main features and Gaussian
Mixture Models (GMM) to model the characteristics of each dialect. Specifically, we extract
the MFCC, delta, and delta-delta features from the speech signal and merge them to
form a feature vector for a specific dialect. The GMM is trained using the iterative
Expectation Maximization (EM) algorithm, with the feature vectors serving as input. The
scheme is tested on 5 databases of 30 speech samples each. The speech samples contain the
dialects of the Borishal, Noakhali, Sylhet, Chittagong and Chapai Nobabgonj regions of
Bangladesh. Experiments show that GMM adaptation gives comparably good performance.
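
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the MFCC matrices are already available (synthetic random matrices stand in for them here), uses a regression-based delta with a window of 2 frames, fits diagonal-covariance GMMs with a plain EM loop, and classifies by the highest average log-likelihood. All function names, the dialect labels "dialect_a" and "dialect_b", and the parameter values are hypothetical.

```python
import numpy as np

def deltas(feats, width=2):
    """Regression-based deltas over time for a (frames, coeffs) matrix."""
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    t = feats.shape[0]
    out = np.zeros_like(feats, dtype=float)
    for n in range(1, width + 1):
        out += n * (padded[width + n:width + n + t] - padded[width - n:width - n + t])
    return out / denom

def stack_features(mfcc):
    """Merge MFCC, deltas and delta-deltas into one feature vector per frame."""
    d1 = deltas(mfcc)
    d2 = deltas(d1)
    return np.hstack([mfcc, d1, d2])

def log_gauss_diag(x, means, variances):
    """Log density of each frame under each diagonal Gaussian, shape (N, K)."""
    diff = x[:, None, :] - means[None, :, :]
    log_det = np.sum(np.log(2 * np.pi * variances), axis=1)
    return -0.5 * (log_det[None, :] + np.sum(diff ** 2 / variances[None, :, :], axis=2))

def fit_gmm(x, k=2, iters=30, seed=0):
    """Fit a diagonal-covariance GMM with the EM algorithm."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    means = x[rng.choice(n, size=k, replace=False)]
    variances = np.tile(x.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each frame
        log_p = log_gauss_diag(x, means, variances) + np.log(weights)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means and variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ x) / nk[:, None]
        variances = np.maximum((resp.T @ x ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def avg_loglik(x, gmm):
    """Average per-frame log-likelihood of x under a fitted GMM."""
    weights, means, variances = gmm
    log_p = log_gauss_diag(x, means, variances) + np.log(weights)
    return float(np.mean(np.logaddexp.reduce(log_p, axis=1)))

def classify(x, models):
    """Return the dialect whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda name: avg_loglik(x, models[name]))

# Synthetic stand-ins for 13-coefficient MFCC matrices (assumption, not real speech).
rng = np.random.default_rng(42)
train = {
    "dialect_a": rng.normal(0.0, 1.0, size=(200, 13)),
    "dialect_b": rng.normal(3.0, 1.0, size=(200, 13)),
}
models = {name: fit_gmm(stack_features(m)) for name, m in train.items()}

test_a = stack_features(rng.normal(0.0, 1.0, size=(100, 13)))
test_b = stack_features(rng.normal(3.0, 1.0, size=(100, 13)))
print(classify(test_a, models), classify(test_b, models))
```

With real recordings, the synthetic matrices would be replaced by MFCCs computed from each speech sample, and the number of mixture components would need tuning per dataset.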
Description:
This thesis was submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering of East West University, Dhaka, Bangladesh.