A feature extraction technique named perceptual MVDR-based cepstral coefficients (PMCCs) was introduced into speaker recognition. PMCCs are extracted and modeled using Gaussian Mixture Models (GMMs) for speaker recognition. In order to compensate for speaker and channel variability effects, joint factor analysis (JFA) is used. The experiments are carried out on the core conditions of NIST 2008 speaker recognition evaluation data. The experimental results show that the systems based on PMCCs can achieve comparable performance to those based on the conventional MFCCs. Besides, the fusion of the two kinds of systems can make significant performance improvement compared to the MFCCs system alone, reducing equal error rate (EER) by the factor between 7.6% and 30.5% as well as minimum detect cost function (minDCF) by the factor between 3.2% and 21.2% on different test sets. The results indicate that PMCCs can be effectively applied in speaker recognition and they are complementary with MFCCs to some extent.
LIANGChunyan ZHANG Xiang YANG Lin ZHANG Jianping YAN Yonghong