DETERMINATION OF SPEAKER AGE AND GENDER WITH AN SVM CLASSIFIER BASED ON GMM SUPERVECTORS

Ergün Yücesoy; Vasif V. Nabiyev

doi:10.17341/gummfd.71595

Year 2016, Volume: 31 Issue: 3, 0 - 0, 06.09.2016

Cited By: 3

Abstract

In this study, the SVM approach based on GMM supervectors, which is widely used particularly in speaker verification systems, is adapted to the problem of classifying speakers by age and/or gender. In addition, age and gender models built with different numbers of GMM components are tested on utterances of different lengths to investigate how utterance duration and the number of GMM components affect accuracy. To this end, the silent portions of the utterances are discarded using an energy-based criterion, and Mel-Frequency Cepstral Coefficients (MFCC) extracted from the remaining non-silent portions are used to run tests in three categories. In these tests, the highest classification accuracies are obtained when 16-second utterances are modeled with 64-component GMMs. These accuracies are 92.42% in the gender category (child, female, male), 60.1% in the age category (child, young, adult, senior), and 60.02% in the age-gender category.
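The GMM-supervector SVM approach summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy, replaces real MFCC frames with synthetic Gaussian data (the hypothetical `fake_utterance` helper), uses 8 components instead of 64 to keep it fast, and picks a relevance factor of 16 and a weight/variance scaling of the adapted means as common but assumed choices. Silence removal and MFCC extraction are left out.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N_DIMS, N_COMPONENTS = 13, 8  # illustrative sizes, not the paper's settings


def fake_utterance(shift, n_frames=300):
    """Hypothetical stand-in for the MFCC frames of one utterance."""
    return rng.normal(loc=shift, scale=1.0, size=(n_frames, N_DIMS))


def map_supervector(ubm, feats, relevance=16.0):
    """MAP-adapt the background model's means to one utterance and stack them."""
    post = ubm.predict_proba(feats)                 # (frames, components)
    counts = post.sum(axis=0)                       # soft frame count per component
    data_mean = (post.T @ feats) / np.maximum(counts, 1e-10)[:, None]
    alpha = (counts / (counts + relevance))[:, None]
    adapted = alpha * data_mean + (1.0 - alpha) * ubm.means_
    # Assumed scaling: sqrt(weight) * mean / std, so a linear kernel can be used.
    scaled = np.sqrt(ubm.weights_)[:, None] * adapted / np.sqrt(ubm.covariances_)
    return scaled.ravel()


# Two synthetic classes that differ only by a mean shift of the features.
labels = np.array([0, 1] * 20)
utterances = [fake_utterance(0.5 * y) for y in labels]

# Background GMM trained on the pooled frames of the training utterances.
ubm = GaussianMixture(n_components=N_COMPONENTS, covariance_type="diag",
                      random_state=0)
ubm.fit(np.vstack(utterances[:30]))

# One supervector per utterance, then a linear SVM on the supervectors.
X = np.array([map_supervector(ubm, u) for u in utterances])
clf = SVC(kernel="linear").fit(X[:30], labels[:30])
accuracy = clf.score(X[30:], labels[30:])
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Each utterance becomes a single fixed-length vector of `N_COMPONENTS * N_DIMS` values regardless of its duration, which is what lets an SVM compare utterances of different lengths. The score on this toy data says nothing about real speech; it only confirms that the pipeline runs end to end.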

Keywords

Age and gender recognition, GMM (Gaussian Mixture Model), GMM supervectors, SVM (Support Vector Machine)

References

Neti, C., and S. Roukos. "Phone-context specific gender-dependent acoustic-models for continuous speech recognition." Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on. IEEE, 1997.
Schuller, B., Steidl, S., Batliner, A., Burkhardt, F., Devillers, L., Müller, C. A., & Narayanan, S. S. (2010, September). The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH (pp. 2794-2797).
Mysak, Edward D. "Pitch and duration characteristics of older males." Journal of Speech & Hearing Research (1959).
Metze, Florian, et al. "Comparison of four approaches to age and gender recognition for telephone applications." Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on. Vol. 4. IEEE, 2007.
Li, Ming, Kyu J. Han, and Shrikanth Narayanan. "Automatic speaker age and gender recognition using acoustic and prosodic level information fusion." Computer Speech & Language 27.1 (2013): 151-167.
van Heerden, Charl, et al. "Combining regression and classification methods for improving automatic speaker age recognition." Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.
Meinedo, Hugo, and Isabel Trancoso. "Age and gender classification using fusion of acoustic and prosodic features." INTERSPEECH. 2010.
Bocklet, T., Stemmer, G., Zeissler, V., & Nöth, E. (2010). Age and gender recognition based on multiple systems-early vs. late fusion. In INTERSPEECH (pp. 2830-2833).
J. R. Deller, J. H. L. Hansen, J. G. Proakis, Discrete-Time Processing of Speech Signals, IEEE Press, Piscataway (N.J.), 2000.
Davis, Steven, and Paul Mermelstein. "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences." Acoustics, Speech and Signal Processing, IEEE Transactions on 28.4 (1980): 357-366.
S. Furui, Digital Speech Processing, Synthesis and Recognition, New York, Marcel Dekker, 2001.
Reynolds, Douglas A., and Richard C. Rose. "Robust text-independent speaker identification using Gaussian mixture speaker models." Speech and Audio Processing, IEEE Transactions on 3.1 (1995): 72-83.
McLachlan, Geoffrey, and David Peel. Finite mixture models. John Wiley & Sons, 2004.
Dempster, Arthur P., Nan M. Laird, and Donald B. Rubin. "Maximum likelihood from incomplete data via the EM algorithm." Journal of the Royal Statistical Society. Series B (Methodological) (1977): 1-38.
Reynolds, Douglas A., Thomas F. Quatieri, and Robert B. Dunn. "Speaker verification using adapted Gaussian mixture models." Digital signal processing 10.1 (2000): 19-41.
Ferras, Marc, et al. "Comparison of speaker adaptation methods as feature extraction for SVM-based speaker recognition." Audio, Speech, and Language Processing, IEEE Transactions on 18.6 (2010): 1366-1378.
Campbell, W. M., Sturim, D. E., and Reynolds, D. A., "Support Vector Machines using GMM Supervectors for Speaker Verification", IEEE Signal Processing Letters, 13(5):308–311, May 2006.

There are 18 citations in total.

Details

Journal Section	Articles
Authors	Ergün Yücesoy; Vasif V. Nabiyev
Publication Date	September 6, 2016
Submission Date	November 26, 2014
Published in Issue	Year 2016 Volume: 31 Issue: 3

Cite

APA	Yücesoy, E., & V. Nabiyev, V. (2016). KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, 31(3). https://doi.org/10.17341/gummfd.71595
AMA	Yücesoy E, V. Nabiyev V. KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. GUMMFD. September 2016;31(3). doi:10.17341/gummfd.71595
Chicago	Yücesoy, Ergün, and Vasif V. Nabiyev. “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 31, no. 3 (September 2016). https://doi.org/10.17341/gummfd.71595.
EndNote	Yücesoy E, V. Nabiyev V (September 1, 2016) KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 31 3
IEEE	E. Yücesoy and V. V. Nabiyev, “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”, GUMMFD, vol. 31, no. 3, 2016, doi: 10.17341/gummfd.71595.
ISNAD	Yücesoy, Ergün - V. Nabiyev, Vasif. “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 31/3 (September 2016). https://doi.org/10.17341/gummfd.71595.
JAMA	Yücesoy E, V. Nabiyev V. KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. GUMMFD. 2016;31. doi:10.17341/gummfd.71595.
MLA	Yücesoy, Ergün and Vasif V. Nabiyev. “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 31, no. 3, 2016, doi:10.17341/gummfd.71595.
Vancouver	Yücesoy E, V. Nabiyev V. KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. GUMMFD. 2016;31(3).

Journal of the Faculty of Engineering and Architecture of Gazi University

Cited By

Speech-to-Gender Recognition Based on Machine Learning Algorithms

International Journal of Applied Mathematics Electronics and Computers

https://doi.org/10.18100/ijamec.1221455

Konuşmacı Cinsiyetinin Tespitinde Değişik Normalizasyon Tekniklerinin Kıyaslanması

Mehmet Akif Ersoy Üniversitesi Uygulamalı Bilimler Dergisi

Serhat İLERİ

https://doi.org/10.31200/makuubd.410625

Spermiogram Görüntülerinden Hareket Belirleme Yöntemleri ile Aktif Sperm Sayısının Tahmini

Gazi Üniversitesi Mühendislik-Mimarlık Fakültesi Dergisi

Abdülkadir Gümüşçü

https://doi.org/10.17341/gazimmfd.460524