DETERMINATION OF SPEAKER AGE AND GENDER WITH AN SVM CLASSIFIER BASED ON GMM SUPERVECTORS [KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ]

Year 2016, Volume: 31 Issue: 3, 0 - 0, 06.09.2016
https://doi.org/10.17341/gummfd.71595

Abstract

In this study, the SVM approach based on GMM supervectors, which is widely used in speaker verification systems in particular, is adapted to the problem of classifying speakers by age and/or gender. Age and gender models built with different numbers of GMM components are also tested on utterances of different lengths, in order to examine how speech duration and the number of GMM components affect accuracy. To this end, the non-speech portions of each utterance are discarded using an energy-based criterion, and Mel-Frequency Cepstral Coefficients (MFCC) extracted from the remaining voiced portions are used to run tests in three categories. In these tests, the highest classification accuracies were obtained when 16 s utterances were modeled with 64-component GMMs. The measured rates were 92.42% in the gender category (child, female, male), 60.1% in the age category (child, young, adult, senior), and 60.02% in the age-gender category.
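The GMM supervector idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common recipe from the speaker verification literature: the means of a background GMM are MAP-adapted to one utterance and stacked into a single vector, which would then be passed to an SVM. The abstract does not confirm that the authors used exactly this recipe. The function names, the relevance factor of 16, and the toy two-component model below are all illustrative assumptions, and SVM training and any kernel-specific scaling of the means are omitted.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given one frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapted_supervector(frames, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted component means into one supervector.

    relevance is an assumed relevance factor, not a value from the paper.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                      # soft frame counts per component
    sums = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums
    for x in frames:
        for k, p in enumerate(posteriors(x, weights, means, variances)):
            counts[k] += p
            for d in range(dim):
                sums[k][d] += p * x[d]
    supervector = []
    for k in range(n_comp):
        # alpha moves from 0 (keep the prior mean) toward 1 (trust the data)
        alpha = counts[k] / (counts[k] + relevance)
        for d in range(dim):
            data_mean = sums[k][d] / counts[k] if counts[k] > 0 else means[k][d]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[k][d])
    return supervector


# Toy background model: 2 components, 2-dimensional features (hypothetical values).
toy_weights = [0.5, 0.5]
toy_means = [[0.0, 0.0], [10.0, 10.0]]
toy_variances = [[1.0, 1.0], [1.0, 1.0]]
toy_frames = [[1.0, 1.0]] * 16  # stand-in for the MFCC frames of one utterance

sv = map_adapted_supervector(toy_frames, toy_weights, toy_means, toy_variances)
print(len(sv))  # 4
```

The supervector length is the number of components times the feature dimension, so with a 64-component model it grows to 64 times the MFCC dimension per utterance. Components that see little data stay close to the background means, which keeps short utterances from producing erratic vectors.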

References

  • Neti, C., and S. Roukos. "Phone-context specific gender-dependent acoustic-models for continuous speech recognition." Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on. IEEE, 1997.
  • Schuller, B., Steidl, S., Batliner, A., Burkhardt, F., Devillers, L., Müller, C. A., & Narayanan, S. S. (2010, September). The INTERSPEECH 2010 paralinguistic challenge. In INTERSPEECH (pp. 2794-2797).
  • Mysak, Edward D. "Pitch and duration characteristics of older males." Journal of Speech & Hearing Research (1959).
  • Metze, Florian, et al. "Comparison of four approaches to age and gender recognition for telephone applications." Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on. Vol. 4. IEEE, 2007.
  • Li, Ming, Kyu J. Han, and Shrikanth Narayanan. "Automatic speaker age and gender recognition using acoustic and prosodic level information fusion." Computer Speech & Language 27.1 (2013): 151-167.
  • van Heerden, Charl, et al. "Combining regression and classification methods for improving automatic speaker age recognition." Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.
  • Meinedo, Hugo, and Isabel Trancoso. "Age and gender classification using fusion of acoustic and prosodic features." INTERSPEECH. 2010.
  • Bocklet, T., Stemmer, G., Zeissler, V., & Nöth, E. (2010). Age and gender recognition based on multiple systems - early vs. late fusion. In INTERSPEECH (pp. 2830-2833).
  • J. R. Deller, J. H. L. Hansen, J. G. Proakis, Discrete-Time Processing of Speech Signals, IEEE Press, Piscataway (N.J.), 2000.
  • Davis, Steven, and Paul Mermelstein. "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences." Acoustics, Speech and Signal Processing, IEEE Transactions on 28.4 (1980): 357-366.
  • S. Furui, Digital Speech Processing, Synthesis and Recognition, New York, Marcel Dekker, 2001.
  • Reynolds, Douglas A., and Richard C. Rose. "Robust text-independent speaker identification using Gaussian mixture speaker models." Speech and Audio Processing, IEEE Transactions on 3.1 (1995): 72-83.
  • McLachlan, Geoffrey, and David Peel. Finite mixture models. John Wiley & Sons, 2004.
  • Dempster, Arthur P., Nan M. Laird, and Donald B. Rubin. "Maximum likelihood from incomplete data via the EM algorithm." Journal of the Royal Statistical Society. Series B (Methodological) (1977): 1-38.
  • Reynolds, Douglas A., Thomas F. Quatieri, and Robert B. Dunn. "Speaker verification using adapted Gaussian mixture models." Digital signal processing 10.1 (2000): 19-41.
  • Ferras, Marc, et al. "Comparison of speaker adaptation methods as feature extraction for SVM-based speaker recognition." Audio, Speech, and Language Processing, IEEE Transactions on 18.6 (2010): 1366-1378.
  • Campbell, W. M., Sturim, D. E., and Reynolds, D. A., "Support Vector Machines using GMM Supervectors for Speaker Verification", IEEE Signal Processing Letters, 13(5):308–311, May 2006.
There are 18 citations in total.

Details

Journal Section Articles
Authors

Ergün Yücesoy

Vasif V. Nabiyev

Publication Date September 6, 2016
Submission Date November 26, 2014
Published in Issue Year 2016 Volume: 31 Issue: 3

Cite

APA Yücesoy, E., & Nabiyev, V. V. (2016). KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, 31(3). https://doi.org/10.17341/gummfd.71595
AMA Yücesoy E, Nabiyev VV. KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. GUMMFD. September 2016;31(3). doi:10.17341/gummfd.71595
Chicago Yücesoy, Ergün, and Vasif V. Nabiyev. “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 31, no. 3 (September 2016). https://doi.org/10.17341/gummfd.71595.
EndNote Yücesoy E, Nabiyev VV (September 1, 2016) KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 31 3
IEEE E. Yücesoy and V. V. Nabiyev, “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”, GUMMFD, vol. 31, no. 3, 2016, doi: 10.17341/gummfd.71595.
ISNAD Yücesoy, Ergün - Nabiyev, Vasif V. “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi 31/3 (September 2016). https://doi.org/10.17341/gummfd.71595.
JAMA Yücesoy E, Nabiyev VV. KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. GUMMFD. 2016;31. doi:10.17341/gummfd.71595.
MLA Yücesoy, Ergün and Vasif V. Nabiyev. “KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 31, no. 3, 2016, doi:10.17341/gummfd.71595.
Vancouver Yücesoy E, Nabiyev VV. KONUŞMACI YAŞ VE CİNSİYETİNİN GKM SÜPERVEKTÖRLERİNE DAYALI BİR DVM SINIFLANDIRICISI İLE BELİRLENMESİ. GUMMFD. 2016;31(3).

Cited By

Speech-to-Gender Recognition Based on Machine Learning Algorithms
International Journal of Applied Mathematics Electronics and Computers
https://doi.org/10.18100/ijamec.1221455