Research Article


Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm

Year 2023, Volume: 6 Issue: 2, 191 - 198, 23.09.2023
https://doi.org/10.38016/jista.1250782

Abstract

Machine learning methods can generally be categorized as supervised, unsupervised, and reinforcement learning. The Q-learning algorithm, a reinforcement learning method, interacts with its environment, learns from it, and produces actions accordingly. In this study, eight different methods are proposed for adapting the value of the learning rate parameter of the Q-learning algorithm online, depending on the current situation. To test the performance of the proposed methods, the algorithms are applied to the Frozen Lake and Cart Pole systems, and the results are compared graphically and statistically. Examination of the results shows that Method 1 performs better on Frozen Lake, a discrete system, while Method 7 produces better results on the Cart Pole system, a continuous system.
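The abstract describes adapting the learning rate of Q-learning online; the paper's eight specific methods are not reproduced on this page. As a hypothetical stand-in, the sketch below uses a visit-count-based decay (alpha = 1/(1 + n(s, a)), a common adaptive scheme) to show where such self-adaptation enters the tabular Q-learning update:

```python
import numpy as np

# Minimal tabular Q-learning sketch with a self-adaptive learning rate.
# The paper's eight adaptation methods are not specified here; this uses
# a hypothetical visit-count decay, alpha = 1 / (1 + n(s, a)), so the
# step size shrinks as a state-action pair is visited more often.

n_states, n_actions = 16, 4              # Frozen Lake-sized table (4x4 grid)
gamma = 0.99                              # discount factor
Q = np.zeros((n_states, n_actions))       # action-value table
visits = np.zeros((n_states, n_actions))  # per-pair visit counts

def update(s, a, r, s_next):
    """One Q-learning update; returns the adapted learning rate used."""
    visits[s, a] += 1
    alpha = 1.0 / (1.0 + visits[s, a])        # self-adaptive step size
    td_target = r + gamma * Q[s_next].max()   # bootstrapped TD target
    Q[s, a] += alpha * (td_target - Q[s, a])  # temporal-difference update
    return alpha

# Repeated updates on the same transition use a shrinking step size.
a1 = update(0, 1, 1.0, 5)  # first visit:  alpha = 1/2
a2 = update(0, 1, 1.0, 5)  # second visit: alpha = 1/3
```

The same update loop applies to both benchmark systems; for the continuous Cart Pole state, the observation would first be discretized into table indices.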

References

  • Adigüzel, F., Yalçin, Y., 2018. Discrete-Time Backstepping Control for Cart-Pendulum System with Disturbance Attenuation via I&I Disturbance Estimation. in 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT).
  • Adıgüzel, F., Yalçin, Y., 2022. Backstepping Control for a Class of Underactuated Nonlinear Mechanical Systems with a Novel Coordinate Transformation in the Discrete-Time Setting. in Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering.
  • Akyurek, H.A., Bucak, İ.Ö., 2012. Zamansal-Fark, Uyarlanır Dinamik Programlama ve SARSA Etmenlerinin Tipik Arazi Aracı Problemi İçin Öğrenme Performansları. in Akıllı Sistemlerde Yenilikler ve Uygulamaları Sempozyumu. Trabzon.
  • Angiuli, A., Fouque, J.P., Laurière, M., 2022. Unified Reinforcement Q-Learning for Mean Field Game and Control Problems. Mathematics of Control, Signals, and Systems 34(2):217–71.
  • Barlow, H. B., 1989. Unsupervised Learning. Neural Computation 1(3).
  • Barto, A. G., Sutton, R.S., Anderson, C.W., 1983. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems. IEEE Transactions on Systems, Man, and Cybernetics 13(5):834–46.
  • Bayraj, E. A., Kırcı, P., Ensari, T., Seven, E., Dağtekin, M., 2022. Göğüs Kanseri Verileri Üzerinde Makine Öğrenmesi Yöntemlerinin Uygulanması. Journal of Intelligent Systems: Theory and Applications 5(1):35–41.
  • Bucak, I.Ö., Zohdy, M.A., 1999. Application of Reinforcement Learning Control to a Nonlinear Bouncing Cart. Pp. 1198–1202 in Proceedings of the American Control Conference. San Diego, California.
  • Candan, F., Emir, S., Doğan, M., Kumbasar, T., 2018. Takviyeli Q-Öğrenme Yöntemiyle Labirent Problemi Çözümü (Labyrinth Problem Solution with Reinforcement Q-Learning Method). in TOK2018 Otomatik Kontrol Ulusal Toplantısı.
  • Chen, T., Chen, Y., He, Z., Li, E., Zhang, C., Huang, Y., 2022. A Novel Marine Predators Algorithm with Adaptive Update Strategy. The Journal of Supercomputing 1–34.
  • Çimen, M.E., Garip, Z., Pala, M.A., Boz, A.F., Akgül, A., 2019. Modelling of a Chaotic System Motion in Video with Artificial Neural Networks. Chaos Theory and Applications 1(1).
  • Cimen, M.E., Yalçın, Y., 2022. A Novel Hybrid Firefly–Whale Optimization Algorithm and Its Application to Optimization of MPC Parameters, Soft Computing 26(4):1845–72.
  • Cimen, M.E., Boyraz, O.F., Yildiz, M.Z., Boz, A.F., 2021. A New Dorsal Hand Vein Authentication System Based on Fractal Dimension Box Counting Method, Optik 226.
  • Cunningham, P., Cord, M., Delany, S.J., 2008. Supervised Learning. Pp. 21–49 in Machine Learning Techniques for Multimedia: Case Studies on Organization and Retrieval.
  • Ekinci, E., 2022. Classification of Imbalanced Offensive Dataset–Sentence Generation for Minority Class with LSTM, Sakarya University Journal of Computer and Information Sciences 5(1):121–33.
  • Elallid, B. B., Benamar, N., Hafid, A. S., Rachidi, T., Mrani, N., 2022. A Comprehensive Survey on the Application of Deep and Reinforcement Learning Approaches in Autonomous Driving, Journal of King Saud University-Computer and Information Sciences.
  • Grefenstette, J. J., 1993. Genetic Algorithms and Machine Learning, in Proceedings of the sixth annual conference on Computational learning theory.
  • Jogunola, O., Adebisi, B., Ikpehai, A., Popoola, S. I., Gui, G., Gačanin, H., Ci, S., 2020. Consensus Algorithms and Deep Reinforcement Learning in Energy Market: A Review. IEEE Internet of Things Journal 8(6).
  • Meng, T. L., Khushi, M., 2019. Reinforcement Learning in Financial Markets, Data 4(3).
  • O’Neill, D., Levorato, M., Goldsmith, A., Mitra U., 2010. Residential Demand Response Using Reinforcement Learning, in 2010 First IEEE International Conference on Smart Grid Communications.
  • Omurca, S. İ., Ekinci, E., Sevim, S., Edinç, E. B., Eken, A., Sayar, S., 2022. A Document Image Classification System Fusing Deep and Machine Learning Models, Applied Intelligence 1–16.
  • Pala, M. A., Çimen, M. E., Boyraz, Ö. F., Yildiz, M. Z., Boz, A., 2019. Meme Kanserinin Teşhis Edilmesinde Karar Ağacı Ve KNN Algoritmalarının Karşılaştırmalı Başarım Analizi, Academic Perspective Procedia 2(3).
  • Pala, M.A., Cimen, M.E., Yıldız, M.Z., Cetinel, G., Avcıoglu, E., Alaca, Y., 2022. CNN-Based Approach for Overlapping Erythrocyte Counting and Cell Type Classification in Peripheral Blood Images. Chaos Theory and Applications 4(2).
  • Pala, M.A., Cimen, M.E., Yıldız, M.Z., Cetinel, G., Özkan, A.D., 2021. Holografik Görüntülerde Kenar Tabanlı Fraktal Özniteliklerin Hücre Canlılık Analizlerinde Başarısı. Journal of Smart Systems Research 2(2):89–94.
  • Peng, J., Williams, R.J., 1996. Incremental Multi-Step Q-Learning. Machine Learning 22:283–90.
  • Sarızeybek, A. T., Sevli, O., 2022. Makine Öğrenmesi Yöntemleri Ile Banka Müşterilerinin Kredi Alma Eğiliminin Karşılaştırmalı Analizi. Journal of Intelligent Systems: Theory and Applications 5(2):137–44.
  • Sathya, R., Abraham, A., 2013. Comparison of Supervised and Unsupervised Learning Algorithms for Pattern Classification. International Journal of Advanced Research in Artificial Intelligence (IJARAI).
  • Singh, B., Kumar, R., Singh, V. P., 2022. Reinforcement Learning in Robotic Applications: A Comprehensive Survey. Artificial Intelligence Review 1–46.
  • Smart, W.D., Kaelbling, L.P., 2000. Practical Reinforcement Learning in Continuous Spaces. in Proceedings of the Seventeenth International Conference on Machine Learning (ICML).
  • Toğaçar, M., Eşidir, K.A., Ergen, B., 2021. Yapay Zekâ Tabanlı Doğal Dil İşleme Yaklaşımını Kullanarak İnternet Ortamında Yayınlanmış Sahte Haberlerin Tespiti. Journal of Intelligent Systems: Theory and Applications 5(1):1–8.
  • Wang, H., Emmerich, M., Plaat, A., 2018. Monte Carlo Q-Learning for General Game Playing. ArXiv Preprint ArXiv:1802.05944.
  • Watkins, C.J.C.H., 1989. Learning from Delayed Rewards. Dissertation, King's College, Cambridge, UK.
  • Watkins, C.J.C.H., Dayan, P., 1992. Q-Learning. Machine Learning 8(3):279–92.
There are 33 citations in total.

Details

Primary Language English
Subjects Artificial Intelligence, Electrical Engineering
Journal Section Research Articles
Authors

Murat Erhan Çimen 0000-0002-1793-485X

Zeynep Garip 0000-0002-0420-8541

Yaprak Yalçın 0000-0002-9261-9032

Mustafa Kutlu 0000-0003-1663-2523

Ali Fuat Boz 0000-0001-6575-7678

Early Pub Date September 23, 2023
Publication Date September 23, 2023
Submission Date March 3, 2023
Published in Issue Year 2023 Volume: 6 Issue: 2

Cite

APA Çimen, M. E., Garip, Z., Yalçın, Y., Kutlu, M., & Boz, A. F. (2023). Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm. Journal of Intelligent Systems: Theory and Applications, 6(2), 191-198. https://doi.org/10.38016/jista.1250782
AMA Çimen ME, Garip Z, Yalçın Y, Kutlu M, Boz AF. Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm. JISTA. September 2023;6(2):191-198. doi:10.38016/jista.1250782
Chicago Çimen, Murat Erhan, Zeynep Garip, Yaprak Yalçın, Mustafa Kutlu, and Ali Fuat Boz. “Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm”. Journal of Intelligent Systems: Theory and Applications 6, no. 2 (September 2023): 191-98. https://doi.org/10.38016/jista.1250782.
EndNote Çimen ME, Garip Z, Yalçın Y, Kutlu M, Boz AF (September 1, 2023) Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm. Journal of Intelligent Systems: Theory and Applications 6 2 191–198.
IEEE M. E. Çimen, Z. Garip, Y. Yalçın, M. Kutlu, and A. F. Boz, “Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm”, JISTA, vol. 6, no. 2, pp. 191–198, 2023, doi: 10.38016/jista.1250782.
ISNAD Çimen, Murat Erhan et al. “Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm”. Journal of Intelligent Systems: Theory and Applications 6/2 (September 2023), 191-198. https://doi.org/10.38016/jista.1250782.
JAMA Çimen ME, Garip Z, Yalçın Y, Kutlu M, Boz AF. Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm. JISTA. 2023;6:191–198.
MLA Çimen, Murat Erhan et al. “Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm”. Journal of Intelligent Systems: Theory and Applications, vol. 6, no. 2, 2023, pp. 191-8, doi:10.38016/jista.1250782.
Vancouver Çimen ME, Garip Z, Yalçın Y, Kutlu M, Boz AF. Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm. JISTA. 2023;6(2):191-8.

Journal of Intelligent Systems: Theory and Applications