Enhancing Reinforcement Learning through Graph Neural Networks: A Novel Approach
Authors
Department of Computer Science Techno India University Kolkata, West Bengal (India)
Department of CSE-AI Techno India University Kolkata, West Bengal (India)
AiLabs-Artificial Intelligence, DGC DataCore Systems (India) pvt. Ltd. Kolkata, West Bengal (India)
Article Information
DOI: 10.51244/IJRSI.2025.1210000046
Subject Category: Artificial Intelligence
Volume/Issue: 12/10 | Page No: 521-530
Publication Timeline
Submitted: 2025-10-10
Accepted: 2025-10-14
Published: 2025-11-01
Abstract
Reinforcement Learning (RL) has shown remarkable success in various domains. However, its performance often degrades in environments with complex structures and distributed rewards. Graph-Based Reinforcement Learning (GBRL) combines the strengths of graph theory with RL to optimize complex decision-making problems in networked systems. This paper proposes integrating RL with Graph Neural Networks (GNNs) to enhance the learning pipeline, using the capacity of GNNs to model structured data. We present an approach that applies GNNs to graph-structured representations, enabling RL agents to capture dependencies between entities and to access information through them. The paper presents GBRL techniques and their applications in different domains, along with a framework of GBRL methods and its advantages over RL methods when working with graph-based data. This work highlights the synergy between graph-based learning and decision-making, offering a promising direction for solving high-dimensional and structured RL tasks more effectively. We also summarize the key challenges and open research directions in this field.
Keywords
Graph-Based Reinforcement Learning (GBRL), Reinforcement Learning (RL), Graph Neural Networks (GNN)