Self-Healing Service Operations: An AI-Driven Causal Process-Mining Control Layer for SLA-Optimized Ticket Workflows

Authors

Arunraju Chinnaraju

Doctorate in Business Administration, Westcliff University (USA)

Article Information

DOI: 10.47772/IJRISS.2026.10190018

Subject Category: Artificial Intelligence

Volume/Issue: 10/19 | Page No: 222-249

Publication Timeline

Submitted: 2026-01-16

Accepted: 2026-01-21

Published: 2026-02-14

Abstract

Enterprise service operations are orchestrated through ticket and workflow systems that manage incidents and support requests under strict SLA constraints. Current automation approaches remain predominantly predictive and static, relying on classification models and routing heuristics that cannot adapt system-wide operational policies under non-stationary demand and resource contention. This paper proposes a self-healing framework for enterprise service operations built around an AI-driven, closed-loop operational control layer that combines process mining, constrained multi-agent reinforcement learning, and causal process analytics. The service workflow is modelled as a partially observable Markov decision process (POMDP), and control is decomposed across queues, skills, and workflow stages using centralized-training, decentralized-execution multi-agent reinforcement learning (MARL). Policies are learned with constrained policy-gradient MARL, coupling MAPPO-style coordination with Lagrangian SLA-risk constraints, yielding adaptive self-healing policies over actions such as skill-based assignment, dynamic prioritization, escalation gating, batching, automation triggering, and intake throttling. Process mining is used to derive executable workflow models, state abstractions, and non-stationary transition dynamics from event logs, grounding the control layer in observed operational behavior rather than assumed process models.
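The Lagrangian treatment of SLA risk described above can be sketched in a deliberately tiny single-agent example. This is an illustrative toy, not the paper's implementation: the two actions, their reward and breach probabilities, the risk budget `d`, and all learning rates are assumptions chosen for demonstration. A dual variable prices SLA-breach risk inside the policy-gradient objective, and dual ascent tightens that price whenever observed risk exceeds the budget.

```python
import numpy as np

# Toy Lagrangian-constrained policy-gradient sketch for SLA risk
# (hypothetical numbers; single agent standing in for one MARL worker).
# Action 0 = "fast-track": low reward, low breach probability.
# Action 1 = "batch": high reward, high breach probability.
rng = np.random.default_rng(0)
REWARD = np.array([1.0, 2.0])      # expected reward per action (assumed)
BREACH_P = np.array([0.05, 0.40])  # SLA-breach probability per action (assumed)
d = 0.10                           # SLA-risk budget (constraint bound)

theta = np.zeros(2)                # softmax policy logits
lam = 0.0                          # Lagrange multiplier pricing breach risk

def policy(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()

for step in range(5000):
    p = policy(theta)
    a = rng.choice(2, p=p)
    r = REWARD[a]
    c = float(rng.random() < BREACH_P[a])   # sampled breach indicator (cost)
    g = r - lam * c                         # Lagrangian return: reward - priced cost
    grad = -p.copy()                        # REINFORCE: grad log pi(a) = e_a - p
    grad[a] += 1.0
    theta += 0.05 * g * grad                # primal step on the penalized objective
    lam = max(0.0, lam + 0.01 * (c - d))    # dual ascent: raise price when risk > budget

p = policy(theta)
avg_breach = float(p @ BREACH_P)            # policy's expected breach probability
```

With these assumed numbers the multiplier settles where the two actions' penalized returns balance, pulling the policy's expected breach probability toward the budget `d` instead of letting the higher-reward, riskier action dominate.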
To ensure accountability and managerial trust, the architecture incorporates a causal process analytics layer that supports counterfactual process replay and quasi-experimental impact estimation (including interrupted time series and difference-in-differences designs) to quantify how specific interventions affect SLA breaches, time to resolve, backlog aging, and reopen rates. A governance enforcement mechanism, realized as a policy-constrained learning architecture, provides probabilistic SLA-bound checking, audit logging, human-on-the-loop overrides, and deployment safeguards. Performance is evaluated on open incident-management and process-mining datasets against three criteria: SLA breach probability, average time to resolve (TTR), and backlog stability and robustness under high-volume demand shocks. The framework demonstrates how governance-aware AI control can convert static ticket workflows into dynamic, auditable, self-healing service operations.
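The difference-in-differences estimation mentioned above can be sketched on synthetic data. This is a minimal illustration under assumed numbers, not the paper's evaluation pipeline: the group sizes, baseline breach rate, common time trend, and true effect are all invented for the example. The intervention effect on SLA breach rates is recovered as the coefficient on the treated-by-post interaction in the canonical DiD regression.

```python
import numpy as np

# Minimal difference-in-differences sketch: estimate the effect of a
# workflow intervention on SLA breach rates (synthetic data; all
# parameters below are assumptions for illustration).
rng = np.random.default_rng(1)
n = 8000
treated = rng.integers(0, 2, n)   # 1 if the queue received the intervention
post = rng.integers(0, 2, n)      # 1 if observed after the rollout
true_effect = -0.08               # assumed: intervention cuts breach prob. by 8 pts

# Breach probability: baseline + group gap + common time trend + effect,
# so the simple pre/post or treated/control gaps alone are confounded.
p = 0.30 + 0.05 * treated - 0.03 * post + true_effect * treated * post
breach = (rng.random(n) < p).astype(float)

# OLS on the canonical DiD regression:
#   breach ~ 1 + treated + post + treated:post
X = np.column_stack([np.ones(n), treated, post, treated * post])
beta, *_ = np.linalg.lstsq(X, breach, rcond=None)
did_estimate = beta[3]            # interaction coefficient = treatment effect
```

The group gap and the common time trend cancel in the interaction term, so `did_estimate` approximates the true effect even though raw pre/post comparisons would be biased by the shared trend.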

Keywords

Self-Healing Service Operations; Multi-Agent Reinforcement Learning; MAPPO


References

1. Fuller, Z. Fan, C. Day and C. Barlow, "Digital Twin: Enabling Technologies, Challenges and Open Research," in IEEE Access, vol. 8, pp. 108952-108971, 2020, doi: 10.1109/ACCESS.2020.2998358. [Google Scholar] [Crossref]

2. Sharma, D. Srinivasan and D. S. Kumar, "A comparative analysis of centralized and decentralized multiagent architecture for service restoration," 2016 IEEE Congress on Evolutionary Computation (CEC), Vancouver, BC, Canada, 2016, pp. 311-318, doi: 10.1109/CEC.2016.7743810. [Google Scholar] [Crossref]

3. Achiam, J., Held, D., Tamar, A., & Abbeel, P. (2017). Constrained policy optimization. In Proceedings of the 34th International Conference on Machine Learning (pp. 22–31). PMLR. https://doi.org/10.48550/arXiv.1705.10528 [Google Scholar] [Crossref]

4. Achiam, J., Held, D., Tamar, A., & Abbeel, P. (2017). Constrained policy optimization. Proceedings of the 34th International Conference on Machine Learning. https://doi.org/10.48550/arXiv.1705.10528 [Google Scholar] [Crossref]

5. Ackerman, S., et al. (2023). Deploying automated ticket router across the enterprise. AI Magazine, 44(2), 52–68. https://doi.org/10.1002/aaai.12079 [Google Scholar] [Crossref]

6. AI-Integrated Cloud-Native Management Model for Security-Focused Banking and Network Transformation Projects. (2023). International Journal of Research Publications in Engineering, Technology and Management (IJRPETM), 6(5), 9321-9329. https://doi.org/10.15662/IJRPETM.2023.0605006 [Google Scholar] [Crossref]

7. Aksin, O. Z., Armony, M., & Mehrotra, V. (2007). The modern call center: A multidisciplinary perspective on operations management research. Production and Operations Management, 16(6), 665–688. https://doi.org/10.1111/j.1937-5956.2007.tb00288.x [Google Scholar] [Crossref]

8. Altman, E. (1999). Constrained Markov decision processes. CRC [Google Scholar] [Crossref]

9. Press. https://doi.org/10.1201/9781315140223 [Google Scholar] [Crossref]

10. Amershi, S., et al. (2019). Software engineering for machine learning: A case study. In 2019 IEEE/ACM International Conference on Software Engineering (ICSE). IEEE. https://doi.org/10.1109/ICSESEIP.2019.00042 [Google Scholar] [Crossref]

11. Ammar, M., Haleem, A., Javaid, M., Bahl, S., & Verma, A. S. (2022). Implementing Industry 4.0 technologies in self-healing materials and digitally managing the quality of manufacturing. Materials Today: Proceedings, 52, 2285-2294. [Google Scholar] [Crossref]

12. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. https://doi.org/10.48550/arXiv.1606.06565 [Google Scholar] [Crossref]

13. Angrist, J. D., & Krueger, A. B. (2001). Instrumental variables and the search for identification From supply and demand to natural experiments. Journal of Economic Perspectives, 15(4), 69–85. https://doi.org/10.1257/jep.15.4.69 [Google Scholar] [Crossref]

14. Armony, M., & Maglaras, C. (2004). Contact centers with a call-back option and real-time delay information. Operations Research, 52(4), 527–545. https://doi.org/10.1287/opre.1040.0123 [Google Scholar] [Crossref]

15. Athey, S., & Imbens, G. W. (2016). Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences, 113(27), 7353–7360. https://doi.org/10.1073/pnas.1510489113 [Google Scholar] [Crossref]

16. Atlason, J., Epelman, M. A., & Henderson, S. G. (2008). Call center staffing with simulation and cutting plane methods. Management Science, 54(2), 295–309. https://doi.org/10.1287/mnsc.1070.0774 [Google Scholar] [Crossref]

17. Augusto, A., Conforti, R., Dumas, M., La Rosa, M., & Maggi, F. M. (2019). Automated discovery of process models from event logs: Review and benchmark. IEEE Transactions on Knowledge and Data Engineering, 31(4), 686–705. https://doi.org/10.1109/TKDE.2018.2841877 [Google Scholar] [Crossref]

18. Austin, P. C. (2011). An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3), 399–424. https://doi.org/10.1080/00273171.2011.568786 [Google Scholar] [Crossref]

19. Dundar, M. Astekin and M. S. Aktas, "A Big Data Processing Framework for Self-Healing Internet of Things Applications," 2016 12th International Conference on Semantics, Knowledge and Grids (SKG), Beijing, China, 2016, pp. 62-68, doi: 10.1109/SKG.2016.017. [Google Scholar] [Crossref]

20. Bang, H., & Robins, J. M. (2005). Doubly robust estimation in missing data and causal inference models. Biometrics, 61(4), 962–973. https://doi.org/10.1111/j.1541-0420.2005.00377.x [Google Scholar] [Crossref]

21. Bareinboim, E., & Pearl, J. (2016). Causal inference and the data-fusion problem. Proceedings of the National Academy of Sciences, 113(27), 7345–7352. https://doi.org/10.1073/pnas.1510507113 [Google Scholar] [Crossref]

22. Belanche, D., Casaló, L. V., Flavián, C., & Schepers, J. (2020). Service robot implementation: a theoretical framework and research agenda. The Service Industries Journal, 40(3-4), 203-225. [Google Scholar] [Crossref]

23. Ben-Tal, A., & Nemirovski, A. (1998). Robust convex optimization. Mathematics of Operations Research, 23(4), 769–805. https://doi.org/10.1287/moor.23.4.769 [Google Scholar] [Crossref]

24. Bernal, J. L., Cummins, S., & Gasparrini, A. (2017). Interrupted time series regression for the evaluation of public health interventions A tutorial. International Journal of Epidemiology, 46(1), 348–355. https://doi.org/10.1093/ije/dyw098 [Google Scholar] [Crossref]

25. Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How much should we trust differences in differences estimates. The Quarterly Journal of Economics, 119(1), 249–275. https://doi.org/10.1162/003355304772839588 [Google Scholar] [Crossref]

26. Bertsimas, D., & Sim, M. (2004). The price of robustness. Operations Research, 52(1), 35–53. https://doi.org/10.1287/opre.1030.0065 [Google Scholar] [Crossref]

27. Borst, S., Mandelbaum, A., & Reiman, M. I. (2004). Dimensioning large call centers. Operations Research, 52(1), 17–34. https://doi.org/10.1287/opre.1030.0081 [Google Scholar] [Crossref]

28. Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. https://doi.org/10.1023/A:1010933404324 [Google Scholar] [Crossref]

29. Brodersen, K. H., Gallusser, F., Koehler, J., Remy, N., & Scott, S. L. (2015). Inferring causal impact using Bayesian structural time series models. The Annals of Applied Statistics, 9(1), 247–274. [Google Scholar] [Crossref]

30. https://doi.org/10.1214/14-AOAS788 [Google Scholar] [Crossref]

31. Brundage, M., Avin, S., Wang, J., Belfield, H., Krueger, G., Hadfield, G., … Anderljung, M. (2020). Toward trustworthy AI development Mechanisms for supporting verifiable claims. https://doi.org/10.48550/arXiv.2004.07213 [Google Scholar] [Crossref]

32. Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 38(2), 156–172. https://doi.org/10.1109/TSMCC.2007.913919 [Google Scholar] [Crossref]

33. Dabrowski and K. Mills. 2002. Understanding self-healing in service-discovery systems. In Proceedings of the first workshop on Self-healing systems (WOSS '02). Association for Computing Machinery, New York, NY, USA, 15–20. https://doi.org/10.1145/582128.582132 [Google Scholar] [Crossref]

34. Callaway, B., & SantAnna, P. H. C. (2021). Difference in differences with multiple time periods. Journal of Econometrics, 225(2), 200–230. https://doi.org/10.1016/j.jeconom.2020.12.001 [Google Scholar] [Crossref]

35. Carmona, J., van Dongen, B., Solti, A., & Weidlich, M. (2018). Conformance checking: Relating processes and models. Springer. https://doi.org/10.1007/978-3-319-99414-7 [Google Scholar] [Crossref]

36. Castillo, D., Canhoto, A. I., & Said, E. (2021). The dark side of AI-powered service interactions: Exploring the process of co-destruction from the customer perspective. The Service Industries Journal, 41(13-14), 900-925. [Google Scholar] [Crossref]

37. Chi, O. H., Jia, S., Li, Y., & Gursoy, D. (2021). Developing a formative scale to measure consumers’ trust toward interaction with artificially intelligent (AI) social robots in service delivery. Computers in Human Behavior, 118, 106700. [Google Scholar] [Crossref]

38. Chow, Y., Nachum, O., Duenez Guzman, E., & Ghavamzadeh, M. (2018). A Lyapunov based approach to safe reinforcement learning. In Advances in Neural Information Processing Systems 31. https://doi.org/10.48550/arXiv.1805.07708 [Google Scholar] [Crossref]

39. Chow, Y., Tamar, A., Mannor, S., & Pavone, M. (2015). Risk-sensitive and robust decision-making: A CVaR optimization approach. arXiv. https://doi.org/10.48550/arXiv.1512.01629 [Google Scholar] [Crossref]

40. Dalal, G., Gilboa, D., Mannor, S., & Tamar, A. (2018). Safe exploration in continuous action spaces. arXiv. https://doi.org/10.48550/arXiv.1801.08757 [Google Scholar] [Crossref]

41. Daniel Kuhn, (2009) An Information-Based Approximation Scheme for Stochastic Optimization Problems in Continuous Time. Mathematics of Operations Research 34(2):428-444. https://doi.org/10.1287/moor.1080.0369 [Google Scholar] [Crossref]

42. Dimick, J. B., & Ryan, A. M. (2014). Methods for evaluating changes in health care policy The difference in differences approach. JAMA, 312(22), 2401–2402. [Google Scholar] [Crossref]

43. https://doi.org/10.1001/jama.2014.16153 [Google Scholar] [Crossref]

44. Doshi Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint. https://doi.org/10.48550/arXiv.1702.08608 [Google Scholar] [Crossref]

45. Dragoni, N. et al. (2017). Microservices: Yesterday, Today, and Tomorrow. In: Mazzara, M., Meyer, B. (eds) Present and Ulterior Software Engineering. Springer, Cham. https://doi.org/10.1007/978-3-31967425-4_12 [Google Scholar] [Crossref]

46. Du, M., Li, F., Zheng, G., & Srikumar, V. (2017). DeepLog: Anomaly detection and diagnosis from system logs through deep learning. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM. https://doi.org/10.1145/3133956.3134015 [Google Scholar] [Crossref]

47. Dudík, M., Langford, J., & Li, L. (2014). Doubly robust policy evaluation and optimization. Statistical Science, 29(4), 485–511. https://doi.org/10.1214/14-STS500 [Google Scholar] [Crossref]

48. Eitan Naveh, Miriam Erez, (2004) Innovation and Attention to Detail in the Quality Improvement Paradigm. Management Science 50(11):1576-1586. https://doi.org/10.1287/mnsc.1040.0272 [Google Scholar] [Crossref]

49. Evermann, J., Rehse, J.-R., & Fettke, P. (2017). Predicting process behaviour using deep learning. Decision Support Systems, 100, 129–140. https://doi.org/10.1016/j.dss.2017.04.003 [Google Scholar] [Crossref]

50. F. Tao, H. Zhang, A. Liu and A. Y. C. Nee, "Digital Twin in Industry: State-of-the-Art," in IEEE Transactions on Industrial Informatics, vol. 15, no. 4, pp. 2405-2415, April 2019, doi: 10.1109/TII.2018.2873186. [Google Scholar] [Crossref]

51. Finlayson, S. G., Bowers, J. D., Ito, J., Zittrain, J. L., Beam, A. L., & Kohane, I. S. (2019). Adversarial attacks on medical machine learning. Science, 363(6433), 1287–1289. https://doi.org/10.1126/science.aaw4399 [Google Scholar] [Crossref]

52. Foerster, J. N., Assael, Y. M., de Freitas, N., & Whiteson, S. (2016). Learning to communicate with deep multi-agent reinforcement learning. arXiv. https://doi.org/10.48550/arXiv.1605.06676 [Google Scholar] [Crossref]

53. Foerster, J., Farquhar, G., Afouras, T., Nardelli, N., & Whiteson, S. (2018). Counterfactual multi-agent policy gradients. arXiv. https://doi.org/10.48550/arXiv.1705.08926 [Google Scholar] [Crossref]

54. Fujimoto, S., Meger, D., & Precup, D. (2019). Off-policy deep reinforcement learning without exploration. arXiv. https://doi.org/10.48550/arXiv.1812.02900 [Google Scholar] [Crossref]

55. G. Salami, O. Durowoju, A. Attar, O. Holland, R. Tafazolli and H. Aghvami, "A Comparison Between the Centralized and Distributed Approaches for Spectrum Management," in IEEE Communications Surveys & Tutorials, vol. 13, no. 2, pp. 274-290, Second Quarter 2011, [Google Scholar] [Crossref]

56. doi: 10.1109/SURV.2011.041110.00018. [Google Scholar] [Crossref]

57. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., & Bouchachia, A. (2014). A survey on concept drift adaptation. ACM Computing Surveys, 46(4), Article 44. https://doi.org/10.1145/2523813 [Google Scholar] [Crossref]

58. Gans, N., Koole, G., & Mandelbaum, A. (2003). Telephone call centers: Tutorial, review, and research prospects. Manufacturing & Service Operations Management, 5(2), 79–141. https://doi.org/10.1287/msom.5.2.79.16071 [Google Scholar] [Crossref]

59. Garnett, O., Mandelbaum, A., & Reiman, M. I. (2002). Designing a call center with impatient customers. [Google Scholar] [Crossref]

60. Manufacturing & Service Operations Management, 4(3), 208–227. [Google Scholar] [Crossref]

61. https://doi.org/10.1287/msom.4.3.208.7753 [Google Scholar] [Crossref]

62. Ghadimi, S., & Lan, G. (2013). Stochastic first and zeroth order methods for nonconvex stochastic programming. SIAM Journal on Optimization, 23(4), 2341–2368. https://doi.org/10.1137/120880811 [Google Scholar] [Crossref]

63. Ghosh, D., Sharman, R., Rao, H. R., & Upadhyaya, S. (2007). Self-healing systems—survey and synthesis. Decision support systems, 42(4), 2164-2185. [Google Scholar] [Crossref]

64. Grieves, M., & Vickers, J. (2017). Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems (pp. 85–113). Springer. https://doi.org/10.1007/978-3-319-38756-7_4 [Google Scholar] [Crossref]

65. Gunasekaran, A., Marri, H. B., McGaughey, R. E., & Nebhwani, M. D. (2002). E-commerce and its impact on operations management. International journal of production economics, 75(1-2), 185-197. [Google Scholar] [Crossref]

66. Gunning, D., & Aha, D. (2019). DARPA Explainable Artificial Intelligence program. AI Magazine, 40(2), 44–58. https://doi.org/10.1609/aimag.v40i2.2850 [Google Scholar] [Crossref]

67. Guoli Li, Vinod Muthusamy, and Hans-Arno Jacobsen. 2010. A distributed service-oriented architecture for business process execution. ACM Trans. Web 4, 1, Article 2 (January 2010), 33 pages. https://doi.org/10.1145/1658373.1658375 [Google Scholar] [Crossref]

68. Gursoy, D., Chi, O. H., Lu, L., & Nunkoo, R. (2019). Consumers acceptance of artificially intelligent (AI) device use in service delivery. International journal of information management, 49, 157-169. [Google Scholar] [Crossref]

69. H. Psaier, F. Skopik, D. Schall and S. Dustdar, "Behavior Monitoring in Self-Healing Service-Oriented Systems," 2010 IEEE 34th Annual Computer Software and Applications Conference, Seoul, Korea (South), 2010, pp. 357-366, doi: 10.1109/COMPSAC.2010.43. [Google Scholar] [Crossref]

70. Hansen-Estruch, P., et al. (2023). IDQL: Implicit Q-learning as an actor-critic method with flexible actor regularization. arXiv. https://doi.org/10.48550/arXiv.2304.10573 [Google Scholar] [Crossref]

71. He, S., Zhu, J., He, P., & Lyu, M. R. (2016). Experience report: System log analysis for anomaly detection. [Google Scholar] [Crossref]

72. In 2016 IEEE/ACM 27th International Symposium on Software Reliability Engineering (ISSRE) (pp. 207– 218). IEEE. https://doi.org/10.1109/ISSRE.2016.21 [Google Scholar] [Crossref]

73. Hernandez-Leal, P., Kartal, B., & Taylor, M. E. (2019). A survey and critique of multi-agent deep reinforcement learning. Autonomous Agents and Multi-Agent Systems, 33(6), 750–797. https://doi.org/10.1007/s10458-019-09421-1 [Google Scholar] [Crossref]

74. Howard, R. A., & Matheson, J. E. (1972). Risk sensitive Markov decision processes. Management Science, 18(7), 356–369. https://doi.org/10.1287/mnsc.18.7.356 [Google Scholar] [Crossref]

75. IEEE Task Force on Process Mining. (2016). The XES standard: A model and format for logging event data. Information Systems, 63, 1–19. https://doi.org/10.1016/j.is.2016.05.004 [Google Scholar] [Crossref]

76. Ikemoto, J., & Ushio, T. (2022). Deep reinforcement learning under signal temporal logic constraints using Lagrangian relaxation. arXiv. https://doi.org/10.48550/arXiv.2201.08504 [Google Scholar] [Crossref]

77. Jennings, O. B., Mandelbaum, A., Massey, W. A., & Whitt, W. (1996). Server staffing to meet time varying demand. Management Science, 42(10), 1383–1394. https://doi.org/10.1287/mnsc.42.10.1383 [Google Scholar] [Crossref]

78. Kaelbling, L. P., Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1–2), 99–134. https://doi.org/10.1016/S00043702(98)00023-X [Google Scholar] [Crossref]

79. Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., … Zhao, S. (2021). [Google Scholar] [Crossref]

80. Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 14(1– 2), 1–210. https://doi.org/10.1561/2200000083 [Google Scholar] [Crossref]

81. Karwan, K. R., & Markland, R. E. (2006). Integrating service design principles and information technology to improve delivery and productivity in public sector operations: The case of the South Carolina DMV. Journal of operations management, 24(4), 347-362. [Google Scholar] [Crossref]

82. Kleinberg, J., Ludwig, J., Mullainathan, S., & Obermeyer, Z. (2015). Prediction policy problems. American Economic Review, 105(5), 491–495. https://doi.org/10.1257/aer.p20151023 [Google Scholar] [Crossref]

83. Koole, G., & Mandelbaum, A. (2002). Queueing models of call centers: An introduction. Annals of Operations Research, 113(1–4), 41–59. https://doi.org/10.1023/A:1020949626017 [Google Scholar] [Crossref]

84. Kostrikov, I., Nair, A., & Levine, S. (2022). Offline reinforcement learning with implicit Q learning. In International Conference on Machine Learning (ICML 2022). https://doi.org/10.48550/arXiv.2110.06169 [Google Scholar] [Crossref]

85. Kumar, A., Zhou, A., Tucker, G., & Levine, S. (2020). Conservative Q-learning for offline reinforcement learning. arXiv. https://doi.org/10.48550/arXiv.2006.04779 [Google Scholar] [Crossref]

86. Kumar, R. (2023). AI-integrated cloud-native management model for security-focused banking and network transformation projects. International Journal of Research Publications in Engineering, Technology and Management (IJRPETM), 6(5), 9321-9329. [Google Scholar] [Crossref]

87. Kushwaha, A., et al. (2025). A survey of safe reinforcement learning and constrained decision-making. arXiv. https://doi.org/10.48550/arXiv.2505.17342 [Google Scholar] [Crossref]

88. Ladley, J. (2019). Data governance: How to design, deploy, and sustain an effective data governance program. Academic Press. [Google Scholar] [Crossref]

89. Leemans, S. J. J., Fahland, D., & van der Aalst, W. M. P. (2013). Discovering block structured process models from event logs: A constructive approach. In Application and Theory of Petri Nets and Concurrency (pp. 311–329). Springer. https://doi.org/10.1007/978-3-642-38697-8_17 [Google Scholar] [Crossref]

90. Leemans, S. J. J., Fahland, D., & van der Aalst, W. M. P. (2014). Discovering block structured process models from event logs containing infrequent behaviour. In Business Process Management Workshops (pp. 66–78). Springer. https://doi.org/10.1007/978-3-319-06257-0_6 [Google Scholar] [Crossref]

91. Levine, S., Kumar, A., Tucker, G., & Fu, J. (2020). Offline reinforcement learning: Tutorial, review, and perspectives on open problems. arXiv. https://doi.org/10.48550/arXiv.2005.01643 [Google Scholar] [Crossref]

92. Li, M., Yin, D., Qiu, H., & Bai, B. (2021). A systematic review of AI technology-based service encounters: Implications for hospitality and tourism operations. International Journal of Hospitality Management, 95, 102930. [Google Scholar] [Crossref]

93. Lin, KJ., Chang, S.H. A service accountability framework for QoS service management and engineering. Inf Syst E-Bus Manage 7, 429–446 (2009). https://doi.org/10.1007/s10257-009-0109-5 [Google Scholar] [Crossref]

94. Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., & Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. arXiv. https://doi.org/10.48550/arXiv.1706.02275 [Google Scholar] [Crossref]

95. Lu, L., Cai, R., & Gursoy, D. (2019). Developing and validating a service robot integration willingness scale. International Journal of Hospitality Management, 80, 36-51. [Google Scholar] [Crossref]

96. Mahajan, A., Samvelyan, M., Rashid, T., de Witt, C. S., & Whiteson, S. (2019). MAVEN: Multi-agent variational exploration. arXiv. https://doi.org/10.48550/arXiv.1910.07483 [Google Scholar] [Crossref]

97. Mannhardt, F., de Leoni, M., Reijers, H. A., & van der Aalst, W. M. P. (2016). Balanced multi perspective checking of process conformance. Computing, 98, 407–437. https://doi.org/10.1007/s00607-015-0441-1 [Google Scholar] [Crossref]

98. Mihret DG, Grant B (2017), "The role of internal auditing in corporate governance: a Foucauldian analysis". Accounting, Auditing & Accountability Journal, Vol. 30 No. 3 pp. 699–719, doi: https://doi.org/10.1108/AAAJ-10-2012-1134 [Google Scholar] [Crossref]

99. Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., … Gebru, T. (2019). Model cards for model reporting. Proceedings of the Conference on Fairness Accountability and Transparency, 220–229. https://doi.org/10.1145/3287560.3287596 [Google Scholar] [Crossref]

100. Moeller, R. R. (2013). Executive's guide to IT governance: improving systems processes with service management, COBIT, and ITIL (Vol. 637). John Wiley & Sons. [Google Scholar] [Crossref]

101. Monahan, G. E. (1982). A survey of partially observable Markov decision processes: Theory and algorithms. Management Science, 28(1), 1–16. https://doi.org/10.1287/mnsc.28.1.1 [Google Scholar] [Crossref]

102. Nakao, M., & Fujisaki, M. (2020). Risk constrained reinforcement learning: A constrained Markov decision process approach. SIAM Journal on Control and Optimization, 58(4), 2434–2462. https://doi.org/10.1137/19M1268410 [Google Scholar] [Crossref]

103. Nama, P., Reddy, P., & Pattanayak, S. K. (2024). Artificial intelligence for self-healing automation testing frameworks: Real-time fault prediction and recovery. Artificial Intelligence, 64(3S). [Google Scholar] [Crossref]

104. Nedić, A., & Ozdaglar, A. (2009). Distributed subgradient methods for multi agent optimization. IEEE Transactions on Automatic Control, 54(1), 48–61. https://doi.org/10.1109/TAC.2008.2009515 [Google Scholar] [Crossref]

105. Olfati Saber, R., Fax, J. A., & Murray, R. M. (2007). Consensus and cooperation in networked multi agent systems. Proceedings of the IEEE, 95(1), 215–233. https://doi.org/10.1109/JPROC.2006.887293 [Google Scholar] [Crossref]

106. Papoudakis, G., Christianos, F., Schäfer, L., & Albrecht, S. V. (2019). Dealing with non-stationarity in multi-agent deep reinforcement learning. arXiv. https://doi.org/10.48550/arXiv.1906.04737 [Google Scholar] [Crossref]

107. Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge University Press. https://doi.org/10.1017/CBO9780511803161 [Google Scholar] [Crossref]

108. Peng, X. B., et al. (2018). Sim to real transfer of robotic control with dynamics randomization. In 2018 [Google Scholar] [Crossref]

109. IEEE International Conference on Robotics and Automation (ICRA). IEEE. https://doi.org/10.1109/ICRA.2018.8460528 [Google Scholar] [Crossref]

110. Peters, J., & Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4), 682–697. https://doi.org/10.1016/j.neunet.2008.02.003 [Google Scholar] [Crossref]

111. Phillips, P. J., Phillips, P. J., Hahn, C. A., Fontana, P. C., Yates, A. N., Greene, K., ... & Przybocki, M. A. (2021). Four principles of explainable artificial intelligence. [Google Scholar] [Crossref]

112. Psaier, H., Dustdar, S. A survey on self-healing systems: approaches and systems. Computing 91, 43–73 (2011). https://doi.org/10.1007/s00607-010-0107-y [Google Scholar] [Crossref]

113. Ramandeep S. Randhawa, Sunil Kumar, (2008) Usage Restriction and Subscription Services: [Google Scholar] [Crossref]

114. Operational Benefits with Rational Users. Manufacturing & Service Operations Management 10(3):429447. https://doi.org/10.1287/msom.1070.0180 [Google Scholar] [Crossref]

115. Rashid, T., Samvelyan, M., Schroeder, C., Farquhar, G., Foerster, J., & Whiteson, S. (2018). QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning. arXiv. https://doi.org/10.48550/arXiv.1803.11485 [Google Scholar] [Crossref]

116. Ravichandran, N., Inaganti, A. C., Muppalaneni, R., & Nersu, S. R. K. (2020). AI-Driven Self-Healing IT Systems: Automating Incident Detection and Resolution in Cloud Environments. Artificial Intelligence and Machine Learning Review, 1(4), 1-11. [Google Scholar] [Crossref]

117. Lingzhe Zhang, Tong Jia, Mengxi Jia, Yifan Wu, Aiwei Liu, Yong Yang, Zhonghai Wu, Xuming Hu, Philip Yu, and Ying Li. 2025. A Survey of AIOps in the Era of Large Language Models. ACM Comput. Surv. 58, 2, Article 44 (January 2026), 35 pages. https://doi.org/10.1145/3746635 [Google Scholar] [Crossref]

118. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. https://doi.org/10.1145/2939672.2939778 [Google Scholar] [Crossref]

119. Rockafellar, R. T., & Uryasev, S. (2000). Optimization of conditional value at risk. Journal of Risk, 2(3), 21–41. https://doi.org/10.21314/JOR.2000.038 [Google Scholar] [Crossref]

120. Rosenbaum, P. R. (2002). Observational studies (2nd ed.). Springer. https://doi.org/10.1007/978-1-47573692-2 [Google Scholar] [Crossref]

121. Rosenbaum, P. R. (2010). Design of observational studies. Springer. https://doi.org/10.1007/978-1-44191213-8 [Google Scholar] [Crossref]

122. Ross, K. W. (1989). Randomized and past dependent policies for Markov decision processes with multiple constraints. Operations Research, 37(3), 474–477. https://doi.org/10.1287/opre.37.3.474 [Google Scholar] [Crossref]

123. Rozinat, A., & van der Aalst, W. M. P. (2008). Conformance checking of processes based on monitoring real behavior. Information Systems, 33(1), 64–95. https://doi.org/10.1016/j.is.2007.07.001 [Google Scholar] [Crossref]

124. Sampson, S. E., & Froehle, C. M. (2006). Foundations and implications of a proposed unified services theory. Production and operations management, 15(2), 329-343. [Google Scholar] [Crossref]

125. Sapkota, R., Roumeliotis, K. I., & Karkee, M. (2025). Ai agents vs. agentic ai: A conceptual taxonomy, applications and challenges. arXiv preprint arXiv:2505.10468. [Google Scholar] [Crossref]

126. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv. https://doi.org/10.48550/arXiv.1707.06347 [Google Scholar] [Crossref]

127. Shapiro, A., Dentcheva, D., & Ruszczyński, A. (2014). Lectures on stochastic programming: Modeling and theory (2nd ed.). SIAM. https://doi.org/10.1137/1.9781611973433 [Google Scholar] [Crossref]

128. Shen, Z. J. M., & Zhang, H. (2013). Risk sensitive Markov decision processes and robust optimization. SIAM Journal on Optimization, 23(2), 1272–1296. https://doi.org/10.1137/120899005 [Google Scholar] [Crossref]

129. Sherif A. Gurguis and Amir Zeid. 2005. Towards autonomic web services: achieving self-healing using web services. In Proceedings of the 2005 workshop on Design and evolution of autonomic application software (DEAS '05). Association for Computing Machinery, New York, NY, USA, 1–5. https://doi.org/10.1145/1083063.1083069 [Google Scholar] [Crossref]

130. Snell, C., et al. (2022). Offline RL for natural language generation with implicit language Q-learning. arXiv. https://doi.org/10.48550/arXiv.2206.11871 [Google Scholar] [Crossref]

131. Son, K., Kim, D., Kang, W., Hostallero, D. E., & Yi, Y. (2019). QTRAN: Learning to factorize with transformation for cooperative multi-agent reinforcement learning. arXiv. https://doi.org/10.48550/arXiv.1905.05408 [Google Scholar] [Crossref]

132. Stuart, E. A. (2010). Matching methods for causal inference A review and a look forward. Statistical Science, 25(1), 1–21. https://doi.org/10.1214/09-STS313 [Google Scholar] [Crossref]

133. Sukhbaatar, S., Fergus, R., et al. (2016). Learning multi-agent communication with backpropagation. arXiv. https://doi.org/10.48550/arXiv.1605.07736 [Google Scholar] [Crossref]

135. Sunehag, P., Lever, G., Gruslys, A., Czarnecki, W. M., Zambaldi, V., Jaderberg, M., Lanctot, M., Sonnerat, N., Leibo, J. Z., Tuyls, K., & Graepel, T. (2017). Value-decomposition networks for cooperative multi-agent learning. arXiv. https://doi.org/10.48550/arXiv.1706.05296 [Google Scholar] [Crossref]

136. Suriadi, S., Andrews, R., ter Hofstede, A. H. M., & Wynn, M. T. (2017). Event log imperfection patterns for process mining: Towards a systematic approach. Information Systems, 64, 132–150. https://doi.org/10.1016/j.is.2016.07.011 [Google Scholar] [Crossref]

137. Li, T., et al. (2022). Applications of multi-agent reinforcement learning in future Internet: A comprehensive survey. IEEE Communications Surveys & Tutorials, 24(2), 1240–1279. https://doi.org/10.1109/COMST.2022.3160697 [Google Scholar] [Crossref]

138. Tamar, A., Di Castro, D., & Mannor, S. (2012). Policy gradients with variance-related risk criteria. In Proceedings of the Twenty-Ninth International Conference on Machine Learning. https://doi.org/10.48550/arXiv.1206.6404 [Google Scholar] [Crossref]

139. Tax, N., Verenich, I., La Rosa, M., & Dumas, M. (2017). Predictive business process monitoring with LSTM neural networks. In Advanced Information Systems Engineering (CAiSE 2017) (pp. 477–492). Springer. https://doi.org/10.1007/978-3-319-59536-8_30 [Google Scholar] [Crossref]

140. Teinemaa, I., Dumas, M., La Rosa, M., & Maggi, F. M. (2019). Outcome-oriented predictive process monitoring: Review and benchmark. ACM Transactions on Knowledge Discovery from Data, 13(2), Article 17. https://doi.org/10.1145/3301300 [Google Scholar] [Crossref]

141. Tessler, C., Mankowitz, D., & Mannor, S. (2019). Reward constrained policy optimization. arXiv. https://doi.org/10.48550/arXiv.1805.11074 [Google Scholar] [Crossref]

142. Ma, T. (2009). Coping with uncertainties in technological learning. Management Science, 56(1), 192–201. https://doi.org/10.1287/mnsc.1090.1098 [Google Scholar] [Crossref]

143. van der Aalst, W. M. P. (2016). Process mining: Data science in action (2nd ed.). Springer. https://doi.org/10.1007/978-3-662-49851-4 [Google Scholar] [Crossref]

144. van der Aalst, W. M. P., & Dustdar, S. (2012). Process mining put into context. IEEE Internet Computing, 16(1), 82–86. https://doi.org/10.1109/MIC.2012.12 [Google Scholar] [Crossref]

145. van der Aalst, W. M. P., Adriansyah, A., & van Dongen, B. F. (2012). Replaying history on process models for conformance checking and performance analysis. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(2), 182–192. https://doi.org/10.1002/widm.1045 [Google Scholar] [Crossref]

146. VanderWeele, T. J., & Ding, P. (2017). Sensitivity analysis in observational research: Introducing the E-value. Annals of Internal Medicine, 167(4), 268–274. https://doi.org/10.7326/M16-2607 [Google Scholar] [Crossref]

147. Vermesan, O., Bröring, A., Tragos, E., Serrano, M., Bacciu, D., Chessa, S., ... & Bahr, R. (2022). Internet of robotic things–converging sensing/actuating, hyperconnectivity, artificial intelligence and IoT platforms. In Cognitive hyperconnected digital transformation (pp. 97-155). River Publishers. [Google Scholar] [Crossref]

148. Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., … Silver, D. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782), 350–354. https://doi.org/10.1038/s41586-019-1724-z [Google Scholar] [Crossref]

149. Wagner, A. K., Soumerai, S. B., Zhang, F., & Ross-Degnan, D. (2002). Segmented regression analysis of interrupted time series studies in medication use research. Journal of Clinical Pharmacy and Therapeutics, 27(4), 299–309. https://doi.org/10.1046/j.1365-2710.2002.00430.x [Google Scholar] [Crossref]

150. Wallace, R. B., & Whitt, W. (2005). A staffing algorithm for call centers with skill-based routing. Manufacturing & Service Operations Management, 7(4), 276–294. https://doi.org/10.1287/msom.1050.0086 [Google Scholar] [Crossref]

152. Wang, T., Wang, J., Wu, Y., & Wang, C. (2020). ROMA: Multi-agent reinforcement learning with emergent roles. arXiv. https://doi.org/10.48550/arXiv.2003.08039 [Google Scholar] [Crossref]

153. Xu, H., Chen, W., Zhao, N., Li, Z., Bu, J., Li, Z., Liu, Y., Zhao, J., Pei, D., Feng, Y., & Qiao, Y. (2018). Unsupervised anomaly detection via variational auto-encoder for seasonal KPIs in web applications. In Proceedings of the 2018 World Wide Web Conference (pp. 187–196). ACM. https://doi.org/10.1145/3178876.3185996 [Google Scholar] [Crossref]

155. Yang, Y., Luo, R., Li, M., Zhou, M., Zhang, W., & Wang, J. (2018). Mean field multi-agent reinforcement learning. arXiv. https://doi.org/10.48550/arXiv.1802.05438 [Google Scholar] [Crossref]

156. Yu, C., Velu, A., Vinitsky, E., Wang, Y., Bayen, A., & Wu, Y. (2021). The surprising effectiveness of PPO in cooperative multi-agent games. arXiv. https://doi.org/10.48550/arXiv.2103.01955 [Google Scholar] [Crossref]

158. Yu, Q., Zhao, N., Li, M., Li, Z., Wang, H., Zhang, W., Sui, K., & Pei, D. (2024). A survey on intelligent management of alerts and incidents in IT services. Journal of Network and Computer Applications, 224, 103842. https://doi.org/10.1016/j.jnca.2024.103842 [Google Scholar] [Crossref]

159. Zhang, K., Yang, Z., & Başar, T. (2021). Multi-agent reinforcement learning: A selective overview of theories and algorithms. In K. G. Vamvoudakis, Y. Wan, F. L. Lewis, & D. Cansever (Eds.), Handbook of Reinforcement Learning and Control (Studies in Systems, Decision and Control, Vol. 325). Springer, Cham. https://doi.org/10.1007/978-3-030-60990-0_12 [Google Scholar] [Crossref]

160. Zhang, Y., Lin, K.-J., & Hsu, J. Y. J. (2007). Accountability monitoring and reasoning in service-oriented architectures. Service Oriented Computing and Applications, 1, 35–50. https://doi.org/10.1007/s11761-007-0001-4 [Google Scholar] [Crossref]

161. Zhu, S., Yu, C., Wang, Y., & Wu, Y. (2023). Constrained reinforcement learning: A survey. Autonomous Agents and Multi-Agent Systems, 37, 37. https://doi.org/10.1007/s10458-023-09633-6 [Google Scholar] [Crossref]
