Architectural Patterns for High-Load Distributed Systems with AI-Driven Optimization in Production Environments
Matvii Horskyi, Senior Software Engineer, Austin, TX, United States
Abstract
This article presents a systematic analysis of architectural patterns for integrating AI-driven optimization into high-load distributed systems operating in production environments. Such systems, built on cloud-native, containerized, and edge platforms, are characterized by dynamic workloads, strict service-level requirements, and high sensitivity to control errors, which limit the effectiveness of reactive and rule-based orchestration mechanisms. The study is conducted as a review-and-analytical synthesis of peer-reviewed publications, focusing on architectural placement of intelligent components, types of control loops, and operational constraints, without quantitative aggregation of results due to heterogeneity of experimental settings and metrics. Particular attention is paid to production-oriented optimization patterns that preserve standard orchestration mechanisms while constraining or parameterizing their decision space, as well as to approaches that introduce intelligence into scheduling and workflow-level coordination. The analysis highlights the trade-offs between measurable performance and cost gains, increased architectural complexity, control-loop stability, and requirements for data quality and interpretability. It is shown that isolated use of AI techniques for scaling or scheduling does not yield sustainable benefits in industrial settings, whereas the most robust effects are achieved when intelligent mechanisms are embedded into managed control loops and operate as adaptive but bounded elements of system governance. The study establishes that the effectiveness of AI-driven optimization is determined not by model sophistication, but by the architectural consistency of intelligent components with the control plane, their ability to respect service constraints, and their impact on system stability. 
The article is intended for researchers and practitioners in distributed systems, cloud and edge computing, and infrastructure architecture concerned with deploying AI-based optimization under production constraints.
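The pattern of a bounded intelligent component inside a managed control loop can be illustrated with a minimal sketch. It is not taken from the article: the names `Bounds`, `reactive_replicas`, and `bounded_decision` are hypothetical, and the model's suggestion is assumed to arrive as a plain integer replica count. The sketch only shows how a standard rule-based decision can remain the fallback while the model's output is rate-limited and clamped before it reaches the orchestrator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bounds:
    """Hypothetical guardrails set by operators, not by the model."""
    min_replicas: int  # floor derived from service-level requirements
    max_replicas: int  # ceiling derived from capacity or cost limits
    max_step: int      # largest replica change allowed per control cycle


def reactive_replicas(current: int, utilization_pct: int, target_pct: int) -> int:
    """Rule-based baseline: scale in proportion to utilization / target.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    return max(1, -(-current * utilization_pct // target_pct))


def bounded_decision(current: int, baseline: int,
                     suggestion: Optional[int], bounds: Bounds) -> int:
    """Combine a model suggestion with the baseline under fixed bounds.

    If the model gives no suggestion, the rule-based baseline is used.
    The chosen value is first rate-limited around the current replica
    count (for loop stability) and then clamped to the absolute range.
    """
    desired = baseline if suggestion is None else suggestion
    # Rate limit: move at most max_step replicas per cycle.
    desired = min(max(desired, current - bounds.max_step),
                  current + bounds.max_step)
    # Absolute clamp: never leave the operator-defined range.
    return min(max(desired, bounds.min_replicas), bounds.max_replicas)


if __name__ == "__main__":
    bounds = Bounds(min_replicas=2, max_replicas=10, max_step=3)
    baseline = reactive_replicas(current=4, utilization_pct=75, target_pct=50)
    # An aggressive suggestion is accepted only within the allowed step.
    print(bounded_decision(current=4, baseline=baseline,
                           suggestion=40, bounds=bounds))
```

In this sketch the model can shift the decision but cannot leave the operator-defined envelope, which is one simple way to read "adaptive but bounded"; a production controller would also need to address data quality, observability, and interaction with the orchestrator's own stabilization logic.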
Keywords
high-load distributed systems, AI-driven optimization, cloud-native architectures, Kubernetes, control loop architecture, scheduling, workflow orchestration, production environments
Copyright License
Copyright (c) 2026 Matvii Horskyi

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their manuscripts, and all Open Access articles are disseminated under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which licenses unrestricted use, distribution, and reproduction in any medium, provided that the original work is appropriately cited. The use of general descriptive names, trade names, trademarks, and so forth in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.


Engineering and Technology