A Scalable Architecture for Intelligent Document Processing in Multi-Cloud Environments
Suprakash Dutta, Senior Solutions Architect, Amazon Web Services, Dallas, TX, USA
Abstract
This paper presents a scalable architecture for intelligent document processing (IDP) across multiple cloud platforms, with the goals of simpler operations, greater resilience, and a lower total cost of ownership. The topic matters for two reasons: the market for IDP tools is growing, and multi-cloud adoption is becoming more widespread. Together, these trends sharpen the central tension between wanting best-of-breed services for every processing step and the risks of vendor lock-in, operational complexity, and inconsistent security policies. The study aims to design and justify an end-to-end architecture that abstracts both infrastructure and application dependencies while providing full control and uniform protection in a heterogeneous environment. The novelty lies in a coherent four-layer model: a universal control plane built on Kubernetes and Crossplane; the portable application runtime Dapr, which exposes standard APIs for state management, messaging, and service invocation; decomposed IDP microservices; and an overlay layer for management and security. The key finding is that only the combination of Crossplane at the control-plane level, together with GitOps and Open Policy Agent (OPA) policies, and Dapr at the application-API level can provide real portability, elastic scaling, and governed security while preserving freedom of choice among cloud services. The study shows that workflows crossing provider boundaries can be orchestrated, which reduces vendor lock-in. The article will be useful to cloud-platform architects, IT executives, data and MLOps engineers, IDP product teams, and researchers in distributed systems and enterprise AI.
Keywords
intelligent document processing, multi-cloud architecture, application portability, Kubernetes, Crossplane, GitOps
Copyright License
Copyright (c) 2026 Suprakash Dutta

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain the copyright of their manuscripts, and all Open Access articles are disseminated under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which licenses unrestricted use, distribution, and reproduction in any medium, provided that the original work is appropriately cited. The use of general descriptive names, trade names, trademarks, and so forth in this publication, even if not specifically identified, does not imply that these names are not protected by the relevant laws and regulations.


Applied Sciences
Open Access