CHAWLA, D.; CHAWLA, D. Scalable Large Language Model Inference in Cloud Ecosystems: Enterprise-Scale Performance Optimization and Resource-Aware Architectures. Journal of Neonatal Surgery, Lahore, Pakistan, v. 14, n. 28S, p. 1124–1139, 2025. Available at: https://www.jneonatalsurg.com/index.php/jns/article/view/9206. Accessed: 6 Oct. 2025.