Forecasting of Air Quality Index Using Machine Learning and Deep Learning Models
Keywords:
Air Quality Index (AQI), Random Forest, XGBoost, LSTM, GRU, Machine Learning, Deep Learning, Forecasting, Environmental Monitoring, Bhopal
Abstract
Accurate forecasting of the Air Quality Index (AQI) is essential for proactive environmental management and public health advisories. This study compares the predictive performance of four supervised learning models, namely Random Forest, XGBoost, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU), for AQI forecasting in Bhopal, Madhya Pradesh, using hourly pollutant and meteorological data from February 2025. The dataset, collected from Central Pollution Control Board (CPCB) monitoring stations, includes key pollutants (PM2.5, PM10, NO₂, SO₂, CO, O₃) and meteorological parameters (temperature, humidity, wind speed, pressure, and rainfall). All models were evaluated using mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). Results indicate that the deep learning models, GRU in particular, outperform the traditional machine learning models, achieving an R² of 0.952 and an RMSE of 14.41. Feature importance analysis identifies PM2.5 and PM10 as the dominant contributors to AQI variation. The study underscores the potential of recurrent neural networks for short-term AQI forecasting and provides a foundation for developing real-time, location-specific environmental alert systems.
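The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function name `regression_metrics` and the sample AQI values below are hypothetical and are not taken from the study; the formulas are the standard definitions, and the MAPE line assumes that no observed value is zero.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent), and R^2 for two equal-length series.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    mae = np.mean(np.abs(err))                      # mean absolute error
    rmse = np.sqrt(np.mean(err ** 2))               # root mean squared error
    # Assumption: y_true contains no zeros (AQI readings are normally positive).
    mape = 100.0 * np.mean(np.abs(err / y_true))    # mean absolute percentage error
    ss_res = np.sum(err ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Made-up observed and predicted AQI values, for illustration only.
observed = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
print(regression_metrics(observed, predicted))
```

MAE and RMSE are expressed in AQI units, with RMSE penalising large errors more heavily, while MAPE and R² are scale-free, which is why the four are often reported together.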
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.