Implementation of Feature Engineering in Prediction of AQI in India using Machine Learning


Reema Gupta
Priti Singla

Abstract

Predicting the Air Quality Index (AQI) has become a necessity, but reliable prediction first requires analysing the preprocessing techniques that can be applied to the data. In this study, we first explore feature engineering techniques (data imputation, scaling, extraction, selection, and data splitting) that can be applied before a machine learning algorithm to improve its results. Second, we use multiple linear regression (MLR) and support vector regression (SVR) with linear and Gaussian kernels to build the prediction models. Finally, we use root mean square error (RMSE), R², mean squared error (MSE), and mean absolute error (MAE) to evaluate how the regression models perform in combination with the feature engineering techniques. The results show that Linear SVR performs best when combined with imputation and a robust scaler (R² = 0.7557834846394744), whereas Gaussian SVR performs best when combined with imputation only. For MLR, the results (R² = 0.7769187383819041) are almost the same in all four cases, and performance degrades when principal component analysis (PCA) is applied.
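
The preprocessing and evaluation steps named in the abstract can be illustrated with a small standard-library sketch. It is not the authors' code: the abstract does not say which imputation strategy was used, so median imputation is an assumption here, and the robust scaler is written in its usual form (subtract the median, divide by the interquartile range). The function names and the toy numbers are hypothetical and are not data from the study.

```python
import math
import statistics


def median_impute(column):
    """Replace None entries with the median of the observed values.

    Median imputation is an assumed strategy; the abstract only says
    "data imputation".
    """
    observed = [v for v in column if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]


def robust_scale(column):
    """Center on the median and divide by the interquartile range (IQR).

    Assumes the IQR is non-zero, i.e. the column is not (nearly) constant.
    """
    q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
    med = statistics.median(column)
    return [(v - med) / (q3 - q1) for v in column]


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired true/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_res = mse * n                                   # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Toy values for illustration only (not taken from the paper).
imputed = median_impute([1.0, None, 3.0, 4.0, 5.0])
scaled = robust_scale([1.0, 2.0, 3.0, 4.0, 5.0])
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
```

Because the robust scaler uses the median and IQR rather than the mean and standard deviation, a few extreme pollutant readings shift the scaled values less than they would under standard scaling, which is one plausible reason to pair it with a linear SVR.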


How to Cite
Gupta, R., & Singla, P. (2023). Implementation of Feature Engineering in Prediction of AQI in India using Machine Learning. International Journal on Recent and Innovation Trends in Computing and Communication, 11(11s), 55–62. https://doi.org/10.17762/ijritcc.v11i11s.8070
