A Comparative Review of Attribute Selection Techniques for PM2.5 Prediction Using Machine Learning Models

Authors

Dr. Sachin Arun Thanekar

Associate Professor, Computer Science and Engineering Department, MITADT University, Loni Kalbhor, Pune (India)

Article Information

DOI: 10.51244/IJRSI.2025.1210000212

Subject Category: Social science

Volume/Issue: 12/10 | Page No: 2389-2397

Publication Timeline

Submitted: 2025-10-07

Accepted: 2025-10-14

Published: 2025-11-15

Abstract

Accurate prediction of fine particulate matter (PM2.5) is vital for understanding and mitigating the impact of air pollution on public health. As machine learning (ML) is increasingly used in environmental forecasting, selecting the most influential features remains a critical preprocessing step. This review evaluates attribute selection techniques applied to PM2.5 prediction, covering filter, wrapper, and embedded methods. We compare four techniques: correlation analysis (a filter method), Recursive Feature Elimination (RFE, a wrapper method), and Random Forest importance and LASSO regression (embedded methods). The comparison indicates that Random Forest consistently ranks meteorological variables such as temperature and wind speed among the top contributors, whereas LASSO reduces model complexity by shrinking less informative coefficients to zero and retaining mainly core pollutants. The paper offers guidance for researchers developing robust and computationally efficient models for real-time PM2.5 forecasting.
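To make the distinction between a filter method and an embedded method concrete, the following is a minimal sketch in standard-library Python. It is not the procedure used in the paper: the data are synthetic, the feature names, coefficients, and the penalty value `alpha` are assumptions chosen for illustration, and the LASSO solver is a simple coordinate-descent implementation rather than a tuned library routine. The filter step ranks features by absolute Pearson correlation with the target; the embedded step fits an L1-penalised linear model and keeps the features whose coefficients remain non-zero.

```python
import math
import random

# Hypothetical feature names; "random_noise" is deliberately irrelevant.
FEATURES = ["temperature", "wind_speed", "humidity", "no2", "random_noise"]


def make_synthetic_data(n=500, seed=0):
    """Generate invented data. The coefficients are assumptions, not measurements."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        temperature = rng.gauss(25, 5)
        wind_speed = rng.gauss(3, 1)
        humidity = rng.gauss(60, 10)   # has no effect on the target here
        no2 = rng.gauss(40, 10)
        noise_col = rng.gauss(0, 1)    # has no effect on the target here
        pm25 = (30 + 0.1 * temperature - 1.5 * wind_speed
                + 0.2 * no2 + rng.gauss(0, 0.5))
        X.append([temperature, wind_speed, humidity, no2, noise_col])
        y.append(pm25)
    return X, y


def standardize(values):
    """Return z-scores (zero mean, unit population variance)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def correlation_ranking(X, y):
    """Filter method: rank features by |Pearson r| with the target."""
    n = len(y)
    y_std = standardize(y)
    scores = []
    for j, name in enumerate(FEATURES):
        x_std = standardize([row[j] for row in X])
        r = sum(a * b for a, b in zip(x_std, y_std)) / n
        scores.append((name, r))
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha=0.2, n_iter=100):
    """Embedded method: minimise (1/2n)||y - Xw||^2 + alpha * ||w||_1
    on standardized features with a centred target."""
    n = len(y)
    cols = [standardize([row[j] for row in X]) for j in range(len(FEATURES))]
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]
    weights = [0.0] * len(cols)
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Covariance of feature j with the residual, with feature j's
            # own current contribution added back in.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha)  # standardized columns have mean square 1
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
            weights[j] = new_w
    return dict(zip(FEATURES, weights))


X, y = make_synthetic_data()
ranking = correlation_ranking(X, y)
lasso_weights = lasso_coordinate_descent(X, y)
selected = [name for name, w in lasso_weights.items() if w != 0.0]

print("Correlation ranking:", [(name, round(r, 3)) for name, r in ranking])
print("LASSO-selected features:", selected)
```

In this sketch the filter ranking scores each feature independently, while the L1 penalty sets the coefficients of the two irrelevant columns exactly to zero, which is the sense in which LASSO reduces model complexity. Which variables survive on real PM2.5 data depends on the dataset and on the chosen penalty strength.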

Keywords

PM2.5, Feature Selection, Random Forest
