Enhanced Multi-Task CNN For Age, Gender, Race with Mask in Facial Images

Kimenyi Butera John Bosco; Yonggang Chi

doi:10.51244/IJRSI.2026.1303000208

Authors

Kimenyi Butera John Bosco

School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, Heilongjiang 150001, People’s Republic of China

Yonggang Chi

State Key Laboratory Communication Research Center, Department of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, Heilongjiang 150001, People’s Republic of China

Article Information

DOI: 10.51244/IJRSI.2026.1303000208

Subject Category: Computer Vision

Volume/Issue: 13/3 | Page No: 2414-2446

Publication Timeline

Submitted: 2026-03-22

Accepted: 2026-03-28

Published: 2026-04-15

Abstract

Facial attribute analysis is a critical technology for security, human-computer interaction, and public health. However, conventional models that estimate age, gender, and race independently are computationally inefficient and struggle with real-world challenges, particularly facial occlusions such as face masks. This paper proposes an enhanced Multi-Task Convolutional Neural Network (CNN) that addresses these limitations by simultaneously predicting age, gender, race, and mask presence from a single input image. The architecture employs a shared ResNet-50 backbone for feature extraction, enhanced with a dedicated attention mechanism that improves robustness to occlusions by focusing on the most relevant facial regions. Task-specific heads with dropout and batch normalisation are integrated to support generalisation. The model was evaluated using a comprehensive set of regression and classification metrics. Results show that the multi-task framework outperforms traditional single-task models, achieving a mask detection accuracy above 95%, a gender classification accuracy exceeding 91%, a race classification accuracy of over 86%, and an age estimation mean absolute error (MAE) below 6 years. The study indicates that integrating multi-task learning with an occlusion-aware attention mechanism creates a more efficient, accurate, and robust system for facial analysis. The proposed model shows strong potential for deployment in real-world applications where reliability in the presence of occlusions is essential.
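The abstract reports one regression metric (MAE for age) and three classification accuracies (gender, race, mask). As a minimal illustrative sketch of how such a per-task report could be computed, the Python below uses only the standard library. The function names, the dictionary layout, and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
from statistics import mean


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute gap, in years, between true and predicted ages."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(t - p) for t, p in zip(true_ages, predicted_ages))


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def evaluate(outputs):
    """Build a per-task report from a dict of task -> (truth, prediction).

    The task keys ("age", "gender", "race", "mask") are hypothetical names
    chosen to mirror the four tasks listed in the abstract.
    """
    return {
        "age_mae": mean_absolute_error(*outputs["age"]),
        "gender_acc": accuracy(*outputs["gender"]),
        "race_acc": accuracy(*outputs["race"]),
        "mask_acc": accuracy(*outputs["mask"]),
    }


# Toy data for illustration only; these are not results from the paper.
toy_outputs = {
    "age": ([25, 40, 60, 33], [27, 38, 65, 33]),
    "gender": (["F", "M", "M", "F"], ["F", "M", "F", "F"]),
    "race": ([0, 1, 2, 3], [0, 1, 2, 3]),
    "mask": ([1, 0, 1, 1], [1, 0, 1, 0]),
}
report = evaluate(toy_outputs)
```

Reporting each task separately, rather than as one aggregate score, keeps a regression error measured in years from being mixed with unitless accuracies.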

Keywords

Multi-Task, Race, Mask, Facial Images
