Comparative Analysis of CNN Architectures for Clean and Non-Clean Outfit Classification in Fashion Images
Abstract
This study develops and compares image classifiers that distinguish clean outfits from non-clean outfits using machine learning and deep learning techniques. The dataset comprises 200 outfit images collected from official fashion brand websites and well-known e-commerce platforms in Indonesia; these secondary data are digital images representing the two style categories. The images are preprocessed (resizing, normalization, and data augmentation), and features are extracted with VGG-16, VGG-19, and Inception v3. The extracted features are then fed into three classifiers: Logistic Regression, Neural Network, and Support Vector Machine (SVM). The models are evaluated with several metrics (AUC, accuracy, F1-score, precision, recall, and MCC) and by visual inspection of MDS and silhouette plots. The results show that the combination of VGG-16 and Logistic Regression performs best, achieving the highest AUC among all model combinations. The MDS and silhouette plots likewise indicate that VGG-16 features separate clean and non-clean outfits most clearly. In short, the study shows that combining CNN-based feature extraction with traditional classification models can significantly improve fashion style recognition accuracy. We hope this work encourages further comparison of CNN feature extractors and classification algorithms and lays a foundation for research on image-based outfit guidance systems for the fashion industry and for service sectors in which professional appearance is a criterion.
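The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the function names are hypothetical, label 1 is taken to mean "clean" and 0 "non-clean", and the toy labels, scores, and 2-D points are invented rather than taken from the study's data or results.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and MCC from binary labels (1 = clean)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_silhouette(points, labels):
    """Mean silhouette coefficient (Rousseeuw, 1987) with Euclidean distance."""
    total = 0.0
    for i, (pi, li) in enumerate(zip(points, labels)):
        dists = {}  # cluster label -> distances from point i to that cluster
        for j, (pj, lj) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(lj, []).append(math.dist(pi, pj))
        if li not in dists:
            continue  # singleton cluster: silhouette is 0 by convention
        a = sum(dists[li]) / len(dists[li])          # mean intra-cluster distance
        b = min(sum(d) / len(d)                      # nearest other cluster
                for lab, d in dists.items() if lab != li)
        total += (b - a) / max(a, b)
    return total / len(points)

# Toy data (invented): true labels, classifier scores, thresholded predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)

# Toy 2-D "features" standing in for an MDS projection of CNN embeddings.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [1, 1, 0, 0]
mean_sil = mean_silhouette(points, labels)

print(metrics, auc_value, round(mean_sil, 3))
```

In practice a library such as scikit-learn would likely be used for these computations; the hand-written versions are shown only to make the definitions explicit. A mean silhouette close to 1 corresponds to the well-separated feature clusters the abstract attributes to VGG-16.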
Copyright (c) 2026 Journal La Multiapp

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



