Engineering and Technology | Open Access | DOI: https://doi.org/10.37547/tajet/Volume08Issue04-01

CNN-Vit: A Hybrid CNN–Vision Transformer Framework for Accurate and Real-Time Welding Defect Classification With GAN-Based Data Augmentation

Satar Jabar Hussain, Missan Oil Training Institute, Ministry of Oil, Iraq

Abstract

Deep learning–based welding defect classification is often limited by scarce training data, class imbalance, and high model complexity, all of which hinder real-time industrial use. To address these issues, this paper proposes a hybrid CNN–Vision Transformer framework with GAN-based data augmentation for welding defect classification. First, welding images acquired with a wide-dynamic-range visual sensor are preprocessed through binarization, median filtering, morphological dilation, and cropping to enhance defect features. A Generative Adversarial Network (GAN) then generates synthetic samples to alleviate dataset imbalance. A lightweight CNN extracts local spatial features and reduces feature dimensionality; the resulting feature maps are converted into tokens and processed by a Vision Transformer encoder, which captures global contextual relationships via self-attention. The proposed model classifies welding images into four categories: normal, burn-through, undercut, and welding collapse. Experimental results show that the hybrid architecture improves both classification accuracy and computational efficiency over conventional lightweight CNN models. In addition, the model attains 98.25% accuracy on the MNIST dataset, which is reported as further validation of the proposed framework.
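The four preprocessing steps named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the paper's implementation: the function names, the fixed threshold of 128, the 3x3 neighbourhood, and the bounding-box cropping rule are all assumptions made for illustration, and the abstract does not state the actual parameters.

```python
from statistics import median_low


def binarize(img, threshold=128):
    """Map each grayscale pixel to 1 (>= threshold) or 0. Threshold is assumed."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]


def _window(img, r, c):
    """Values in the 3x3 neighbourhood of (r, c), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - 1, 0), min(r + 2, h))
            for j in range(max(c - 1, 0), min(c + 2, w))]


def median_filter(img):
    """3x3 median filter; median_low keeps the output strictly binary."""
    return [[median_low(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """3x3 morphological dilation: a pixel is set if any neighbour is set."""
    return [[max(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def crop_to_foreground(img):
    """Crop to the bounding box of nonzero pixels (unchanged if none exist)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return img
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def preprocess(img, threshold=128):
    """Apply the four steps in the order the abstract lists them."""
    return crop_to_foreground(dilate(median_filter(binarize(img, threshold))))


# Toy 7x7 frame: a bright 3x3 blob plus one isolated bright noise pixel.
frame = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        frame[r][c] = 200
frame[0][6] = 255
result = preprocess(frame)
```

On the toy frame, the median filter suppresses the isolated noise pixel, dilation thickens the remaining blob, and cropping returns a 5x5 region around it. A production pipeline would more likely rely on an image library for these operations; the pure-Python version is used here only to keep the sketch self-contained.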

Keywords

Welding defect, defect classification, deep learning, lightweight CNN, Vision Transformer

How to Cite

Hussain, S. J. (2026). CNN-Vit: A Hybrid CNN–Vision Transformer Framework for Accurate and Real-Time Welding Defect Classification With GAN-Based Data Augmentation. The American Journal of Engineering and Technology, 8(4), 01–11. https://doi.org/10.37547/tajet/Volume08Issue04-01