PT Unknown AU David Dueñas Mostafa Kamal Petia Radeva TI Efficient Deep Learning Ensemble for Skin Lesion Classification BT Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications PY 2023 BP 303 EP 314 AB Vision Transformers (ViTs) are deep learning techniques that have been gaining popularity in recent years. In this work, we study the performance of ViTs and Convolutional Neural Networks (CNNs) on skin lesion classification tasks, specifically melanoma diagnosis. We show that, regardless of the individual performance of each architecture, an ensemble of the two can improve generalization. We also present an adaptation of the Gram-OOD* method (detecting out-of-distribution (OOD) samples using Gram matrices) for skin lesion images. Moreover, the integration of super-convergence was critical to building models under strict computing and training time constraints. We evaluated our ensemble of ViTs and CNNs, which placed first in the 2019 and third in the 2020 ISIC Challenge Live Leaderboards (available at https://challenge.isic-archive.com/leaderboards/live/), demonstrating enhanced generalization. ER