TY  - JOUR
SN  - 2045-2322
ID  - uniromana17885
TI  - Dual-modality fusion for mango disease classification using dynamic attention based ensemble of leaf & fruit images
VL  - 15
JF  - Scientific Reports
Y1  - 2025/11//
A1  - Mohsin, Muhammad
A1  - Hashmi, Muhammad Shadab Alam
A1  - Delgado Noya, Irene
A1  - Garay, Helena
A1  - Abdel Samee, Nagwan
A1  - Ashraf, Imran
UR  - http://doi.org/10.1038/s41598-025-26052-7
IS  - 1
N2  - Mango is one of the most beloved fruits and plays an indispensable role in the agricultural economies of many tropical countries, including Pakistan, India, and several Southeast Asian countries. As with other fruits, mango cultivation is threatened by various diseases, including Anthracnose and Red Rust. Although farmers try to intervene in time, early and accurate detection of mango diseases remains challenging due to multiple factors, such as limited understanding of disease diversity, similarity in symptoms, and frequent misclassification. To address these challenges, this study proposes a multimodal deep learning framework that leverages both leaf and fruit images to improve classification performance and generalization. Individual CNN-based pre-trained models, including ResNet-50, MobileNetV2, EfficientNet-B0, and ConvNeXt, were trained separately on curated datasets of mango leaf and fruit diseases. A novel Modality Attention Fusion (MAF) mechanism was introduced to dynamically weight and combine predictions from both modalities based on their discriminative strength, as some diseases are more prominent on leaves than on fruits, and vice versa. To address overfitting and improve generalization, a class-aware augmentation pipeline was integrated, which performs augmentation according to the specific characteristics of each class. The proposed attention-based fusion strategy significantly outperformed individual models and static fusion approaches, achieving a test accuracy of 99.08%, an F1 score of 99.03%, and a near-perfect ROC-AUC of 99.96% using EfficientNet-B0 as the base. To evaluate the model's real-world applicability, an interactive web application was developed using the Django framework and evaluated through out-of-distribution (OOD) testing on diverse mango samples collected from public sources. These findings underline the importance of combining visual cues from multiple plant organs and adapting model attention to contextual features for real-world agricultural diagnostics.
KW  - Plant disease detection
KW  - Multimodal approach
KW  - Class-aware augmentation
KW  - Modality attention fusion
KW  - Out-of-distribution
AV  - public
ER  - 