Deep transfer learning-based bird species classification using mel spectrogram images

Open access. Language: English.

Shaikh, Asadullah; Baowaly, Mrinal Kanti; Sarkar, Bisnu Chandra; Walid, Md. Abul Ala; Ahamad, Md. Martuza; Singh, Bikash Chandra; Silva Alvarado, Eduardo René (eduardo.silva@funiber.org); Ashraf, Imran and Samad, Md. Abdus (2024) Deep transfer learning-based bird species classification using mel spectrogram images. PLOS ONE, 19 (8). e0305708. ISSN 1932-6203

Text: journal.pone.0305708.pdf (1MB)
Available under License Creative Commons Attribution.

Official URL: http://doi.org/10.1371/journal.pone.0305708

Abstract

The classification of bird species is important in ornithology because it supports the assessment and monitoring of environmental dynamics, including habitat modifications, migratory behaviors, pollution levels, and disease occurrences. Traditional methods of bird classification, such as visual identification, are time-intensive and require a high level of expertise. Audio-based classification is a promising approach to automating bird species identification. This study aims to establish an audio-based classification system for 264 Eastern African bird species using modified deep transfer learning. In particular, the pre-trained EfficientNet model is fine-tuned to learn the patterns in mel spectrogram images that are pertinent to this classification task. The fine-tuned EfficientNet model is combined with two types of recurrent neural network (RNN), namely the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM). The RNNs capture the temporal dependencies in audio signals, thereby enhancing classification accuracy. The dataset contains nearly 17,000 bird sound recordings across a diverse range of species. Experiments were conducted with several combinations of EfficientNet and RNNs; EfficientNet-B7 with GRU outperformed the other models, with an accuracy of 84.03% and a macro-average precision score of 0.8342.
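The two headline figures in the abstract, accuracy and macro-average precision, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the species labels below are hypothetical, and the convention that a never-predicted class contributes a precision of 0 is an assumption.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of recordings whose predicted species matches the label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class precision.

    Every class seen in the labels or the predictions counts equally, so rare
    species weigh as much as common ones. A class that is never predicted
    contributes 0.0 (an assumed convention).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    true_pos = defaultdict(int)   # correct predictions per class
    predicted = defaultdict(int)  # total predictions per class
    for t, p in zip(y_true, y_pred):
        predicted[p] += 1
        if t == p:
            true_pos[p] += 1
    classes = set(y_true) | set(y_pred)
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(per_class)


# Hypothetical labels for four recordings (not taken from the dataset).
labels = ["hoopoe", "hoopoe", "sunbird", "weaver"]
preds = ["hoopoe", "sunbird", "sunbird", "weaver"]
acc = accuracy(labels, preds)
macro_p = macro_precision(labels, preds)
```

Because the macro average gives each of the 264 species the same weight regardless of how many recordings it has, it can differ noticeably from accuracy on an imbalanced dataset.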

Document Type:	Article
Subjects:	Subjects > Engineering
Divisions:	Universidad Europea del Atlántico > Research > Scientific Production Universidad Internacional Iberoamericana México > Research > Scientific Production Universidad de La Romana > Research > Scientific Production
Deposited:	19 Sep 2024 23:30
Last Modified:	19 Sep 2024 23:30
URI:	https://repositorio.uniromana.edu.do/id/eprint/14280


Single-cell omics for nutrition research: an emerging opportunity for human-centric investigations

Understanding how dietary compounds affect human health is challenged by their molecular complexity and cell-type–specific effects. Conventional multi-cell type (bulk) analyses obscure cellular heterogeneity, while animal and standard in vitro models often fail to replicate human physiology. Single-cell omics technologies—such as single-cell RNA sequencing, as well as single-cell–resolved proteomic and metabolomic approaches—enable high-resolution investigation of nutrient–cell interactions and reveal mechanisms at a single-cell resolution. When combined with advanced human-derived in vitro systems like organoids and organ-on-chip platforms, they support mechanistic studies in physiologically relevant contexts. This review outlines emerging applications of single-cell omics in nutrition research, emphasizing their potential to uncover cell-specific dietary responses, identify nutrient-sensitive pathways, and capture interindividual variability. It also discusses key challenges—including technical limitations, model selection, and institutional biases—and identifies strategic directions to facilitate broader adoption in the field. Collectively, single-cell omics offer a transformative framework to advance human-centric nutrition research.

Manuela Cassotta (manucassotta@gmail.com), Yasmany Armas Diaz, Danila Cianciosi, Bei Yang, Zexiu Qi, Ge Chen, Santos Gracia Villar (santos.gracia@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es), Giuseppe Grosso, José L. Quiles, Jianbo Xiao, Maurizio Battino (maurizio.battino@uneatlantico.es), Francesca Giampieri (francesca.giampieri@uneatlantico.es)

Open access

Image-Based Dietary Energy and Macronutrients Estimation with ChatGPT-5: Cross-Source Evaluation Across Escalating Context Scenarios

Background/Objectives: Estimating energy and macronutrients from food images is clinically relevant yet challenging, and rigorous evaluation requires transparent accuracy metrics with uncertainty and clear acknowledgement of reference data limitations across heterogeneous sources. This study assessed ChatGPT-5, a general-purpose vision-language model, across four scenarios differing in the amount and type of contextual information provided, using a composite dataset to quantify accuracy for calories and macronutrients.

Methods: A total of 195 dishes were evaluated, sourced from Allrecipes.com, the SNAPMe dataset, and home-prepared, weighed meals. Each dish was evaluated under Case 1 (image only), Case 2 (image plus standardized non-visual descriptors), Case 3 (image plus ingredient lists with amounts), and Case 4 (the same information as Case 3 but without the image). The primary endpoint was kcal Mean Absolute Error (MAE); secondary endpoints included Median Absolute Error (MedAE) and Root Mean Square Error (RMSE) for kcal and macronutrients (protein, carbohydrates, and lipids), all reported with 95% Confidence Intervals (CIs) via dish-level bootstrap resampling and accompanied by absolute differences (Δ) between scenarios. Inference settings were standardized to support reproducibility and variance estimation. Source-stratified analyses and quartile summaries were conducted to examine heterogeneity by curation level and nutrient ranges, with additional robustness checks for error-complexity relationships.

Results and Discussion: Accuracy improved from Case 1 to Case 2 and further in Case 3 for energy and all macronutrients when summarized by MAE, MedAE, and RMSE with 95% CIs, with absolute reductions (Δ) indicating material gains as contextual information increased. In contrast to Case 3, estimation accuracy declined in Case 4, underscoring the contribution of visual cues. Gains were largest in the home-prepared, dietitian-weighed subset and smaller yet consistent for Allrecipes.com and SNAPMe, reflecting differences in reference curation and measurement fidelity across sources. Scenario-level trends were concordant across sources, and stratified and quartile analyses showed coherent patterns of decreasing absolute errors with the provision of structured non-visual information and detailed ingredient data.

Conclusions: ChatGPT-5 can deliver practically useful calorie and macronutrient estimates from food images, particularly when augmented with standardized non-visual descriptors and detailed ingredients, as evidenced by reductions in MAE, MedAE, and RMSE with 95% CIs across scenarios. The decline in accuracy observed when the image was omitted, despite providing detailed ingredient information, indicates that visual cues contribute meaningfully to estimation performance and that improvements are not solely attributable to arithmetic from ingredient lists. Finally, to promote generalizability, it is recommended that future studies include repeated evaluations across diverse datasets, ensure public availability of prompts and outputs, and incorporate systematic comparisons with non-artificial-intelligence baselines.
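The error metrics named in the Methods (MAE, MedAE, RMSE) and a dish-level bootstrap confidence interval can be sketched as follows. This is an illustration only, not the study's analysis code: the percentile method, the number of resamples, the seed, and the kcal error values are all assumptions.

```python
import math
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def mae(errors: Sequence[float]) -> float:
    """Mean Absolute Error of per-dish errors (estimate minus reference)."""
    return statistics.fmean(abs(e) for e in errors)


def medae(errors: Sequence[float]) -> float:
    """Median Absolute Error, less sensitive to a few badly estimated dishes."""
    return statistics.median(abs(e) for e in errors)


def rmse(errors: Sequence[float]) -> float:
    """Root Mean Square Error, which penalizes large errors more heavily."""
    return math.sqrt(statistics.fmean(e * e for e in errors))


def bootstrap_ci(
    errors: Sequence[float],
    metric: Callable[[Sequence[float]], float],
    n_resamples: int = 2000,
    level: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI: resample dishes with replacement, recompute the metric."""
    rng = random.Random(seed)
    data: List[float] = list(errors)
    stats = sorted(
        metric(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = stats[int(alpha * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lower, upper


# Hypothetical per-dish kcal errors (estimate minus reference), not study data.
kcal_errors = [-120.0, 35.0, 60.0, -15.0, 210.0, -80.0, 5.0, 40.0]
point_mae = mae(kcal_errors)
point_medae = medae(kcal_errors)
point_rmse = rmse(kcal_errors)
ci_low, ci_high = bootstrap_ci(kcal_errors, mae)
```

Resampling whole dishes keeps each dish's error intact, so the interval reflects dish-to-dish variability in the evaluated sample.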

Marcela Rodríguez-Jiménez, Gustavo Daniel Martín-del-Campo-Becerra, Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Jorge Crespo-Álvarez (jorge.crespo@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es)

Open access

Dual-modality fusion for mango disease classification using dynamic attention based ensemble of leaf & fruit images

Mango is one of the most beloved fruits and plays an indispensable role in the agricultural economies of many tropical countries, such as Pakistan, India, and Southeast Asian countries. Like other fruits, mango is threatened by various diseases, including Anthracnose and Red Rust. Although farmers try to mitigate these diseases in time, early and accurate detection remains challenging because of limited understanding of disease diversity, similarity in symptoms, and frequent misclassification. To address this, the study proposes a multimodal deep learning framework that leverages both leaf and fruit images to improve classification performance and generalization. Individual CNN-based pre-trained models, including ResNet-50, MobileNetV2, EfficientNet-B0, and ConvNeXt, were trained separately on curated datasets of mango leaf and fruit diseases. A novel Modality Attention Fusion (MAF) mechanism was introduced to dynamically weight and combine predictions from both modalities based on their discriminative strength, as some diseases are more prominent on leaves than on fruits, and vice versa. To address overfitting and improve generalization, a class-aware augmentation pipeline was integrated, which performs augmentation according to the specific characteristics of each class. The proposed attention-based fusion strategy significantly outperformed individual models and static fusion approaches, achieving a test accuracy of 99.08%, an F1 score of 99.03%, and a near-perfect ROC-AUC of 99.96% using EfficientNet-B0 as the base. To evaluate the model's real-world applicability, an interactive web application was developed using the Django framework and evaluated through out-of-distribution (OOD) testing on diverse mango samples collected from public sources. These findings underline the importance of combining visual cues from multiple plant organs and adapting model attention to contextual features for real-world agricultural diagnostics.
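The idea of weighting two modalities per sample, rather than with fixed weights, can be sketched as below. This is not the paper's MAF mechanism, whose details the abstract does not give: using negative entropy as the measure of "discriminative strength", the softmax over those scores, and the `temperature` parameter are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; low entropy means a confident prediction."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def fuse_predictions(
    leaf_probs: Sequence[float],
    fruit_probs: Sequence[float],
    temperature: float = 1.0,
) -> Tuple[List[float], Tuple[float, float]]:
    """Combine two class-probability vectors with per-sample modality weights.

    Each modality is scored by its negative entropy (assumed stand-in for
    discriminative strength); a softmax over the two scores gives the weights.
    """
    if len(leaf_probs) != len(fruit_probs):
        raise ValueError("both modalities must predict the same classes")
    scores = [
        -entropy(leaf_probs) / temperature,
        -entropy(fruit_probs) / temperature,
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    w_leaf, w_fruit = exps[0] / total, exps[1] / total
    fused = [w_leaf * a + w_fruit * b for a, b in zip(leaf_probs, fruit_probs)]
    return fused, (w_leaf, w_fruit)


# Hypothetical outputs for one plant: the leaf model is confident, the fruit
# model is undecided, so the leaf prediction should dominate the fusion.
leaf = [0.90, 0.05, 0.05]
fruit = [1 / 3, 1 / 3, 1 / 3]
fused, (w_leaf, w_fruit) = fuse_predictions(leaf, fruit)
```

A static fusion would use the same weights for every sample; here the weights shift toward whichever modality is more decisive for the sample at hand.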

Muhammad Mohsin, Muhammad Shadab Alam Hashmi, Irene Delgado Noya (irene.delgado@uneatlantico.es), Helena Garay (helena.garay@uneatlantico.es), Nagwan Abdel Samee, Imran Ashraf

Socio-economic status, food security and adherence to the Mediterranean diet in five Mediterranean countries: the DELICIOUS project

Food security is a universal need. This study explored the relationship between food security and adherence to the Mediterranean diet in the context of the DELICIOUS project. A survey involving 2,011 parents of children and adolescents aged 6–17 years was conducted. Adherence to the Mediterranean diet was assessed through the KIDMED score. Information regarding the ease of accessing Mediterranean foods, economic allowance, employment and residence was collected. Logistic regression analyses were performed to test the associations. Individuals living in rural areas and reporting difficulty in obtaining all studied foods were less likely to follow the Mediterranean diet. Higher adherence was associated with a household monthly income higher than €4000. No associations with family status and no differences across countries were found. The progressive shift away from the Mediterranean diet may depend not only on cultural preferences for unhealthier, industrial alternatives but also on family budgets and food accessibility.

Francesca Scazzina, Alice Rosi, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Carlos Poveda-Loor, Osama Abdelkarim, Mohamed Aly, Evelyn Frias-Toral, Juancho Pons, Laura Vázquez-Araújo, Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es), Lorenzo Monasta, Nadia Paladino, Ana Mata, Adrián Chacón, Pablo Busó, Giuseppe Grosso

Open access

Enhancing detection of epileptic seizures using transfer learning and EEG brain activity signals

Epileptic seizures are neurological events characterized by sudden and excessive electrical discharges in the brain, leading to disruptions in brain function. They can lead to life-threatening situations such as status epilepticus, which is characterized by prolonged or recurrent seizures and may cause respiratory distress, aspiration pneumonia, and cardiac arrhythmias. Therefore, there is a need for an automated approach that can efficiently diagnose epileptic seizures at an early stage. The primary objective of this study is to develop a highly accurate approach for the early diagnosis of epileptic seizures. We use electroencephalography (EEG) signal data based on different brain activities to conduct experiments for epileptic seizure detection. For this purpose, a novel transfer learning technique called random forest-gated recurrent unit (RFGR) is proposed. The EEG brain activity signal data are fed into the RFGR model to generate a new feature set. The newly generated features are based on the class prediction probabilities extracted by the RFGR and are used to train models. Extensive experiments are carried out to investigate the performance of the proposed approach. Results demonstrate that the RFGR, when used with the random forest model, outperforms state-of-the-art techniques, achieving an accuracy of 99.00%. Additionally, explainable artificial intelligence analysis is used to provide transparent and understandable explanations of the decision-making processes of the proposed approach.
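The feature-generation step described above, in which class prediction probabilities become the inputs of a downstream model, can be sketched in outline. This is not the RFGR implementation: the two base models below are hypothetical hand-written stand-ins for a trained random forest and a trained GRU, and the signal values are invented.

```python
from typing import Callable, List, Sequence

Window = Sequence[float]
ProbModel = Callable[[Window], List[float]]


def probability_features(models: Sequence[ProbModel], window: Window) -> List[float]:
    """Concatenate each base model's class probabilities into one feature vector."""
    features: List[float] = []
    for model in models:
        features.extend(model(window))
    return features


def amplitude_model(window: Window) -> List[float]:
    """Hypothetical stand-in for a trained random forest.

    Returns [P(non-seizure), P(seizure)] from the peak absolute amplitude.
    """
    p_seizure = min(1.0, max(abs(v) for v in window) / 100.0)
    return [1.0 - p_seizure, p_seizure]


def variability_model(window: Window) -> List[float]:
    """Hypothetical stand-in for a trained GRU.

    Returns [P(non-seizure), P(seizure)] from the variance of the window.
    """
    mean = sum(window) / len(window)
    variance = sum((v - mean) ** 2 for v in window) / len(window)
    p_seizure = variance / (variance + 100.0)
    return [1.0 - p_seizure, p_seizure]


# One invented EEG window; the resulting 4-value vector would be the input
# to the final classifier (a random forest in the abstract's best result).
eeg_window = [2.0, -3.0, 50.0, -45.0, 4.0]
new_features = probability_features([amplitude_model, variability_model], eeg_window)
```

The downstream model then learns from these compact probability vectors instead of the raw signal, which is the sense in which the abstract speaks of a new feature set.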

Erol Kına, Ali Raza, Prudhvi Chowdary Are, Carmen Lilí Rodríguez Velasco (carmen.rodriguez@uneatlantico.es), Julién Brito Ballester (julien.brito@uneatlantico.es), Isabel de la Torre Diez, Naveed Anwer Butt, Imran Ashraf
