Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification
Access: Open
Language: English
Jabir, Brahim; Díez, Isabel De la Torre; Bautista Thompson, Ernesto; Ramírez-Vargas, Debora L. and Kuc Castilla, Ángel Gabriel (2023). Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification. IEEE Access, p. 1. ISSN 2169-3536
Abstract
Classification is a widely used data mining technique, applied in fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are two main ways to approach it: extend a binary classifier into a multiclass classifier through various strategies, or divide the multiclass problem into multiple binary problems (binarization). Two popular binarization approaches are One vs One (OvO) and One vs All (OvA). Aggregating the outputs of the binary classifiers becomes simpler as the number of classifiers decreases, which favors OvA because it trains one classifier per class rather than one per pair of classes. However, OvA causes an imbalance between the numbers of positive and negative samples, which degrades the performance of each binary classifier. This article contributes to ensemble learning and multiclass classification by proposing Ensemble Partition Sampling (EPS), a new approach to multiclass classification within the OvA framework. Its primary goal is to tackle data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. In the study, EPS was the most effective method for imbalanced and multiclass imbalanced classification, outperforming OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. CART, Random Forest, and SVM were used as classifiers, and EPS consistently outperformed all other algorithms. The findings suggest that EPS is highly effective at improving classification performance on imbalanced and multiclass imbalanced datasets.
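The imbalance that OvA binarization introduces can be made concrete with a small sketch. The code below is not the EPS method itself, which the abstract does not specify in detail; it only shows how relabeling a balanced multiclass dataset as "one class versus the rest" skews each binary task, and how many binary classifiers OvA and OvO train. The function names and the toy dataset are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List, Sequence


def ova_binarize(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Build one binary label vector per class: 1 for that class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def imbalance_ratio(binary_labels: Sequence[int]) -> float:
    """Number of negative samples per positive sample."""
    counts = Counter(binary_labels)
    return counts[0] / counts[1]


def classifier_counts(n_classes: int) -> Dict[str, int]:
    """Binary classifiers trained by each strategy for n_classes classes."""
    return {"OvA": n_classes, "OvO": n_classes * (n_classes - 1) // 2}


# Hypothetical toy dataset: four classes with 25 samples each (balanced).
labels = [c for c in "ABCD" for _ in range(25)]
binary_tasks = ova_binarize(labels)
ratios = {c: imbalance_ratio(v) for c, v in binary_tasks.items()}

print(ratios)                # negatives per positive in each OvA task
print(classifier_counts(4))  # classifiers needed by OvA and by OvO
```

Even though the toy multiclass dataset is balanced, every OvA task ends up with three negatives per positive, and the gap widens as the number of classes grows. This is the per-task imbalance that the abstract says EPS is designed to address.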
| Document Type: | Article |
|---|---|
| Keywords: | Ensemble Partition Sampling (EPS); One vs One (OvO); One vs All (OvA); Multi-Class Classification; Imbalanced learning; multiclass imbalanced classification |
| Subject Classification: | Subjects > Engineering |
| Divisions: | Universidad Europea del Atlántico > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Universidad Internacional Iberoamericana México > Research > Articles and Books; Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production |
| Deposited: | 09 May 2023 23:30 |
| Last Modified: | 09 May 2023 23:30 |
| URI: | https://repositorio.unini.edu.mx/id/eprint/7028 |
Background/Objectives: The growing integration of Artificial Intelligence (AI) and chatbots in health professional education offers innovative methods to enhance learning and clinical preparedness. This study aimed to evaluate the educational impact of the E+DIEting_Lab chatbot platform, and the perceptions of university students of Human Nutrition and Dietetics regarding its utility, usability, and design, when implemented in clinical nutrition training. Methods: The platform was piloted from December 2023 to April 2025 with 475 students from multiple European universities. All 475 students completed the initial survey and 305 finished the follow-up evaluation, a 36% attrition rate. Participants completed surveys before and after interacting with the chatbots, assessing prior experience, knowledge, skills, and attitudes. Data were analyzed using descriptive statistics and independent samples t-tests to compare pre- and post-intervention perceptions. Results: Most students were female (75.4%), with representation from six languages and diverse institutions. Students reported clear perceived learning gains: 79.7% reported that their practical skills in clinical dietetics and communication were updated, 90% felt that new digital tools improved classroom practice, and 73.9% reported enhanced interpersonal skills. Self-rated competence in using chatbots as learning tools increased significantly, with mean knowledge scores rising from 2.32 to 2.66 and skills from 2.39 to 2.79 on a 0–5 Likert scale (p < 0.001 for both). Perceived effectiveness and usefulness of chatbots as self-learning tools remained positive but showed a small decline after use (effectiveness from 3.63 to 3.42; usefulness from 3.63 to 3.45), suggesting that hands-on experience refined, but did not diminish, students’ overall favorable views of the platform.
Conclusions: The implementation and pilot evaluation of the E+DIEting_Lab self-learning virtual patient chatbot platform demonstrate that structured digital simulation tools can significantly improve perceived clinical nutrition competences. These findings support chatbot adoption in dietetics curricula and inform future digital education innovations.
Iñaki Elío Pascual (inaki.elio@uneatlantico.es), Kilian Tutusaus (kilian.tutusaus@uneatlantico.es), Imanol Eguren García (imanol.eguren@uneatlantico.es), Álvaro Lasarte García, Arturo Ortega-Mansilla (arturo.ortega@uneatlantico.es), Thomas Prola (thomas.prola@uneatlantico.es), Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es)
Suicide Ideation Detection Using Social Media Data and Ensemble Machine Learning Model
Identifying the emotional state of individuals has useful applications, particularly in reducing the risk of suicide. Users’ posts on social media platforms can provide cues about their emotional state. Clinical approaches to suicide ideation detection rely primarily on evaluation by psychologists and other medical experts, which is time-consuming and requires medical expertise. Machine learning approaches have shown potential for automating suicide ideation detection. This study presents a soft voting ensemble model (SVEM) that combines random forest, logistic regression, and stochastic gradient descent classifiers using soft voting. In addition, for robust training of SVEM, a hybrid feature engineering approach is proposed that combines term frequency-inverse document frequency and bag of words. For experimental evaluation, the “Suicide Watch” and “Depression” subreddits on the Reddit platform are used. Results indicate that the proposed SVEM model achieves an accuracy of 94%, better than existing approaches. The model also performs robustly on precision, recall, and F1, scoring 0.93 on each. ERT and deep learning models are also evaluated, and the comparison shows that SVEM performs better: gated recurrent unit, long short-term memory, and recurrent neural network models each reach an accuracy of 92%, while the convolutional neural network reaches 91%. SVEM’s computational complexity is also low compared to the deep learning models. Further, this study highlights the importance of explainability in healthcare applications such as suicidal ideation detection, where LIME provides valuable insight into the contribution of different features. In addition, k-fold cross-validation further validates the performance of the proposed approach.
Erol KINA, Jin-Ghoo Choi, Abid Ishaq, Rahman Shafique, Mónica Gracia Villar (monica.gracia@uneatlantico.es), Eduardo René Silva Alvarado (eduardo.silva@funiber.org), Isabel de la Torre Diez, Imran Ashraf
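Soft voting, as named in the abstract above, averages the class probabilities produced by each base model and picks the class with the highest average. The sketch below shows only that aggregation step in plain Python. The probability values are invented for illustration, the three rows merely stand in for the random forest, logistic regression, and stochastic gradient descent classifiers, and equal weighting is an assumption since the abstract gives no weights.

```python
from typing import List, Sequence


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across models (equal weights assumed)."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    return [sum(p[k] for p in probabilities) / n_models for k in range(n_classes)]


def predict(probabilities: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    averaged = soft_vote(probabilities)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Invented outputs for one text; columns are [class 0, class 1].
model_outputs = [
    [0.45, 0.55],  # stand-in for the random forest
    [0.45, 0.55],  # stand-in for logistic regression
    [0.90, 0.10],  # stand-in for the SGD classifier
]
averaged = soft_vote(model_outputs)
label = predict(model_outputs)
print(averaged, label)
```

In this made-up case two of the three models lean slightly toward class 1, so a majority (hard) vote would pick class 1, while soft voting picks class 0 because the third model is far more confident. That sensitivity to confidence is the usual reason for choosing soft over hard voting.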
Human metapneumovirus (hMPV) is a potential pandemic pathogen and a concern for elderly subjects and immunocompromised patients. There is no vaccine or specific antiviral available for hMPV. We conducted an in-silico study to predict initial antiviral candidates against human metapneumovirus. Our methodology included protein modeling, stability assessment, molecular docking, molecular simulation, analysis of non-covalent interactions, bioavailability, carcinogenicity, and pharmacokinetic profiling. We identified four plant-derived bio-compounds as antiviral candidates. Among the compounds, apigenin showed the highest binding affinity, with values of −8.0 kcal/mol for the hMPV-F protein and −7.6 kcal/mol for the hMPV-N protein. Molecular dynamics simulations and further analyses confirmed that the protein-ligand docked complexes exhibited acceptable stability compared to two standard antiviral drugs. Additionally, these four compounds yielded satisfactory outcomes in bioavailability, drug-likeness, ADME-Tox (absorption, distribution, metabolism, excretion, and toxicity), and STopTox analyses. This study highlights the potential of apigenin and xanthoangelol E as initial antiviral candidates, underscoring the necessity for wet-lab evaluation and for preclinical and clinical trials against human metapneumovirus infection.
Hasan Huzayfa Rahaman, Afsana Khan, Nadim Sharif, Wasifuddin Ahmed, Nazmul Sharif, Rista Majumder, Silvia Aparicio Obregón (silvia.aparicio@uneatlantico.es), Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Isabel De la Torre Díez, Shuvra Kanti Dey
Introduction: Jackfruit cultivation is highly affected by leaf diseases that reduce yield, fruit quality, and farmer income. Early diagnosis remains challenging due to the limitations of manual inspection and the lack of automated and scalable disease detection systems. Existing deep-learning approaches often suffer from limited generalization and high computational cost, restricting real-time field deployment. Methods: This study proposes CNNAttLSTM, a hybrid deep-learning architecture integrating Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) units, and an attention mechanism for multi-class classification of algal leaf spot, black spot, and healthy jackfruit leaves. Each image is divided into ordered 56×56 spatial patches, treated as pseudo-temporal sequences to enable the LSTM to capture contextual dependencies across different leaf regions. Spatial features are extracted via Conv2D, MaxPooling, and GlobalAveragePooling layers; temporal modeling is performed by LSTM units; and an attention mechanism assigns adaptive weights to emphasize disease-relevant regions. Experiments were conducted on a publicly available Kaggle dataset comprising 38,019 images, using predefined training, validation, and testing splits. Results: The proposed CNNAttLSTM model achieved 99% classification accuracy, outperforming the baseline CNN (86%) and CNN–LSTM (98%) models. It required only 3.7 million parameters, trained in 45 minutes on an NVIDIA Tesla T4 GPU, and achieved an inference time of 22 milliseconds per image, demonstrating high computational efficiency. The patch-based pseudo-temporal approach improved spatial–temporal feature representation, enabling the model to distinguish subtle differences between visually similar disease classes. Discussion: Results show that combining spatial feature extraction with temporal modeling and attention significantly enhances robustness and classification performance in plant disease detection. 
The lightweight design enables real-time and edge-device deployment, addressing a major limitation of existing deep-learning techniques. The findings highlight the potential of CNNAttLSTM for scalable, efficient, and accurate agricultural disease monitoring and broader precision agriculture applications.
Gaurav Tuteja, Fuad Ali Mohammed Al-Yarimi, Amna Ikram, Rupesh Gupta, Ateeq Ur Rehman, Jeewan Singh, Irene Delgado Noya (irene.delgado@uneatlantico.es), Luis Alonso Dzul López (luis.dzul@uneatlantico.es)
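The patch-based pseudo-temporal idea in the abstract above, dividing each image into ordered patches that an LSTM then reads as a sequence, can be sketched without any deep-learning library. The code below covers only the patch ordering step. Row-major order (left to right, then top to bottom) is an assumption, since the abstract says the patches are ordered but not how, and the toy 4×4 grid with 2×2 patches merely stands in for a leaf photograph with 56×56 patches.

```python
from typing import List

Image = List[List[int]]


def to_patch_sequence(image: Image, patch: int) -> List[Image]:
    """Split a 2D grid into non-overlapping square patches in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    sequence = []
    for top in range(0, height, patch):        # walk patch rows top to bottom
        for left in range(0, width, patch):    # walk patches left to right
            sequence.append(
                [row[left:left + patch] for row in image[top:top + patch]]
            )
    return sequence


# Toy 4x4 "image" whose pixel values are just their row-major index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = to_patch_sequence(image, 2)
for step, p in enumerate(patches):
    print(step, p)
```

Each element of `patches` would be one time step for the sequence model; in the described architecture a convolutional feature extractor would presumably run on each patch before the LSTM, but that part is not shown here.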
End-to-end emergency response protocol for tunnel accidents augmentation with reinforcement learning
Autonomous unmanned aerial vehicles (UAVs) offer cost-effective and flexible solutions for a wide range of real-world applications, particularly in hazardous and time-critical environments. Their ability to navigate autonomously, communicate rapidly, and avoid collisions makes UAVs well suited for emergency response scenarios. However, real-time path planning in dynamic and unpredictable environments remains a major challenge, especially in confined tunnel infrastructures where accidents may trigger fires, smoke propagation, debris, and rapid environmental changes. In such conditions, conventional preplanned or model-based navigation approaches often fail due to limited visibility, narrow passages, and the absence of reliable localization signals. To address these challenges, this work proposes an end-to-end emergency response framework for tunnel accidents based on Multi-Agent Reinforcement Learning (MARL). Each UAV operates as an independent learning agent using an Independent Q-Learning paradigm, enabling real-time decision-making under limited computational resources. To mitigate premature convergence and local optima during exploration, Grey Wolf Optimization (GWO) is integrated as a policy-guidance mechanism within the reinforcement learning (RL) framework. A customized reward function is designed to prioritize victim discovery, penalize unsafe behavior, and explicitly discourage redundant exploration among agents. The proposed approach is evaluated using a frontier-based exploration simulator under both single-agent and multi-agent settings with multiple goals. Extensive simulation results demonstrate that the proposed framework achieves faster goal discovery, improved map coverage, and reduced rescue time compared to state-of-the-art GWO-based exploration and random search algorithms. These results highlight the effectiveness of lightweight MARL-based coordination for autonomous UAV-assisted tunnel emergency response.
Hafiz Muhammad Raza ur Rehman, M. Junaid Gul, Rabbiya Younas, Muhammad Zeeshan Jhandir, Roberto Marcelo Álvarez (roberto.alvarez@uneatlantico.es), Yini Airet Miró Vera (yini.miro@uneatlantico.es), Imran Ashraf
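In Independent Q-Learning, as named in the abstract above, each agent keeps its own value table and applies the standard Q-learning update as if it were alone, treating the other agents as part of the environment. The sketch below shows that tabular update for two agents. The state names, action names, reward value, learning rate, and discount factor are all illustrative assumptions, and the Grey Wolf Optimization guidance and the customized reward function from the abstract are not modeled.

```python
from collections import defaultdict
from typing import Hashable, Iterable


class IndependentQAgent:
    """One agent with a private Q-table, updated without reference to other agents."""

    def __init__(self, actions: Iterable[str], alpha: float = 0.5, gamma: float = 0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.q = defaultdict(float)   # maps (state, action) to an estimated value

    def best_value(self, state: Hashable) -> float:
        """Highest estimated value over all actions available in a state."""
        return max(self.q[(state, a)] for a in self.actions)

    def update(self, state: Hashable, action: str, reward: float, next_state: Hashable) -> None:
        """Standard Q-learning step toward reward + gamma * max_a' Q(next_state, a')."""
        target = reward + self.gamma * self.best_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Two hypothetical UAV agents exploring the same tunnel; only the first one learns here.
agents = [IndependentQAgent(["forward", "back"]) for _ in range(2)]
agents[0].update("cell_0", "forward", 10.0, "cell_1")  # hypothetical victim-found reward
agents[0].update("cell_0", "forward", 10.0, "cell_1")

print(agents[0].q[("cell_0", "forward")])
print(agents[1].q[("cell_0", "forward")])
```

Because the tables are private, the second agent's estimate is untouched by the first agent's experience. Keeping agents independent avoids a joint action space, which fits the abstract's emphasis on real-time decision-making under limited computational resources.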
