Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification
Open access
English
Jabir, Brahim and Díez, Isabel De la Torre and Bautista Thompson, Ernesto and Ramírez-Vargas, Debora L. and Kuc Castilla, Ángel Gabriel (2023) Ensemble Partition Sampling (EPS) for Improved Multi-Class Classification. IEEE Access. p. 1. ISSN 2169-3536
Abstract
Classification is a widely used data mining technique, applied in fields such as sentiment analysis, fraud detection, and fault diagnosis. Multiclass classification, which involves more than two classes, is more complex than binary classification. There are two main approaches: extending a binary classifier into a multiclass classifier through various strategies, or dividing the multiclass problem into multiple binary problems (binarization). Two popular binarization approaches are One vs One (OvO) and One vs All (OvA). Aggregating the outputs of the binary classifiers becomes simpler as the number of classifiers decreases, which favors OvA because it trains one classifier per class rather than one per pair of classes. However, OvA causes an imbalance between the numbers of positive and negative samples, which degrades the performance of each binary classifier. This article contributes to ensemble learning and multiclass classification by proposing Ensemble Partition Sampling (EPS), a new method that operates within the OvA framework. Its primary goal is to tackle data imbalance by incorporating ensemble learning and preprocessing techniques into each binary dataset. In the study's experiments, EPS was the most effective method for imbalanced and multiclass imbalanced classification, outperforming OvA, SMOTE, k-means-SMOTE, Bagging-RB, DES-MI, OvO-EASY, and OvO-SMB. The study used CART, Random Forest, and SVM as classifiers, and EPS consistently outperformed all other algorithms. The findings suggest that EPS is a highly effective method for improving classification performance on imbalanced and multiclass imbalanced datasets.
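The trade-off the abstract describes between OvO and OvA can be illustrated with a short sketch. It counts the binary classifiers each strategy needs, shows the positive/negative imbalance that OvA relabeling creates even on a balanced dataset, and then splits the negative class into positive-sized chunks. All function names are hypothetical, and `partition_negatives` is only one plausible reading of "partition sampling"; this record does not describe the authors' actual EPS procedure, so it should not be taken as their method.

```python
import random
from collections import Counter


def classifier_counts(num_classes):
    """Binary classifiers trained by OvA (one per class) and OvO (one per pair)."""
    ova = num_classes
    ovo = num_classes * (num_classes - 1) // 2
    return ova, ovo


def ova_binarize(labels, positive_class):
    """Relabel a multiclass target as 1 (positive class) vs 0 (all other classes)."""
    return [1 if y == positive_class else 0 for y in labels]


def imbalance_ratio(binary_labels):
    """Number of negative samples per positive sample in one binary problem."""
    counts = Counter(binary_labels)
    return counts[0] / counts[1]


def partition_negatives(binary_labels, seed=0):
    """Assumed illustration, not the published EPS algorithm.

    Shuffles the negative indices, splits them into chunks of roughly the
    positive-class size, and pairs every chunk with all positive indices.
    Each returned index list is a roughly balanced training subset that one
    ensemble member could be fitted on.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(binary_labels) if y == 1]
    neg = [i for i, y in enumerate(binary_labels) if y == 0]
    rng.shuffle(neg)
    n_parts = max(1, round(len(neg) / len(pos)))
    return [sorted(pos + neg[k::n_parts]) for k in range(n_parts)]


# Toy data: four balanced classes with ten samples each.
labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 10 + ["d"] * 10

ova, ovo = classifier_counts(4)           # 4 OvA classifiers vs 6 OvO classifiers
binary = ova_binarize(labels, "a")        # 10 positives, 30 negatives
ratio = imbalance_ratio(binary)           # 3.0 negatives per positive
subsets = partition_negatives(binary)     # 3 subsets of 20 indices each
```

On this toy data, OvA needs fewer classifiers than OvO, but each of its binary problems has three negatives per positive, even though the original classes are balanced. Splitting the negatives into three chunks gives three balanced subsets whose classifiers could then be combined, for example by averaging their scores.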
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Ensemble Partition Sampling (EPS); One vs One (OvO); One vs All (OvA); Multi-Class Classification; Imbalanced learning; multiclass imbalanced classification |
Subjects: | Subjects > Engineering |
Divisions: | Europe University of Atlantic > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Ibero-american International University > Research > Articles and books; Ibero-american International University > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production |
Date Deposited: | 09 May 2023 23:30 |
Last Modified: | 09 May 2023 23:30 |
URI: | https://repositorio.unini.edu.mx/id/eprint/7028 |
Detecting hate in diversity: a survey of multilingual code-mixed image and video analysis
The proliferation of damaging content on social media in today’s digital environment has increased the need for efficient hate speech identification systems. A thorough examination of hate speech detection methods in a variety of settings, such as code-mixed, multilingual, visual, audio, and textual scenarios, is presented in this paper. Unlike previous research focusing on single modalities, our study thoroughly examines hate speech identification across multiple forms. We classify the numerous types of hate speech, showing how it appears on different platforms and emphasizing the unique difficulties in multi-modal and multilingual settings. We fill research gaps by assessing a variety of methods, including deep learning, machine learning, and natural language processing, especially for complicated data like code-mixed and cross-lingual text. Additionally, we offer key technique comparisons, suggesting future research avenues that prioritize multi-modal analysis and ethical data handling, while acknowledging its benefits and drawbacks. This study attempts to promote scholarly research and real-world applications on social media platforms by acting as an essential resource for improving hate speech identification across various data sources.
Hafiz Muhammad Raza Ur Rehman, Mahpara Saleem, Muhammad Zeeshan Jhandir, Eduardo René Silva Alvarado (eduardo.silva@funiber.org), Helena Garay (helena.garay@uneatlantico.es), Imran Ashraf

Accurate solar and photovoltaic (PV) power forecasting is essential for optimizing grid integration, managing energy storage, and maximizing the efficiency of solar power systems. Deep learning (DL) models have shown promise in this area due to their ability to learn complex, non-linear relationships within large datasets. This study presents a systematic literature review (SLR) of deep learning applications for solar PV forecasting, addressing a gap in the existing literature, which often focuses on traditional ML or broader renewable energy applications. This review specifically aims to identify the DL architectures employed, preprocessing and feature engineering techniques used, the input features leveraged, evaluation metrics applied, and the persistent challenges in this field. Through a rigorous analysis of 26 selected papers from an initial set of 155 articles retrieved from the Web of Science database, we found that Long Short-Term Memory (LSTM) networks were the most frequently used algorithm (appearing in 32.69% of the papers), closely followed by Convolutional Neural Networks (CNNs) at 28.85%. Furthermore, Wavelet Transform (WT) was found to be the most prominent data decomposition technique, while Pearson Correlation was the most used for feature selection. We also found that ambient temperature, pressure, and humidity are the most common input features. Our systematic evaluation provides critical insights into state-of-the-art DL-based solar forecasting and identifies key areas for upcoming research. Future research should prioritize the development of more robust and interpretable models, as well as explore the integration of multi-source data to further enhance forecasting accuracy. Such advancements are crucial for the effective integration of solar energy into future power grids.
Oussama Khouili, Mohamed Hanine, Mohamed Louzazni, Miguel Ángel López Flores (miguelangel.lopez@uneatlantico.es), Eduardo García Villena (eduardo.garcia@uneatlantico.es), Imran Ashraf

Measurement of chest muscle mass in COVID-19 patients on mechanical ventilation using tomography
Background: Sarcopenia, characterized by a reduction in skeletal muscle mass and function, is a prevalent complication in the Intensive Care Unit (ICU) and is related to increased mortality. This study aims to determine whether muscle and fat mass measurements at the T12 and L1 vertebrae using chest tomography can predict mortality among critically ill COVID-19 patients requiring invasive mechanical ventilation (MV). Methods: Fifty-one critically ill COVID-19 patients on MV underwent chest tomography within 72 h of ICU admission. Muscle mass was measured using the Core Slicer program. Results: After adjustment for potential confounding factors related to background and clinical parameters, a 1-unit increase in muscle mass, subcutaneous, and intra-abdominal fat mass at the L1 level was associated with approximately 1–2% lower odds of negative outcomes and in-hospital mortality. No significant association was found between muscle mass at the T12 level and patient outcomes. Furthermore, no significant results were observed when considering a 1-standard deviation increase as the exposure variable. Conclusion: Measuring muscle mass using chest tomography at the T12 level does not effectively predict outcomes for ICU patients. However, muscle and fat mass at the L1 level may be associated with a lower risk of negative outcomes. Additional studies should explore other potential markers or methods to improve prognostic accuracy in this critically ill population.
Natalia Daniela Llobera, Evelyn Frias-Toral, Mariel Aquino, María Jimena Reberendo, Laura Cardona Díaz, Adriana García, Martha Montalván, Álvaro Velarde Sotres (alvaro.velarde@uneatlantico.es), Sebastián Chapela

Alzheimer's disease (AD) involves β-amyloid plaques and tau hyperphosphorylation, driven by oxidative stress and neuroinflammation. Cyclooxygenase-2 (COX-2) and acetylcholinesterase (AChE) activities exacerbate AD pathology. Olive leaf (OL) extracts, rich in bioactive compounds, offer potential therapeutic benefits. This study aimed to assess the anti-inflammatory, anti-cholinergic, and antioxidant effects of three OL extracts (low, mid, and high bioactive content) in vitro and their protective effects against AD-related proteinopathies in Caenorhabditis elegans models. OL extracts were characterized for phenolic composition, AChE and COX-2 inhibition, as well as antioxidant capacity. Their effects on intracellular and mitochondrial reactive oxygen species (ROS) were tested in C. elegans models expressing human Aβ and tau proteins. Gene expression analyses examined transcription factors (DAF-16, skinhead [SKN]-1) and their targets (superoxide dismutase [SOD]-2, SOD-3, GST-4, and heat shock protein [HSP]-16.2). High-OL extract demonstrated superior AChE and COX-2 inhibition and antioxidant capacity. Low- and high-OL extracts reduced Aβ aggregation, ROS levels, and proteotoxicity via SKN-1/NRF-2 and DAF-16/FOXO pathways, whereas mid-OL showed moderate effects through proteostasis modulation. In tau models, low- and high-OL extracts mitigated mitochondrial ROS levels via SOD-2 but had limited effects on intracellular ROS levels. High-OL extract also increased GST-4 levels, whereas low and mid extracts enhanced GST-4 levels. OL extracts protect against AD-related proteinopathies by modulating oxidative stress, inflammation, and proteostasis. High-OL extract showed the most promise for nutraceutical development due to its robust phenolic profile and activation of key antioxidant pathways. Further research is needed to confirm long-term efficacy.
Jose M. Romero‐Marquez, María D. Navarro‐Hortal, Alfonso Varela‐López, Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Juan G. Puentes, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Cristina Sánchez‐González, Jianbo Xiao, Roberto García‐Ruiz, Sebastián Sánchez, Tamara Y. Forbes‐Hernández, José L. Quiles (jose.quiles@uneatlantico.es)

Garlic is a horticultural product highly valued for its culinary and medicinal attributes. The aim of this study was to evaluate the composition of a garlic hydrophilic extract and its influence on redox biology, Alzheimer's Disease (AD) markers, and aging, using Caenorhabditis elegans as an experimental model. The extract was rich in sulfur compounds, also contained other compounds such as phenolics, and its antioxidant property was corroborated. Regarding AD markers, the acetylcholinesterase inhibitory capacity was demonstrated in vitro. Although the extract did not modify the degree of amyloid β-induced paralysis, it improved, in a dose-dependent manner, some locomotive parameters affected by the hyperphosphorylated tau protein in C. elegans. This improvement could be related to the effect found in GFP-transgenic strains, mainly the increase in the gene expression of HSP-16.2. Moreover, an initial investigation into the aging process revealed that the extract successfully inhibited the accumulation of intracellular and mitochondrial reactive oxygen species in aged worms. These results provide valuable insights into the multifaceted impact of garlic extract, particularly in the context of aging and neurodegenerative processes. This study lays a foundation for further research exploring the molecular mechanisms underlying garlic's effects and their translation into potential therapeutic interventions for age-related neurodegenerative conditions.
María D. Navarro‐Hortal, Jose M. Romero‐Marquez, Johura Ansary, Cristina Montalbán‐Hernández, Alfonso Varela‐López, Francesca Giampieri (francesca.giampieri@uneatlantico.es), Jianbo Xiao, Rubén Calderón Iglesias (ruben.calderon@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Cristina Sánchez‐González, Tamara Y. Forbes‐Hernández, José L. Quiles (jose.quiles@uneatlantico.es)