Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble

Access: open. Language: English.

Rizwan, Muhammad; Mushtaq, Muhammad Faheem; Rafiq, Maryam; Mehmood, Arif; Diez, Isabel de la Torre; Gracia Villar, Mónica; Garay, Helena and Ashraf, Imran (2024) Depression Intensity Classification from Tweets Using FastText Based Weighted Soft Voting Ensemble. Computers, Materials & Continua, 78 (2). pp. 2047-2066. ISSN 1546-2226

Author email: Mónica Gracia Villar (monica.gracia@uneatlantico.es), Helena Garay (helena.garay@uneatlantico.es); unspecified for the other authors.

Text: TSP_CMC_37347.pdf (861 kB)
Available under License Creative Commons Attribution.

Abstract

Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including the early prediction of psychological disorders and stress in individuals or the general public. A major challenge is that existing studies perform only binary classification of depression rather than predicting its intensity; moreover, noisy data makes it difficult to identify genuine depression in social media text. This study begins by collecting relevant tweets and generating a corpus of 210,000 public tweets using Twitter's public application programming interfaces (APIs). To reduce noise in the corpus, a strategy is devised to retain only depression-related tweets by creating a list of relevant hashtags. Furthermore, an algorithm is developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. Next, a FastText-based model is applied and tuned with different preprocessing techniques and hyperparameter settings; the tuned model significantly increases depression classification performance over the baselines, reaching an 84% F1 score and 90% accuracy. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost performance by combining it with several other classifiers and assigning weights to the individual models according to their individual performances. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 score of 89% (5 percentage points higher than FastText alone) and an accuracy of 93% (3 percentage points higher than FastText alone). The proposed model better captures the contextual features of the relatively small sample class and aids in the early detection of depression intensity from tweets.
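
The weighted soft voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the model names (`fasttext`, `logreg`, `svm`), the per-model scores, and the probability vectors are made-up values, and weighting each model by its normalized validation score is only one plausible reading of "assigning weights to individual models according to their individual performances."

```python
from typing import Dict, List, Sequence

# The three intensity classes named in the abstract.
CLASSES = ["Mild", "Moderate", "Severe"]


def weighted_soft_vote(
    probas: Dict[str, Sequence[float]],
    scores: Dict[str, float],
) -> List[float]:
    """Average per-model class probabilities, weighting each model by its score.

    Assumption: weights are the models' validation scores normalized to sum to 1.
    """
    total = sum(scores[name] for name in probas)
    if total <= 0:
        raise ValueError("model scores must sum to a positive value")
    combined = [0.0] * len(CLASSES)
    for name, class_probs in probas.items():
        weight = scores[name] / total
        for k, p in enumerate(class_probs):
            combined[k] += weight * p
    return combined


def predict_label(
    probas: Dict[str, Sequence[float]],
    scores: Dict[str, float],
) -> str:
    """Return the class with the highest weighted-average probability."""
    combined = weighted_soft_vote(probas, scores)
    best = max(range(len(combined)), key=lambda k: combined[k])
    return CLASSES[best]


# Hypothetical validation scores and per-class probabilities for one tweet.
example_scores = {"fasttext": 0.8, "logreg": 0.6, "svm": 0.6}
example_probas = {
    "fasttext": [0.2, 0.3, 0.5],
    "logreg": [0.5, 0.3, 0.2],
    "svm": [0.4, 0.4, 0.2],
}
example_combined = weighted_soft_vote(example_probas, example_scores)
example_label = predict_label(example_probas, example_scores)
```

In this made-up example the hypothetical FastText model alone would pick ‘Severe,’ but the weighted average of all three probability vectors favors ‘Mild,’ which shows how soft voting lets several moderately confident models outweigh a single one.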

Document Type: Article
Keywords: Depression classification; deep learning; FastText; machine learning
Subject classification: Subjects > Engineering
Subjects > Psychology
Divisions: Universidad Europea del Atlántico > Research > Scientific Production
Fundación Universitaria Internacional de Colombia > Research > Scientific Production
Universidad Internacional Iberoamericana México > Research > Scientific Production
Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production
Universidad Internacional do Cuanza > Research > Scientific Production
Deposited: 14 Mar 2024 23:30
Last Modified: 14 Mar 2024 23:30
URI: https://repositorio.unini.edu.mx/id/eprint/11264

Language: English. Access: open.

Influence of E-learning training on the acquisition of competences in basketball coaches in Cantabria

The main aim of this study was to analyse the influence of e-learning training on the acquisition of competences in basketball coaches in Cantabria. The current landscape of basketball coach training shows an increasing demand for innovative training models and emerging pedagogies, including e-learning-based methodologies. The study sample consisted of fifty students from these courses, all above 16 years of age (36 males, 14 females). Among them, 16% resided outside the autonomous community of Cantabria, 10% resided more than 50 km from the city of Santander, 36% between 10 and 50 km, 14% less than 10 km, and 24% resided within Santander city. Data were collected through a Google Forms survey distributed by the Cantabrian Basketball Federation to training course students. Participation was voluntary and anonymous. The survey, consisting of 56 questions, was validated by two sports and health doctors and two senior basketball coaches. The collected data were processed and analysed using Microsoft® Excel version 16.74, and the results were expressed in percentages. The analysis revealed that 24.60% of the students trained through the e-learning methodology considered themselves fully qualified as basketball coaches, contrasting with 10.98% of those trained via traditional face-to-face methodology. The results of the study provide insights into important characteristics that can be adjusted and improved within the investigated educational process. Moreover, the study concludes that e-learning training effectively qualifies basketball coaches in Cantabria.

Scientific Production

Authors: Josep Alemany Iturriaga (josep.alemany@uneatlantico.es), Álvaro Velarde-Sotres (alvaro.velarde@uneatlantico.es), Javier Jorge, Kamil Giglio

Language: English. Access: open.

Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson’s disease behavioral analysis

Extracting knowledge from hybrid data, comprising both categorical and numerical data, poses significant challenges due to the inherent difficulty in preserving information and practical meanings during the conversion process. To address this challenge, hybrid data processing methods, combining complementary rough sets, have emerged as a promising approach for handling uncertainty. However, selecting an appropriate model and effectively utilizing it in data mining requires a thorough qualitative and quantitative comparison of existing hybrid data processing models. This research aims to contribute to the analysis of hybrid data processing models based on neighborhood rough sets by investigating the inherent relationships among these models. We propose a generic neighborhood rough set-based hybrid model specifically designed for processing hybrid data, thereby enhancing the efficacy of the data mining process without resorting to discretization and avoiding information loss or practical meaning degradation in datasets. The proposed scheme dynamically adapts the threshold value for the neighborhood approximation space according to the characteristics of the given datasets, ensuring optimal performance without sacrificing accuracy. To evaluate the effectiveness of the proposed scheme, we develop a testbed tailored for Parkinson’s patients, a domain where hybrid data processing is particularly relevant. The experimental results demonstrate that the proposed scheme consistently outperforms existing schemes in adaptively handling both numerical and categorical data, achieving an impressive accuracy of 95% on the Parkinson’s dataset. Overall, this research contributes to advancing hybrid data processing techniques by providing a robust and adaptive solution that addresses the challenges associated with handling hybrid data, particularly in the context of Parkinson’s disease analysis.

Scientific Production

Authors: Imran Raza, Muhammad Hasan Jamal, Rizwan Qureshi, Abdul Karim Shahid, Angel Olider Rojas Vistorte (angel.rojas@uneatlantico.es), Md Abdus Samad, Imran Ashraf

Language: English. Access: open.

Human‐based new approach methodologies to accelerate advances in nutrition research

Much of nutrition research has conventionally been based on simplistic in vitro systems or animal models, which have been extensively employed in an effort to better understand the relationships between diet and complex diseases as well as to evaluate food safety. Although these models have undeniably contributed to increasing our mechanistic understanding of basic biological processes, they do not adequately model complex human physiopathological phenomena, raising concerns about translatability to humans. During the last decade, extraordinary advances in stem cell culturing, three-dimensional cell cultures, sequencing technologies, and computer science have given rise to a wealth of novel human-based and more physiologically relevant tools. These tools, also known as “new approach methodologies,” comprise patient-derived organoids, organs-on-chip, and multi-omics approaches, along with computational models and analysis, and represent innovative and exciting means of advancing nutrition research from a human-biology-oriented perspective. After considering some shortcomings of conventional in vitro and in vivo approaches, here we describe the main available and emerging novel tools that are appropriate for designing more human-relevant nutrition research. Our aim is to encourage discussion on the opportunity to explore innovative paths in nutrition research and to promote a paradigm change toward a more human-biology-focused approach to better understand human nutritional pathophysiology, to evaluate novel food products, and to develop more effective targeted preventive or therapeutic strategies, while helping to reduce and replace the animals employed in nutrition research.

Scientific Production

Authors: Manuela Cassotta (manucassotta@gmail.com), Danila Cianciosi, Maria Elexpuru Zabaleta (maria.elexpuru@uneatlantico.es), Iñaki Elío Pascual (inaki.elio@uneatlantico.es), Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Francesca Giampieri (francesca.giampieri@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es)

Language: English. Access: open.

Design and development of patient health tracking, monitoring and big data storage using Internet of Things and real time cloud computing

With the outbreak of the COVID-19 pandemic, social isolation and quarantine have become commonplace across the world. IoT health monitoring solutions eliminate the need for regular doctor visits and interactions among patients and medical personnel. Many patients in wards or intensive care units require continuous monitoring of their health. Continuous patient monitoring is a demanding practice in hospitals with limited staff; in a pandemic situation like COVID-19, it becomes even more difficult, as hospitals work at full capacity and medical workers remain at risk of infection. In this study, we propose an Internet of Things (IoT)-based patient health monitoring system that collects real-time data on important health indicators such as pulse rate, blood oxygen saturation, and body temperature, and that can be expanded to include more parameters. Our system comprises a hardware component that collects data from sensors and transmits it to a cloud-based storage system, where it can be accessed and analyzed by healthcare specialists. The ESP-32 microcontroller interfaces with the multiple sensors and wirelessly transmits the collected data to the cloud storage system. The system uses a pulse oximeter to measure blood oxygen saturation and body temperature, and a heart rate monitor to measure pulse rate. A web-based interface is also implemented, allowing healthcare practitioners to access and visualize the collected data in real time, making remote patient monitoring easier. Overall, our IoT-based patient health monitoring system represents a significant advancement in remote patient monitoring, allowing healthcare practitioners to access real-time data on important health metrics and detect potential health issues before they escalate.

Scientific Production

Authors: Md. Milon Islam, Imran Shafi, Sadia Din, Siddique Farooq, Isabel de la Torre Díez, Jose Breñosa (josemanuel.brenosa@uneatlantico.es), Julio César Martínez Espinosa (ulio.martinez@unini.edu.mx), Imran Ashraf

Language: English. Access: open.

In Vitro and In Vivo Insights into a Broccoli Byproduct as a Healthy Ingredient for the Management of Alzheimer’s Disease and Aging through Redox Biology

Broccoli has gained popularity as a highly consumed vegetable due to its nutritional and health properties. This study aimed to evaluate the composition profile and the antioxidant capacity of a hydrophilic extract derived from broccoli byproducts, as well as its influence on redox biology, Alzheimer’s disease markers, and aging in the Caenorhabditis elegans model. The presence of glucosinolate was observed and antioxidant capacity was demonstrated both in vitro and in vivo. The in vitro acetylcholinesterase inhibitory capacity was quantified, and the treatment ameliorated the amyloid-β- and tau-induced proteotoxicity in transgenic strains via SOD-3 and SKN-1, respectively, and HSP-16.2 for both parameters. Furthermore, a preliminary study on aging indicated that the extract effectively reduced reactive oxygen species levels in aged worms and extended their lifespan. Utilizing broccoli byproducts for nutraceutical or functional foods could manage vegetable processing waste, enhancing productivity and sustainability while providing significant health benefits.

Scientific Production

Authors: María D. Navarro-Hortal, Jose M. Romero-Márquez, M. Asunción López-Bascón, Cristina Sánchez-González, Jianbo Xiao, Sandra Sumalla Cano (sandra.sumalla@uneatlantico.es), Maurizio Battino (maurizio.battino@uneatlantico.es), Tamara Y. Forbes-Hernande (tamara.forbes@unini.edu.mx), José L. Quiles (jose.quiles@uneatlantico.es)