eprintid: 27968
rev_number: 8
eprint_status: archive
userid: 2
dir: disk0/00/02/79/68
datestamp: 2026-03-30 21:37:32
lastmod: 2026-03-30 21:37:33
status_changed: 2026-03-30 21:37:32
type: article
metadata_visibility: show
creators_name: Raza, Muhammad Amjad
creators_name: Mehmood, Nasir
creators_name: Siddiqui, Hafeez Ur Rehman
creators_name: Saleem, Adil Ali
creators_name: Álvarez, Roberto Marcelo
creators_name: Miró Vera, Yini Airet
creators_name: Díez, Isabel de la Torre
creators_id: 
creators_id: 
creators_id: 
creators_id: 
creators_id: roberto.alvarez@uneatlantico.es
creators_id: yini.miro@uneatlantico.es
creators_id: 
title: Human Activity Recognition in Domestic Settings Based on Optical Techniques and Ensemble Models
ispublished: pub
subjects: uneat_eng
divisions: uneatlantico_produccion_cientifica
divisions: unincol_produccion_cientifica
divisions: uninimx_produccion_cientifica
divisions: uninipr_produccion_cientifica
divisions: unic_produccion_cientifica
divisions: uniromana_produccion_cientifica
full_text_status: public
keywords: deep learning; human activity recognition; LSTM; PoseNet; skeleton-based recognition; smart home;
            Transformer
abstract: Human activity recognition (HAR) is essential in many applications, such as smart homes, assisted
            living, healthcare monitoring, rehabilitation, physiotherapy, and geriatric care. Conventional methods of
            HAR use wearable sensors, e.g., accelerometers and gyroscopes. However, they are limited by issues
            such as sensitivity to position, user inconvenience, and potential health risks with long-term use.
            Vision-based optical camera systems provide a non-intrusive alternative; however, they are
            susceptible to variations in lighting, intrusions, and privacy issues. This paper presents an optical
            method for recognizing domestic human activities based on pose estimation and deep learning ensemble
            models. In the proposed methodology, skeletal keypoint features are extracted from video data using
            PoseNet to generate a privacy-preserving representation that captures key motion dynamics while
            remaining insensitive to changes in appearance. The dataset comprises 2734 activity samples collected
            from 30 subjects (15 male and 15 female) and covers nine daily domestic activities. Six deep learning
            architectures were evaluated: Transformer, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU),
            Multilayer Perceptron (MLP), One-Dimensional Convolutional Neural Network (1D CNN), and a hybrid
            Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM) architecture. The results on the
            hold-out test set show that the CNN–LSTM
            architecture achieves an accuracy of 98.78% within our experimental setting. Leave-One-Subject-Out
            cross-validation further confirms robust generalization across unseen individuals, with CNN–LSTM achieving a
            mean accuracy of 97.21% ± 1.84% across 30 subjects. The results demonstrate that vision-based pose
            estimation with deep learning is a useful, precise, and non-intrusive approach to HAR in smart healthcare
            and home automation systems.
date: 2026-02
publication: Sensors
volume: 26
number: 5
pagerange: 1516
id_number: doi:10.3390/s26051516
refereed: TRUE
issn: 1424-8220
official_url: http://doi.org/10.3390/s26051516
access: open
language: en
citation:   Raza, Muhammad Amjad; Mehmood, Nasir; Siddiqui, Hafeez Ur Rehman; Saleem, Adil Ali; Álvarez, Roberto Marcelo; Miró Vera, Yini Airet and Díez, Isabel de la Torre (2026) Human Activity Recognition in Domestic Settings Based on Optical Techniques and Ensemble Models. Sensors, 26 (5). p. 1516. ISSN 1424-8220
document_url: http://repositorio.unini.edu.mx/id/eprint/27968/1/sensors-26-01516-v2.pdf