Mental Health Assessment: Analyzing Body Language Patterns and Emotional Expressions
Using Deep Learning & a Chatbot API for daily mood tracking and therapeutic reports.
Dataset: EmpatheticDialogues
- Dataset collected on Amazon Mechanical Turk.
- Contains 24,850 one-to-one open-domain conversations.
- Provides 32 evenly distributed emotion labels.
- Train: 19,208, Valid: 2,755, Test: 2,541 conversations.
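A minimal sketch of loading the dataset, assuming the Hugging Face datasets library and the public empathetic_dialogues card (note the split sizes above count conversations, while the loaded rows are individual utterances):

# Minimal sketch: loading EmpatheticDialogues and inspecting its emotion labels.
# Assumes the Hugging Face "datasets" library and the public "empathetic_dialogues"
# dataset card; adjust the name if the hosting differs.
from collections import Counter
from datasets import load_dataset

ds = load_dataset("empathetic_dialogues")            # train / validation / test splits
print({split: len(ds[split]) for split in ds})       # utterance counts per split

# Each row carries a conversation id and the situation's emotion label ("context").
label_counts = Counter(row["context"] for row in ds["train"])
print(len(label_counts), "emotion labels, e.g.", sorted(label_counts)[:5])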
Basma Tarik 213113
Sama Ehab 213633
Supervised by Dr. Wael Gomaa
Dataset: EmoReact
Outline
Motivation
Objective
Conclusion
- Dataset of children between the ages of 4 and 14 years old.
- Contains 1,102 videos. Train: 434, Val: 305, Test: 368.
- Annotated for 17 affective states, including six basic emotions and nine complex emotions:
- Basic emotions: happiness, sadness, surprise, fear, disgust, anger
- Complex emotions: curiosity, uncertainty, excitement, attentiveness, exploration, confusion, anxiety, embarrassment, frustration
- Amazon Mechanical Turk (AMT) was used for generating the labels.
- Each video was annotated by three workers for the seventeen labels.
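As a hypothetical illustration (not the dataset's own annotation tooling) of how three workers' judgments over the seventeen states might be merged, a majority-vote sketch; the binary inputs and 2-of-3 threshold are assumptions:

import numpy as np

# Hypothetical aggregation sketch: three AMT workers' binary annotations for the
# 17 EmoReact affective states, merged into one label vector per video.
N_LABELS = 17

def majority_vote(worker_annotations: np.ndarray) -> np.ndarray:
    """worker_annotations: shape (3, N_LABELS), entries in {0, 1}."""
    votes = worker_annotations.sum(axis=0)       # how many workers marked each state
    return (votes >= 2).astype(int)              # a state counts if 2 of 3 workers agree

example = np.array([[1, 0, 1] + [0] * 14,
                    [1, 0, 0] + [0] * 14,
                    [1, 1, 1] + [0] * 14])
print(majority_vote(example))                    # -> [1 0 1 0 ... 0]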
Introduction
- The development of a comprehensive model that uses deep learning to analyze and integrate data from body language, voice, and facial expressions.
- Enabling the identification of emotions in individuals who may struggle to express themselves verbally.
- Tracking daily moods and sending reports to therapists, which can provide valuable support to individuals.
Solution Domains
Abstract System Architecture
- People with disabilities face challenges in expressing themselves:
- Speech and language disabilities
- Hearing impairments
- Visual impairments
- Mobility challenges
- All of these can impact communication.
EMOTIC & Model Architecture
Introduction & Proposed Solution
Model Architecture
Importance of Body Language
Background
Results and Discussion
Comparative Results
- Effective communication stems from observing body language.
ResNet50 best multi-label accuracy: 94.5%
Body Language: Nonverbal communication through physical behaviors, gestures, and postures that convey emotions and express a person's personality or intentions.
Facial Expressions: Nonverbal signals communicated through facial muscle movements, crucial for expressing emotions.
The approach utilizes body language and emotional expressions as mental health indicators, involving:
- Data collection
- Computer vision
- Preprocessing
- A deep learning model with integrated neural networks and attention mechanisms (sketched below)
Its user-friendly application enables early intervention and personalized care by mental health professionals. Continuous updates improve its adaptability and accuracy in identifying diverse traits and body language patterns associated with various mental health conditions.
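A minimal, hypothetical sketch of the attention-based fusion idea, assuming separate face and body feature extractors; the feature dimensions, head count, and 26-category output are illustrative assumptions, not the project's final architecture:

import torch
import torch.nn as nn

# Hypothetical sketch: per-modality encoders whose outputs are combined with an
# attention layer before emotion classification.
class AttentionFusion(nn.Module):
    def __init__(self, face_dim=512, body_dim=512, hidden=256, n_emotions=26):
        super().__init__()
        self.face_proj = nn.Linear(face_dim, hidden)
        self.body_proj = nn.Linear(body_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Linear(hidden, n_emotions)

    def forward(self, face_feat, body_feat):
        # Treat the two modality embeddings as a length-2 "sequence" of tokens.
        tokens = torch.stack([self.face_proj(face_feat),
                              self.body_proj(body_feat)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)   # self-attention across modalities
        return self.head(fused.mean(dim=1))            # pooled emotion logits

model = AttentionFusion()
logits = model(torch.randn(4, 512), torch.randn(4, 512))
print(logits.shape)                                    # torch.Size([4, 26])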
Demo
- It is a key component in understanding how others are feeling and responding accordingly to their emotions.
Previous Work (EMOTIC dataset):
Our Work (EMOTIC dataset):
Our Best Performing Model (ResNet50 on the EMOTIC dataset):
Elements of personal communication
55% nonverbal
38% vocal
7% words only.
Dr. Albert Mehrabian, a researcher of body language
Gestures
Hand movements
Posture
Eye Contact
Physical contact such as handshakes
Tone of Voice
Dataset: EMOTIC
Related Work 1/3
References
Future Work
Pose-based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation (2021)
- Amazon Mechanical Turk (AMT) was used for generating the labels
- 34,315 images
- It consists of images featuring people in real-world environments.
- The dataset uses an extended list of 26 emotion categories for annotation.
- 80% training, 20% testing
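A minimal sketch of how a pretrained ResNet-50 can be adapted to the 26 EMOTIC categories as a multi-label problem, in line with the ResNet50 result reported above; the optimizer, learning rate, and dummy batch are illustrative assumptions, not the project's exact training code:

import torch
import torch.nn as nn
from torchvision import models

# Minimal sketch: ResNet-50 with a 26-way multi-label head for EMOTIC.
N_CATEGORIES = 26

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, N_CATEGORIES)    # 26 independent logits

criterion = nn.BCEWithLogitsLoss()            # one sigmoid per category (multi-label)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # assumed hyperparameters

# One illustrative training step on a dummy batch of 224x224 RGB images.
images = torch.randn(8, 3, 224, 224)
targets = torch.randint(0, 2, (8, N_CATEGORIES)).float()

loss = criterion(model(images), targets)
loss.backward()
optimizer.step()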
Introduction:
- Develop efficient body language recognition for emotion inference from RGB video.
- Create interpretable models for psychiatric symptom detection in healthcare.
Dataset:
- Collected and labeled by mental health professionals
- Includes 32 body language categories, 24 emotions, 24 symptoms, and a psychiatric verdict.
- The dataset consists of 144 video clips, with 48 for training, 48 for validation, and 48 for testing
Problem Statement
Related Work 2/3
Predicting apparent personality from body language: benchmarking deep learning architectures for adaptive social human–robot interaction (2021)
Introduction:
- Enables social robots to understand users' apparent personalities based on body language.
- Benchmarks deep learning architectures for this task.
- Mental health patients face communication obstacles when conveying their emotions.
- Virtual assistants can aid expression but lack comprehensive emotion recognition and understanding of others' intentions.
- Despite their great utility, virtual assistants have limitations regarding empathy, the user's personality, and privacy concerns.
Methodology:
- Pose feature representations: NTraj and NTraj+ features
- ST-ConvPose for learning spatial and temporal relations among joints
- ResNet-50: for high-level pose feature encoding
- k-NN classifier: for body language prediction with limited data
- LSTM: for emotion interpretation (a minimal sketch follows)
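For illustration only (not the paper's released code), a sketch of the pose-sequence-to-emotion step: an LSTM over per-frame pose feature vectors producing logits for the 24 emotions; the 66-dimensional pose vector (33 joints x 2 coordinates) and hidden size are assumptions:

import torch
import torch.nn as nn

# Illustrative sketch: LSTM mapping a sequence of pose feature vectors to emotion logits.
class PoseEmotionLSTM(nn.Module):
    def __init__(self, pose_dim=66, hidden=128, n_emotions=24):
        super().__init__()
        self.lstm = nn.LSTM(pose_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_emotions)

    def forward(self, pose_seq):                 # pose_seq: (batch, frames, pose_dim)
        _, (h_n, _) = self.lstm(pose_seq)
        return self.head(h_n[-1])                # logits over the 24 emotions

model = PoseEmotionLSTM()
print(model(torch.randn(2, 100, 66)).shape)      # torch.Size([2, 24])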
Dataset:
- ChaLearn First Impression dataset
- provides both audio and visual information from 10,000 clips.
Methodology:
Results:
- The model's best reported performance is 79.9%.
Limitation:
Related Work 3/3
Bi-Model Engagement Emotion Recognition Based on Facial and Upper-Body Landmarks and Machine Learning Approaches
Introduction:
- Proposes a system for emotion recognition that combines facial expressions and upper-body movements.
- Uses machine learning algorithms to classify emotions from facial landmarks and body pose data.
Limitation:
- Data bias
- Emotion inference relies on body language only.
Dataset:
- EMOTIC contains images of people in natural settings.
- MELD includes recordings of movie dialogues with emotional content.
- They selected six specific emotions: Aversion, Engagement, Excitement, Pleasure, Annoyance, and Disconnection (23,185 images).
Methodology:
- The MediaPipe Holistic model is used to extract facial landmarks (468 points) and body pose landmarks (33 points) from images.
- A bi-model approach combines separate models for facial expressions and upper-body movements.
- Individual models: Support Vector Machine (SVM), Logistic Regression, Random Forest.
- Ensemble model: combines predictions from all three individual models for improved accuracy (sketched below).
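A hedged sketch of this pipeline, assuming the legacy MediaPipe Solutions API and scikit-learn; the hard-voting scheme and feature layout are assumptions rather than the paper's exact implementation:

import cv2
import mediapipe as mp
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Sketch: MediaPipe Holistic landmarks -> flat feature vector -> voting ensemble
# of SVM, Logistic Regression, and Random Forest.
holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def landmark_features(image_bgr):
    result = holistic.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    face = result.face_landmarks.landmark if result.face_landmarks else []
    pose = result.pose_landmarks.landmark if result.pose_landmarks else []
    coords = [(p.x, p.y, p.z) for p in list(face) + list(pose)]   # up to 468 + 33 points
    return np.asarray(coords).flatten()

ensemble = VotingClassifier(
    estimators=[("svm", SVC()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier())],
    voting="hard",    # majority vote over the three individual models' predictions
)
# After building a fixed-length feature matrix X_train with labels y_train:
# ensemble.fit(X_train, y_train)
# ensemble.predict(landmark_features(image).reshape(1, -1))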
Results:
- 97% accuracy on EMOTIC
- 99% accuracy on MELD
Limitation:
- Controlled settings
- Limited information on class imbalance correction
Results and Discussion
Experiment 1
Demo
Transformers best result: 38.43%
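Assuming the 38.43% figure refers to a transformer text classifier over the 32 EmpatheticDialogues emotion labels, a minimal setup sketch; the DistilBERT checkpoint and all hyperparameters are placeholders, not the project's configuration:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Sketch of the text-side setup: a transformer classifying an utterance into one
# of the 32 EmpatheticDialogues emotion labels. Training loop not shown.
checkpoint = "distilbert-base-uncased"    # placeholder backbone
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=32)

batch = tokenizer(["I just got the job I always wanted!"],
                  return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**batch).logits
print(logits.shape)    # torch.Size([1, 32]) -- one logit per emotion label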
Experiment 2
Implementing 5-fold cross-validation.
RNN-CNN: 38.8%
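A generic sketch of the 5-fold cross-validation protocol, shown with a simple stand-in classifier on dummy data; the project's RNN-CNN model and real features would take its place:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Dummy data: 640 samples, 64 features, 32 emotion classes (20 samples each).
X = np.random.randn(640, 64)
y = np.repeat(np.arange(32), 20)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("per-fold accuracy:", scores)
print("mean 5-fold accuracy:", scores.mean())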
How to improve it?
- Increase the number of epochs
- Data augmentation
- Vision Transformers