Mental Health Assessment: Analyzing Body Language Patterns and Emotional Expressions
Using Deep Learning & a Chatbot API for daily mood tracking and therapeutic reports.
Dataset: EmpatheticDialogues
- Dataset collected on Amazon Mechanical Turk.
- Contains 24,850 one-to-one open-domain conversations.
- Provides 32 evenly distributed emotion labels.
- Train: 19,208, Valid: 2,755, Test: 2,541 conversations.
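A minimal sketch of loading the dataset, assuming the Hugging Face datasets library and the public empathetic_dialogues card (note the split sizes above count conversations, while the loaded rows are individual utterances):

# Minimal sketch: loading EmpatheticDialogues and inspecting its emotion labels.
# Assumes the Hugging Face "datasets" library and the public "empathetic_dialogues"
# dataset card; adjust the name if the hosting differs.
from collections import Counter
from datasets import load_dataset

ds = load_dataset("empathetic_dialogues")            # train / validation / test splits
print({split: len(ds[split]) for split in ds})       # utterance counts per split

# Each row carries a conversation id and the situation's emotion label ("context").
label_counts = Counter(row["context"] for row in ds["train"])
print(len(label_counts), "emotion labels, e.g.", sorted(label_counts)[:5])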
Basma Tarik 213113
Sama Ehab 213633
Supervised by Dr. Wael Gomaa
Dataset: EmoReact
Outline
Motivation
Objective
Conclusion
- Dataset of children between the ages of 4 and 14 years old.
- Contains 1,102 videos. Train: 434, Val: 305, Test: 368.
- Annotated for 17 affective states, including six basic emotions and nine complex emotions:
- Basic emotions: happiness, sadness, surprise, fear, disgust, anger
- Complex emotions: curiosity, uncertainty, excitement, attentiveness, exploration, confusion, anxiety, embarrassment, frustration
- Amazon Mechanical Turk (AMT) was used for generating the labels.
- Each video was annotated by three workers for the seventeen labels.
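As a hypothetical illustration (not the dataset's own annotation tooling) of how three workers' judgments over the seventeen states might be merged, a majority-vote sketch; the binary inputs and 2-of-3 threshold are assumptions:

import numpy as np

# Hypothetical aggregation sketch: three AMT workers' binary annotations for the
# 17 EmoReact affective states, merged into one label vector per video.
N_LABELS = 17

def majority_vote(worker_annotations: np.ndarray) -> np.ndarray:
    """worker_annotations: shape (3, N_LABELS), entries in {0, 1}."""
    votes = worker_annotations.sum(axis=0)       # how many workers marked each state
    return (votes >= 2).astype(int)              # a state counts if 2 of 3 workers agree

example = np.array([[1, 0, 1] + [0] * 14,
                    [1, 0, 0] + [0] * 14,
                    [1, 1, 1] + [0] * 14])
print(majority_vote(example))                    # -> [1 0 1 0 ... 0]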
Introduction
- The development of a comprehensive model that uses deep learning to analyze and integrate data from body language, voice, and facial expressions.
- Enabling the identification of emotions in individuals who may struggle to express themselves verbally.
- Tracking daily moods and sending reports to therapists, which can provide valuable support to individuals.
Solution Domains
Abstract System Architecture
- People with disabilities face challenges in expressing themselves:
- Speech and language disabilities
- Hearing impairments
- Visual impairments
- Mobility challenges
- All of these can impact communication.
EMOTIC & Model Architecture
Introduction & Proposed Solution
Model Architecture
Importance of Body Language
Background
Results and Discussion
Comparative Results
- Effective communication stems from observing body language.
ResNet50 best multi-label accuracy: 94.5%
Body Language: Nonverbal communication through physical behaviors, gestures, and postures that convey emotions and express a person's personality or intentions.
Facial Expressions: Nonverbal signals communicated through facial muscle movements, crucial for expressing emotions.
The approach utilizes body language and emotional expressions as mental health indicators, involving:
- Data collection
- Computer vision
- Preprocessing
- A deep learning model with integrated neural networks and attention mechanisms (sketched below)
Its user-friendly application enables early intervention and personalized care by mental health professionals. Continuous updates improve its adaptability and accuracy in identifying diverse traits and body language patterns associated with various mental health conditions.
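A minimal, hypothetical sketch of the attention-based fusion idea, assuming separate face and body feature extractors; the feature dimensions, head count, and 26-category output are illustrative assumptions, not the project's final architecture:

import torch
import torch.nn as nn

# Hypothetical sketch: per-modality encoders whose outputs are combined with an
# attention layer before emotion classification.
class AttentionFusion(nn.Module):
    def __init__(self, face_dim=512, body_dim=512, hidden=256, n_emotions=26):
        super().__init__()
        self.face_proj = nn.Linear(face_dim, hidden)
        self.body_proj = nn.Linear(body_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Linear(hidden, n_emotions)

    def forward(self, face_feat, body_feat):
        # Treat the two modality embeddings as a length-2 "sequence" of tokens.
        tokens = torch.stack([self.face_proj(face_feat),
                              self.body_proj(body_feat)], dim=1)
        fused, _ = self.attn(tokens, tokens, tokens)   # self-attention across modalities
        return self.head(fused.mean(dim=1))            # pooled emotion logits

model = AttentionFusion()
logits = model(torch.randn(4, 512), torch.randn(4, 512))
print(logits.shape)                                    # torch.Size([4, 26])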
Demo
- It is a key component in understanding how others are feeling and responding accordingly to their emotions.
Previous Work (EMOTIC dataset):
Our Work (EMOTIC dataset):
Our Best Performing Model (ResNet50 on the EMOTIC dataset):
Elements of personal communication
55% nonverbal
38% vocal
7% words only.
Dr. Albert Mehrabian, a researcher of body language
Gestures
Hand movements
Posture
Eye Contact
Physical contact such as handshakes
Tone of Voice
Dataset: EMOTIC
Related Work 1/3
References
Future Work
Pose-based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation (2021)
- Amazon Mechanical Turk (AMT) was used for generating the labels
- 34,315 images
- It consists of images featuring people in real-world environments.
- The dataset uses an extended list of 26 emotion categories for annotation.
- 80% training, 20% testing
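A minimal sketch of how a pretrained ResNet-50 can be adapted to the 26 EMOTIC categories as a multi-label problem, in line with the ResNet50 result reported above; the optimizer, learning rate, and dummy batch are illustrative assumptions, not the project's exact training code:

import torch
import torch.nn as nn
from torchvision import models

# Minimal sketch: ResNet-50 with a 26-way multi-label head for EMOTIC.
N_CATEGORIES = 26

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, N_CATEGORIES)    # 26 independent logits

criterion = nn.BCEWithLogitsLoss()            # one sigmoid per category (multi-label)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # assumed hyperparameters

# One illustrative training step on a dummy batch of 224x224 RGB images.
images = torch.randn(8, 3, 224, 224)
targets = torch.randint(0, 2, (8, N_CATEGORIES)).float()

loss = criterion(model(images), targets)
loss.backward()
optimizer.step()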
Introduction:
- Develop efficient body language recognition for emotion inference from RGB video.
- Create interpretable models for psychiatric symptom detection in healthcare.
Dataset:
- Collected and labeled by mental health professionals
- Includes 32 body language categories, 24 emotions, 24 symptoms, and a psychiatric verdict.
- The dataset consists of 144 video clips, with 48 for training, 48 for validation, and 48 for testing
Problem Statement
Related Work 2/3
Predicting apparent personality from body language: benchmarking deep learning architectures for adaptive social human–robot interaction (2021)
Introduction:
- Enables social robots to understand users' apparent personalities based on body language.
- Benchmarks deep learning architectures for this task.
- Mental health patients face communication obstacles when conveying their emotions.
- Virtual assistants can aid expression but lack comprehensive emotion recognition and understanding of others' intentions.
- Despite their great utility, virtual assistants have limitations regarding empathy, the user's personality, and privacy concerns.
Methodology:
- Pose feature representations: NTraj and NTraj+ features
- ST-ConvPose for learning spatial and temporal relations among joints
- ResNet-50: for high-level pose feature encoding
- k-NN classifier: for body language prediction with limited data
- LSTM: for emotion interpretation (a minimal sketch follows)
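For illustration only (not the paper's released code), a sketch of the pose-sequence-to-emotion step: an LSTM over per-frame pose feature vectors producing logits for the 24 emotions; the 66-dimensional pose vector (33 joints x 2 coordinates) and hidden size are assumptions:

import torch
import torch.nn as nn

# Illustrative sketch: LSTM mapping a sequence of pose feature vectors to emotion logits.
class PoseEmotionLSTM(nn.Module):
    def __init__(self, pose_dim=66, hidden=128, n_emotions=24):
        super().__init__()
        self.lstm = nn.LSTM(pose_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_emotions)

    def forward(self, pose_seq):                 # pose_seq: (batch, frames, pose_dim)
        _, (h_n, _) = self.lstm(pose_seq)
        return self.head(h_n[-1])                # logits over the 24 emotions

model = PoseEmotionLSTM()
print(model(torch.randn(2, 100, 66)).shape)      # torch.Size([2, 24])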
Dataset:
- ChaLearn First Impression dataset
- provides both audio and visual information from 10,000 clips.
Methodology:
Results:
- The model's best reported performance is 79.9%.
Limitation:
Related Work 3/3
Bi-Model Engagement Emotion Recognition Based on Facial and Upper-Body Landmarks and Machine Learning Approaches
Introduction:
- Proposes a system for emotion recognition that combines facial expressions and upper-body movements.
- Uses machine learning algorithms to classify emotions from facial landmarks and body pose data.
Limitation:
- Data bias
- Emotion inference relies on body language only.
Dataset:
- EMOTIC contains images of people in natural settings.
- MELD includes recordings of movie dialogues with emotional content.
- They selected six specific emotions: Aversion, Engagement, Excitement, Pleasure, Annoyance, and Disconnection (23,185 images).
Methodology:
- The MediaPipe Holistic model is used to extract facial landmarks (468 points) and body pose landmarks (33 points) from images.
- A bi-model approach combines separate models for facial expressions and upper-body movements.
- Individual models: Support Vector Machine (SVM), Logistic Regression, Random Forest.
- Ensemble model: combines predictions from all three individual models for improved accuracy (sketched below).
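A hedged sketch of this pipeline, assuming the legacy MediaPipe Solutions API and scikit-learn; the hard-voting scheme and feature layout are assumptions rather than the paper's exact implementation:

import cv2
import mediapipe as mp
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Sketch: MediaPipe Holistic landmarks -> flat feature vector -> voting ensemble
# of SVM, Logistic Regression, and Random Forest.
holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def landmark_features(image_bgr):
    result = holistic.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    face = result.face_landmarks.landmark if result.face_landmarks else []
    pose = result.pose_landmarks.landmark if result.pose_landmarks else []
    coords = [(p.x, p.y, p.z) for p in list(face) + list(pose)]   # up to 468 + 33 points
    return np.asarray(coords).flatten()

ensemble = VotingClassifier(
    estimators=[("svm", SVC()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier())],
    voting="hard",    # majority vote over the three individual models' predictions
)
# After building a fixed-length feature matrix X_train with labels y_train:
# ensemble.fit(X_train, y_train)
# ensemble.predict(landmark_features(image).reshape(1, -1))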
Results:
- 97% accuracy on EMOTIC
- 99% accuracy on MELD
Limitation:
- Controlled settings
- Limited information on class imbalance correction
Results and Discussion
Experiment 1
Demo
Transformers best result: 38.43%
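Assuming the 38.43% figure refers to a transformer text classifier over the 32 EmpatheticDialogues emotion labels, a minimal setup sketch; the DistilBERT checkpoint and all hyperparameters are placeholders, not the project's configuration:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Sketch of the text-side setup: a transformer classifying an utterance into one
# of the 32 EmpatheticDialogues emotion labels. Training loop not shown.
checkpoint = "distilbert-base-uncased"    # placeholder backbone
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=32)

batch = tokenizer(["I just got the job I always wanted!"],
                  return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**batch).logits
print(logits.shape)    # torch.Size([1, 32]) -- one logit per emotion label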
Experiment 2
Implementing 5-fold cross-validation.
RNN-CNN: 38.8%
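A generic sketch of the 5-fold cross-validation protocol, shown with a simple stand-in classifier on dummy data; the project's RNN-CNN model and real features would take its place:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Dummy data: 640 samples, 64 features, 32 emotion classes (20 samples each).
X = np.random.randn(640, 64)
y = np.repeat(np.arange(32), 20)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("per-fold accuracy:", scores)
print("mean 5-fold accuracy:", scores.mean())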
How to improve it?
- Increase the number of epochs
- Data augmentation
- Vision Transformers