HSEmotion Team at ABAW-10 Competition: Facial Expression Recognition, Valence-Arousal Estimation, Action Unit Detection and Fine-Grained Violence Classification
arXiv cs.AI / 3/16/2026
💬 Opinion · Models & Research
Key Points
- The paper reports results for the 10th ABAW competition across frame-wise facial expression recognition, valence-arousal estimation, action unit detection, and fine-grained violence classification.
- The authors propose a fast approach that extracts facial embeddings with pre-trained EfficientNet-based emotion recognition models. A threshold decides whether to trust the pre-trained model's prediction or to fall back to a simple MLP trained on AffWild2 embeddings.
- Estimated class scores are smoothed with a sliding window to mitigate noise in frame-wise predictions.
- For the violence detection task, they evaluate several pre-trained frame-embedding architectures and aggregation methods for video classification. Across the four ABAW tasks, the approach is reported to improve significantly over existing baselines.
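
The threshold fallback and sliding-window smoothing described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `gated_scores` and `smooth_scores`, the use of the maximum class probability as the confidence measure, the threshold of 0.7, and the centered moving average with a window of 5 frames are all assumptions made for the example.

```python
import numpy as np


def gated_scores(pretrained_probs, mlp_probs, threshold=0.7):
    """Choose per-frame class scores from one of two models.

    Assumption: "trusting" the pre-trained model means its top class
    probability reaches `threshold`; otherwise the MLP's scores are used.
    Both inputs have shape (num_frames, num_classes).
    """
    pretrained_probs = np.asarray(pretrained_probs, dtype=float)
    mlp_probs = np.asarray(mlp_probs, dtype=float)
    confident = pretrained_probs.max(axis=1) >= threshold  # shape (num_frames,)
    return np.where(confident[:, None], pretrained_probs, mlp_probs)


def smooth_scores(scores, window=5):
    """Centered moving average over time (assumed smoothing kernel).

    Windows are truncated at the start and end of the video, so the
    output has the same shape as the input.
    """
    scores = np.asarray(scores, dtype=float)
    half = window // 2
    out = np.empty_like(scores)
    for t in range(len(scores)):
        lo, hi = max(0, t - half), min(len(scores), t + half + 1)
        out[t] = scores[lo:hi].mean(axis=0)
    return out


def predict_frames(pretrained_probs, mlp_probs, threshold=0.7, window=5):
    """Gate, smooth, then take the arg-max class for every frame."""
    smoothed = smooth_scores(gated_scores(pretrained_probs, mlp_probs, threshold), window)
    return smoothed.argmax(axis=1)
```

Smoothing the scores before taking the arg-max, rather than smoothing the predicted labels, lets a few confident neighbouring frames outvote a single noisy one.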



