Emotion Classification from Facial Images as a Meta-Sapiens Task
Date
2025-05
Publisher
The Ohio State University
Abstract
In this paper, we propose discrete emotion classification from facial images as a new downstream task for Sapiens, Meta's state-of-the-art human-vision foundation model. Currently, Sapiens focuses only on the physical aspects of human vision, such as pose estimation and body-part segmentation. Our model, MotivNet, extends Sapiens' human understanding by attempting to recognize the underlying nuances conveyed by human physiognomy. We define three criteria to evaluate MotivNet's viability as a Sapiens task: benchmark performance, model similarity, and data similarity. We then describe the components of MotivNet, our training approach, and our results compared to current benchmarks. We show that MotivNet achieves results comparable to existing benchmarks and meets all three criteria, validating MotivNet as a viable downstream task and advancing Sapiens' capabilities as a human-centric model.
Keywords
Computer Vision, Emotion Recognition, Deep Learning, Machine Learning, Meta-Sapiens