Emotion Classification from Facial Images as a Meta-Sapiens Task

Date

2025-05

Publisher

The Ohio State University

Abstract

In this paper, we propose discrete emotion classification from facial images as a new downstream task for Sapiens, Meta’s state-of-the-art human-vision foundation model. Currently, Sapiens addresses only the physical aspects of human vision, such as pose estimation and body-part segmentation. Our model, MotivNet, extends Sapiens' understanding of humans by attempting to recognize the underlying nuances conveyed by human physiognomy. We define three criteria to evaluate MotivNet's viability as a Sapiens task: benchmark performance, model similarity, and data similarity. We then describe the components of MotivNet, our training approach, and our results compared to current benchmarks. We show that MotivNet achieves results comparable to existing benchmarks and meets all three criteria, validating MotivNet as a Sapiens downstream task and extending Sapiens' capabilities as a human-centric model.

Keywords

Computer Vision, Emotion Recognition, Deep Learning, Machine Learning, Meta-Sapiens
