MGMILA: Eulerian Motion-aware MILA for Micro-gesture Recognition
Abstract
A micro-gesture is a nearly imperceptible non-verbal behaviour characterised by low-intensity movement. Its low intensity and short duration pose challenges for traditional action recognition models. To address this, we propose micro-gesture Mamba-inspired linear attention (MGMILA), a motion-aware framework that integrates Mamba-inspired linear attention (MILA), a linear-complexity model optimised for video-based micro-gesture recognition. We also design three variants of the motion extraction module, motion as layer (MAL), motion as content (MAC), and motion as gate (MAG), to enhance spatiotemporal motion localization. Furthermore, we introduce human segmentation mask prediction as an auxiliary task that guides the network to attend to human-related regions, thereby improving its motion perception and recognition capability. Experiments on iMiGUE, spontaneous micro gesture (SMG), and MA-52 demonstrate state-of-the-art (SOTA) performance, validating the effectiveness of our approach.