I. Introduction
Understanding animal motion in natural environments is essential to ecology, evolutionary biology, and neuroscience. While studies in controlled laboratory settings have been informative, the complexity of outdoor environments poses significant sensing challenges, such as occlusion and lighting variability. In human motion capture, deep learning has addressed these challenges, but it requires extensive ground-truth data that are currently impractical to collect for wildlife [1].