I. Introduction
Sports video analysis plays an important role in improving athletic performance, refining coaching strategies, and revealing player dynamics. In tennis, recognizing and understanding sequences of events in video clips (e.g., a player hitting the ball or the ball bouncing on the court) has attracted significant interest because of its potential value to players, coaches, and analysts. However, accurate event detection in tennis videos poses distinct challenges that call for new solutions.