1. INTRODUCTION
Given its wide range of applications (recommendation, playlist generation, synchronization, DJing, audio or audio/video editing, beat-synchronous analysis), tempo estimation remains a major task in Music Information Retrieval (MIR). At its core, tempo estimation seeks to estimate the periodicity of the dominant rhythmic pulse of a music audio signal, usually expressed in beats per minute (BPM). Formulated in this manner, it strongly resembles the task of pitch estimation, which likewise seeks the dominant periodicity of a signal, albeit at a much shorter time scale. Recently, pitch estimation has seen a notable shift towards Self-Supervised Learning (SSL), with SSL approaches reported to outperform conventional supervised models [1], [2]. In this work, we explore the adaptation of such pitch-based SSL systems to the task of tempo estimation.