1. Introduction
The problem of automatic camera scene prediction on smartphones appeared soon after the introduction of the first mobile cameras. While the initial scene classification approaches used only manually designed features and simple machine learning algorithms, the availability of much more powerful AI hardware such as NPUs, GPUs and DSPs made it possible to use considerably more accurate and efficient deep learning-based solutions. Nevertheless, this task was not properly addressed in the literature until the introduction of the Camera Scene Detection Dataset (CamSDD) in [36], where the problem was carefully defined and training data for 30 different camera scene categories was provided along with a fast baseline solution. In this challenge, we take one step further in solving this task by imposing additional efficiency-related constraints on the developed models.