1. Introduction
In precision agriculture, Unmanned Aerial Vehicles (UAVs) have emerged as a pivotal tool for monitoring agricultural landscapes efficiently. UAVs provide much higher-resolution images than satellites, capturing fine-grained details of agricultural fields. Accurate anomaly detection in UAV images is crucial for the early identification of potential issues such as pest infestations, diseases, and nutrient deficiencies. The dynamic and diverse nature of agricultural fields compounds the difficulty of this task: compared to other anomaly detection settings, anomalies can vary greatly in appearance due to factors such as crop type, growth stage, and environmental conditions (see Figure 1). Thus, there is a need for a completely label-free approach to training anomaly detection models, so that they can be applied across different crops and different kinds of anomalies. Traditionally, supervised learning methods have been used for anomaly detection systems [2], [4], [11], [21]. These methods are inherently limited by their dependence on large sets of annotated data, which are labor-intensive to create and may not capture the full spectrum of possible anomalies. Even unsupervised and self-supervised methods [6], [12], [15], [29], [30], which do not use explicit anomaly labels, depend on training with only "normal" data, so a user must still curate a training set that contains no anomalies of any type.