I. Introduction
Modern data mining methods are increasingly applied in materials science for regression-based predictive modeling because they are effective at extracting and exploiting the hidden information in materials datasets, which aids the process of materials discovery [1]–[7]. This has been made possible by the availability of large, computationally derived materials databases [8], [9], easy-to-use data mining tools, and advances in machine learning (ML) and deep learning (DL) algorithms that extract hidden information from raw inputs and build accurate and robust models for various materials properties [10]–[15]. Because materials property prediction is a regression task and the model input used to train ML/DL methods is usually a one-dimensional numerical vector obtained by pre-processing the raw materials input, traditional ML algorithms [10], [11] and DL models composed of fully connected layers [16]–[21] are used extensively. However, because the methods for obtaining experimental data, and in some cases even computational data, are costly and time-consuming, the majority of materials datasets are small, which limits highly accurate models to the few materials properties for which a large amount of data is available [22], [23]. Moreover, only a limited number of generalized hand-engineered representations can be derived from raw materials data [24], [25], which makes it harder to improve the accuracy of predictive models built on such small and specialized training datasets. Therefore, advanced data mining techniques such as transfer learning (TL) [26]–[29] and representation learning (RL) [30]–[35] are often applied to overcome the small-data bottleneck by reusing existing knowledge to boost the predictive performance of the model.