I. Introduction
Artistic works reflect artists’ inspirations and perspectives on life, expressed in a creative and imaginative manner that has evolved alongside human history. While these artworks encapsulate profound introspection, subjective interpretations of existence, and fervent admiration for life, the stories they tell often remain unheard. As a result, traditional static artworks frequently fail to resonate with ordinary viewers, hindering their ability to appreciate the inner spirit and beauty embodied within. In the era of deep learning, however, technological advances have made it far easier for ordinary users to customize and stylize artistic content. This transformation is gradually reshaping how people create, consume, and share art, from everyday applications such as image style manipulation to virtual concepts such as the metaverse. Consequently, there is a growing demand to rebuild the interaction between artists and audiences. This leads us to an intriguing question: “Can we breathe life into existing artworks, enabling more vivid and resonant human-art interactions, while preserving their original content and style?” This question gives rise to the problem of “3D Animatable Artforms”, which involves extracting three-dimensional facial information from a given artistic face image while maintaining the integrity of its content and style.