1. Introduction
Image style transfer is a popular topic that aims to render an input content image in a desired painting style. A transfer model requires two pieces of information: "what content" to preserve from the input image and "which painting style" to apply [17], [29]. Conventional style transfer methods rely on a style image, provided alongside the content image, to supply the style information [2], [7], [13], [24], [30]. However, people have specific aesthetic needs, and finding a single style image that perfectly matches one's requirements is often inconvenient or infeasible. Text or language is a natural interface for describing the preferred style: compared with a style image, a textual description is easier to obtain and simpler to adjust. Furthermore, achieving perceptually pleasing artist-aware stylization typically requires learning from collections of art, since a single reference image is rarely representative enough.

In this work, we learn arbitrary artist-aware image style transfer, which transfers the painting style of any artist to a target image using texts and/or images. Most studies on universal style transfer [24], [29] restrict their applications to reference images as style indicators, which is less creative and less flexible. Text-driven style transfer has been studied [9], [17] and has shown promising results with a simple text prompt. However, these approaches require either costly data collection and labeling or online optimization for every content and style. Our proposed Text-driven artist-aware Style Transfer model, TxST, overcomes both problems and achieves better and more efficient stylization.