
Name your style: text-guided artistic style transfer


Abstract:

Image style transfer has attracted widespread attention in recent years. Despite its remarkable results, it requires additional style images as references, making it less flexible and less convenient. Text is the most natural way to describe a style: it can capture implicit, abstract styles, such as those of specific artists or art movements. In this work, we propose a text-driven style transfer (TxST) method that leverages advanced image-text encoders to control arbitrary style transfer. We introduce a contrastive training strategy to effectively extract style descriptions from the image-text model (i.e., CLIP), which aligns the stylization with the text description. To this end, we also propose a novel cross-attention module to fuse style and content features. Finally, we achieve arbitrary artist-aware style transfer, learning and transferring specific artistic characteristics such as those of Picasso, oil painting, or a rough sketch. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods. Moreover, it can mimic the styles of one or many artists to achieve attractive results, highlighting a promising future direction.
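
The abstract describes a cross-attention module that fuses style and content features. The following is a minimal, generic sketch of such a fusion step in PyTorch, assuming content features act as queries and style embeddings act as keys/values; the dimensions, layer names, and the use of nn.MultiheadAttention are illustrative assumptions, not the exact architecture proposed in the paper.

```python
# Generic cross-attention fusion between content and style features (illustrative sketch).
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim=512, num_heads=8):
        super().__init__()
        # Content tokens provide the queries; style tokens provide keys/values.
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=num_heads, batch_first=True)
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, content_tokens, style_tokens):
        # content_tokens: (B, N_c, dim) flattened content feature map
        # style_tokens:   (B, N_s, dim) style embedding(s) from an image or text encoder
        q = self.norm_q(content_tokens)
        kv = self.norm_kv(style_tokens)
        fused, _ = self.attn(q, kv, kv)            # inject style information into content tokens
        return content_tokens + self.proj(fused)   # residual connection preserves content structure

# Example: fuse a 16x16 content feature map with a single style token (e.g., a text embedding).
fusion = CrossAttentionFusion(dim=512)
content = torch.randn(1, 256, 512)   # 16*16 spatial tokens
style = torch.randn(1, 1, 512)       # one style token
out = fusion(content, style)         # (1, 256, 512)
```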
Date of Conference: 17-24 June 2023
Date Added to IEEE Xplore: 14 August 2023
Conference Location: Vancouver, BC, Canada

1. Introduction

Image style transfer is a popular topic that aims to apply a desired painting style to an input content image. The transfer model requires information about "what content" is in the input image and "which painting style" to apply [17], [29]. Conventional style transfer methods require a content image accompanied by a style image to provide the content and style information [2], [7], [13], [24], [30]. However, people have specific aesthetic needs, and finding a single style image that perfectly matches one's requirements is often inconvenient or infeasible. Text or language is a natural interface to describe the preferred style: compared with a style image, a text description of the desired style is easier to obtain and more adjustable. Furthermore, achieving perceptually pleasing artist-aware stylization typically requires learning from collections of art, as a single reference image is rarely representative enough. In this work, we learn arbitrary artist-aware image style transfer, which transfers the painting style of any artist to the target image using texts and/or images. Most studies on universal style transfer [24], [29] restrict themselves to reference images as style indicators, which limits creativity and flexibility. Text-driven style transfer has been studied [9], [17] and has shown promising results using a simple text prompt, but these approaches require either costly data collection and labeling or online optimization for every content and style. Our proposed text-driven artist-aware style transfer model, TxST, overcomes both problems and achieves better and more efficient stylization.
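
The introduction motivates using a text prompt, encoded by an image-text model such as CLIP, to describe the desired style. The snippet below is a minimal sketch of how a style prompt can be turned into an embedding with the public openai/CLIP package; the prompt, the "ViT-B/32" backbone, and how the embedding conditions the stylization network are illustrative assumptions, not the paper's training setup.

```python
# Encode a textual style description into a normalized CLIP embedding (illustrative sketch).
# Requires: pip install git+https://github.com/openai/CLIP.git
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

prompt = "an oil painting in the style of Picasso"   # hypothetical style description
tokens = clip.tokenize([prompt]).to(device)

with torch.no_grad():
    style_embedding = model.encode_text(tokens)                              # shape (1, 512)
    style_embedding = style_embedding / style_embedding.norm(dim=-1, keepdim=True)

# style_embedding can then condition a style-transfer network, for instance as the
# style tokens fed to a cross-attention fusion module like the sketch above.
```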

References
1. K. Nichol, "Painter by Numbers, WikiArt", 2016, [online] Available: https://www.kaggle.com/c/painter-by-numbers.
2. Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu and Jiebo Luo, "ArtFlow: Unbiased image style transfer via reversible neural flows", CVPR, 2021.
3. Federico Bianchi, Giuseppe Attanasio, Raphael Pisoni, Silvia Terragni, Gabriele Sarti and Sri Lakshmi, "Contrastive language-image pre-training for the Italian language", 2021.
4. Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi and Geoffrey E. Hinton, "Big self-supervised models are strong semi-supervised learners", NeurIPS, 2020.
5. Nancy Chinchor, "MUC-4 evaluation metrics", Proceedings of the 4th Conference on Message Understanding, 1992.
6. M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed and A. Vedaldi, "Describing textures in the wild", CVPR, 2014.
7. Yingying Deng, Fan Tang, Weiming Dong, Chongyang Ma, Xingjia Pan, Lei Wang, et al., "StyTr2: Image style transfer with transformers", CVPR, 2022.
8. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al., "An image is worth 16x16 words: Transformers for image recognition at scale", ICLR, 2021.
9. Tsu-Jui Fu, Xin Eric Wang and William Yang Wang, "Language-driven artistic style transfer", ECCV, 2022.
10. Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik and Daniel Cohen-Or, "StyleGAN-NADA: CLIP-guided domain adaptation of image generators", 2021.
11. Leon A. Gatys, Alexander S. Ecker and Matthias Bethge, "Image style transfer using convolutional neural networks", CVPR, 2016.
12. Nisha Huang, Fan Tang, Weiming Dong and Changsheng Xu, "Draw your art dream: Diverse digital art synthesis with multimodal guided diffusion", MM '22, pp. 1085-1094, 2022.
13. Xun Huang and Serge Belongie, "Arbitrary style transfer in real-time with adaptive instance normalization", ICCV, 2017.
14. Diederik P. Kingma and Max Welling, "Auto-encoding variational bayes", 2014.
15. Nicholas Kolkin, Jason Salavon and Gregory Shakhnarovich, "Style transfer by relaxed optimal transport and self-similarity", CVPR, 2019.
16. Alexander Kuhnle and Ann Copestake, "ShapeWorld - a new test methodology for multimodal language understanding", 2017.
17. Gihyun Kwon and Jong Chul Ye, "CLIPstyler: Image style transfer with a single text condition", 2021.
18. Jonas Köhler, Andreas Krämer and Frank Noé, "Smooth normalizing flows", NeurIPS, 2021.
19. Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu and Ming-Hsuan Yang, "Universal style transfer via feature transforms", NeurIPS, 2017.
20. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, et al., "Microsoft COCO: Common objects in context", ECCV, 2014.
21. Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Meiling Wang, Xin Li, et al., "AdaAttN: Revisit attention mechanism in arbitrary neural style transfer", ICCV, 2021.
22. Zhisong Liu, Robin Courant and Vicky Kalogeiton, "FunnyNet: Audiovisual learning of funny moments in videos", ACCV, 2022.
23. Zhi-Song Liu, Vicky Kalogeiton and Marie-Paule Cani, "Multiple style transfer via variational autoencoder", ICIP, 2021.
24. Dae Young Park and Kwang Hee Lee, "Arbitrary style transfer with style-attentional networks", CVPR, 2019.
25. Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al., "Learning transferable visual models from natural language supervision", 2021.
26. Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, et al., "Zero-shot text-to-image generation", 2021.
27. Nerdy Rodent, "VQGAN-CLIP", 2022, [online] Available: https://github.com/nerdyrodent/VQGAN-CLIP.
28. Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser and Björn Ommer, "High-resolution image synthesis with latent diffusion models", CVPR, pp. 10674-10685, 2022.
29. Artsiom Sanakoyeu, Dmytro Kotovenko, Sabine Lang and Björn Ommer, "A style-aware content loss for real-time HD style transfer", ECCV, 2018.
30. Artsiom Sanakoyeu, Dmytro Kotovenko, Sabine Lang and Björn Ommer, "A style-aware content loss for real-time HD style transfer", ECCV, 2018.