Sketchformer: Transformer-Based Representation for Sketched Structure

Abstract:

Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in vector form, i.e., as a sequence of strokes. Sketchformer effectively addresses multiple tasks: sketch classification, sketch-based image retrieval (SBIR), and the reconstruction and interpolation of sketches. We report several variants exploring continuous and tokenized input representations, and contrast their performance. Our learned embedding, driven by a dictionary-learning tokenization scheme, yields state-of-the-art performance in classification and image retrieval tasks when compared against baseline representations driven by LSTM sequence-to-sequence architectures: SketchRNN and derivatives. We show that sketch reconstruction and interpolation are improved significantly by the Sketchformer embedding for complex sketches with longer stroke sequences.
Date of Conference: 13-19 June 2020
Date Added to IEEE Xplore: 05 August 2020
Conference Location: Seattle, WA, USA
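
The abstract mentions a tokenized input representation driven by dictionary learning but does not spell out the scheme. As a rough illustration only, the sketch below assumes the dictionary is a k-means codebook over pen offsets (dx, dy), with one extra token marking a pen lift. The function names (`kmeans`, `nearest`, `tokenize`), the pen-lift convention, the codebook size, and the toy data are all hypothetical and are not taken from the paper.

```python
import random


def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over 2-D points; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids


def tokenize(sketch, centroids):
    """Map a sketch, given as (dx, dy, pen_lifted) steps, to integer tokens.

    Tokens 0..k-1 index the codebook; token k marks a pen lift
    (an assumed convention for this example).
    """
    k = len(centroids)
    tokens = []
    for dx, dy, pen_lifted in sketch:
        tokens.append(nearest((dx, dy), centroids))
        if pen_lifted:
            tokens.append(k)
    return tokens


# Toy sketch: two strokes, each ending with a pen lift.
sketch = [(1.0, 0.0, 0), (1.1, 0.1, 0), (0.0, 1.0, 1), (-1.0, 0.0, 0), (-0.9, 0.1, 1)]
offsets = [(dx, dy) for dx, dy, _ in sketch]
codebook = kmeans(offsets, k=3)
tokens = tokenize(sketch, codebook)
print(tokens)
```

The resulting integer sequence is the kind of input a transformer encoder could embed like word tokens; a continuous variant would instead feed the raw (dx, dy, pen state) values directly.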