Text-to-drawing synthesis with artistic control | CLIPDraw & StyleCLIPDraw
https://www.louisbouchard.ai/clipdraw/
https://arxiv.org/pdf/2111.03133.pdf
StyleCLIPDraw, a new model by Peter Schaldenbrand et al. that builds upon CLIPDraw by Kevin Frans et al., takes an image and a line of text as inputs and generates a new drawing that matches your text while following the style of the input image.
CLIP is a model developed by OpenAI that associates a line of text with an image. Both the text and the image are encoded into the same embedding space, so that a caption and an image with the same meaning end up very close to each other in that space. Using CLIP, the researchers could understand the text from the user input and generate an image out of it. If you are not familiar with CLIP yet, I would recommend reading the article I wrote on Towards AI about it together with DALL-E earlier this year.
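At its core, CLIP scores how well a text and an image match by the cosine similarity of their embeddings in that shared space. Here is a minimal sketch of just that scoring step, with made-up vectors standing in for the outputs of CLIP's real text and image encoders:

```python
import numpy as np

def clip_style_score(a, b):
    # CLIP-style score: dot product of the two L2-normalized embeddings
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Hypothetical embeddings (real CLIP vectors are 512-d or larger)
text_emb = np.array([0.2, 0.9, 0.1])
matching_image_emb = np.array([0.25, 0.85, 0.05])  # similar direction: high score
unrelated_image_emb = np.array([-0.8, 0.1, 0.6])   # different direction: low score

print(clip_style_score(text_emb, matching_image_emb))
print(clip_style_score(text_emb, unrelated_image_emb))
```

CLIPDraw and StyleCLIPDraw use this score as an optimization target: the drawing's parameters are iteratively adjusted so that the rendered image's embedding moves closer to the text's embedding.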
References
- CLIPDraw: Frans, K., Soros, L.B. and Witkowski, O., 2021. CLIPDraw: exploring text-to-drawing synthesis through language-image encoders. https://arxiv.org/abs/2106.14843
- StyleCLIPDraw: Schaldenbrand, P., Liu, Z. and Oh, J., 2021. StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis. https://arxiv.org/abs/2111.03133
- StyleCLIPDraw code: https://github.com/pschaldenbrand/StyleCLIPDraw