References
- Alexey Dosovitskiy et al., 2020, An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: https://arxiv.org/abs/2010.11929
- CLIP: https://github.com/openai/CLIP
- DALL-E: https://openai.com/blog/dall-e/
- OpenAI resources: https://openai.com
- Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever, 2021, Zero-Shot Text-to-Image Generation: https://arxiv.org/abs/2102.12092
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin, 2017, Attention Is All You Need: https://arxiv.org/abs/1706.03762
- GPT-4V system card: https://openai.com/research/gpt-4v-system-card
- Divergent semantic association: Honghua Chen and Nai Ding, 2023, Probing the Creativity of Large Language Models: Can models produce divergent semantic association?: https://arxiv.org/abs/2310.11158