Saturday, January 28, 2023

Google AI Unveils Muse, a New Text-To-Image Transformer Model - Daniel Dominguez, Infoq

Google AI released a research paper about Muse, a new Text-To-Image Generation via Masked Generative Transformers that can produce photos of a high quality comparable to those produced by rival models like the DALL-E 2 and Imagen at a rate that is far faster. Muse is trained to predict randomly masked image tokens using the text embedding from a large language model that has already been trained. This job involves masked modeling in discrete token space. Muse uses a 900 million parameter model called a masked generative transformer to create visuals instead of pixel-space diffusion or autoregressive models.

https://www.infoq.com/news/2023/01/google-muse-text-to-image/