Abstract—This survey reviews several approaches to text-to-image generation. The first is the Cross-modal Semantic Matching Generative Adversarial Network (CSM-GAN), which aims to increase semantic consistency between text descriptions and synthesised images for fine-grained text-to-image generation. CSM-GAN includes two further modules, the Text Encoder Module and the Textual-Visual Semantic Matching Module. We then discuss Imagen, a text-to-image diffusion model that combines photorealism with deep language understanding and is evaluated on the COCO dataset. Lastly, we discuss text-to-image synthesis more broadly, which automates image generation using conditional generative models and GANs and contributes to progress in artificial intelligence and deep learning. Based on these approaches, we present a review of text-to-image generation using generative AI.

Keywords—Generative AI, Diffusion model, Text-to-image, Imagen, CSM-GAN