Prompt engineering for image generation

A [[prompt]] usually includes a subject term, while any other parts of the prompt are optional (figure 1). In practice, [[modifiers]] are often added to improve the resulting images and to provide more control over the creation process. These modifiers are applied through experimentation or based on best practices learned from experience or online resources. <ref name="2" /> Modifiers can, for example, alter the style of the generated image or boost its quality, and the effects of style modifiers and quality boosters can overlap. Once a style modifier has been added, solidifiers (repetition of a term) can be applied to any of the other types of modifiers. The textual prompt can be divided into two main components: the physical and factual content of the image, and the stylistic considerations governing how that content is displayed. <ref name="2">Oppenlaender, J. (2022). A Taxonomy of Prompt Modifiers for Text-To-Image Generation. arXiv:2204.13988v2</ref><ref name="7">Witteveen, S. and Andrews, M. (2022). Investigating Prompt Engineering in Diffusion Models. arXiv:2211.15462v1. https://arxiv.org/pdf/2211.15462.pdf</ref>
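The sketch below illustrates one way such a prompt could be assembled programmatically from a subject term, style modifiers, quality boosters, and a solidifier. The function name and the example modifiers are illustrative assumptions, not taken from the cited papers.

<syntaxhighlight lang="python">
# Minimal sketch: building a text-to-image prompt from a subject term
# plus optional modifier categories. All names here are illustrative.

def build_prompt(subject, style_modifiers=(), quality_boosters=(), solidify=None):
    """Join a subject term with optional style modifiers and quality boosters.

    `solidify` optionally repeats one modifier several times, mirroring the
    'solidifier via repetition' idea described above.
    """
    parts = [subject]
    parts.extend(style_modifiers)
    parts.extend(quality_boosters)
    if solidify is not None:
        modifier, times = solidify
        parts.extend([modifier] * times)
    return ", ".join(parts)

prompt = build_prompt(
    subject="a calico cat sitting on a windowsill",
    style_modifiers=["oil painting", "impressionist"],
    quality_boosters=["highly detailed", "cinematic lighting"],
    solidify=("highly detailed", 2),  # repetition used as a solidifier
)
print(prompt)
# a calico cat sitting on a windowsill, oil painting, impressionist,
# highly detailed, cinematic lighting, highly detailed, highly detailed
</syntaxhighlight>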


To enhance the quality of the output images, it is common to include specific keywords before and after the image description, following the formula <math>\text{prompt} = [\text{keyword}_1, \ldots, \text{keyword}_{m-1}]\ [\text{description}]\ [\text{keyword}_m, \ldots, \text{keyword}_n]</math>. For example, a user wanting to generate an image of a cat with a text-to-image model may use a [[prompt template]] that combines the description of a painting of a calico cat with keywords such as highly detailed, cinematic lighting, and dramatic atmosphere. This approach provides additional information to the model and improves the quality of the generated image. <ref name="8">Pavlichenko, N., Zhdanov, and Ustalov, D. (2022). Best Prompts for Text-to-Image Models and How to Find Them. arXiv:2209.11711v2</ref>
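A minimal sketch of this keyword-sandwich template is shown below: a description is wrapped with leading and trailing keywords. The function name and the leading keywords are illustrative assumptions rather than values from the cited paper.

<syntaxhighlight lang="python">
# Sketch of the template prompt = [keyword_1 ... keyword_{m-1}] [description] [keyword_m ... keyword_n].
# Names and example keywords are assumptions for illustration only.

def apply_template(description, leading_keywords, trailing_keywords):
    """Wrap an image description with leading and trailing keywords."""
    return ", ".join([*leading_keywords, description, *trailing_keywords])

prompt = apply_template(
    description="a painting of a calico cat",
    leading_keywords=["masterpiece", "award-winning"],          # hypothetical leading keywords
    trailing_keywords=["highly detailed", "cinematic lighting", "dramatic atmosphere"],
)
print(prompt)
# masterpiece, award-winning, a painting of a calico cat,
# highly detailed, cinematic lighting, dramatic atmosphere
</syntaxhighlight>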


According to Oppenlaender (2022), there are several opportunities for future research in this field of study: