Prompt engineering: Difference between revisions

[[File:Prompt writing elements.png|thumb|Figure 1. Prompt writing elements. Source: Oppenlaender (2022)]]


A [[prompt]] usually includes a subject term; all other parts of the prompt are optional (figure 1). In practice, [[modifiers]] are often added to improve the resulting images and to give more control over the creation process. These modifiers are chosen through experimentation or from best practices learned from experience or online resources. <ref name="”2”"></ref> Modifiers can, for example, alter the style of the generated image or boost its quality, and the effects of style modifiers and quality boosters can overlap. Once a style modifier has been added, solidifiers (which work through repetition) can be applied to any of the other types of modifiers. The textual prompt can be divided into two main components: the physical and factual content of the image, and the stylistic considerations that govern how that content is displayed. <ref name="”2”"></ref><ref name="”7”">Witteveen, S and Andrews, M (2022). Investigating Prompt Engineering in Diffusion Models. arXiv:2211.15462v1 https://arxiv.org/pdf/2211.15462.pdf</ref>
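
The prompt structure described above can be illustrated with a minimal Python sketch. The <code>ImagePrompt</code> class, its field names, the comma separator, and the way repetition is applied are all assumptions made for illustration; they are not taken from the cited sources or from any particular text-to-image tool.

```python
from dataclasses import dataclass, field


@dataclass
class ImagePrompt:
    """Hypothetical container for the prompt elements described above."""

    subject: str  # the only required part of the prompt
    style_modifiers: list[str] = field(default_factory=list)
    quality_boosters: list[str] = field(default_factory=list)
    repeats: int = 1  # assumption: values above 1 repeat each modifier (a "solidifier")

    def render(self) -> str:
        """Join the subject and the (optionally repeated) modifiers with commas."""
        modifiers = self.style_modifiers + self.quality_boosters
        repeated = [m for m in modifiers for _ in range(self.repeats)]
        return ", ".join([self.subject, *repeated])


prompt = ImagePrompt(
    subject="a calico cat",
    style_modifiers=["oil painting"],
    quality_boosters=["highly detailed"],
    repeats=2,
)
print(prompt.render())
```

A prompt built from the subject alone is still valid in this sketch, which mirrors the statement that every part other than the subject term is optional.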


To enhance the quality of the output images, it is common to place specific keywords before and after the image description, following the formula prompt = [keyword<sub>1</sub>, ..., keyword<sub>m−1</sub>] [description] [keyword<sub>m</sub>, ..., keyword<sub>n</sub>]. For example, a user who wants to generate an image of a cat with a text-to-image model may use a prompt template that combines a description such as "a painting of a calico cat" with keywords such as "highly detailed", "cinematic lighting", and "dramatic atmosphere". The keywords give the model additional information and can improve the quality of the generated image. <ref name="”8”">Pavlichenko, N, Zhdanov and Ustalov, D (2022) Best Prompts for Text-to-Image Models and How to Find Them. arXiv:2209.11711v2</ref>
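
The keyword formula above can be sketched as a small Python function. The name <code>build_prompt</code>, its parameters, and the comma separator are assumptions made for illustration; the cited paper is not quoted here as prescribing this exact implementation.

```python
def build_prompt(description, keywords, m):
    """Place keywords 1..m-1 before the description and keywords m..n after it.

    `m` is 1-based, as in the formula; m = 1 puts every keyword after the
    description, and m = n + 1 puts every keyword before it.
    """
    if not 1 <= m <= len(keywords) + 1:
        raise ValueError("m must satisfy 1 <= m <= n + 1")
    before = keywords[: m - 1]
    after = keywords[m - 1 :]
    # Assumption: parts are joined with ", "; the formula does not fix a separator.
    return ", ".join([*before, description, *after])


keywords = ["highly detailed", "cinematic lighting", "dramatic atmosphere"]
print(build_prompt("a painting of a calico cat", keywords, m=2))
```

With <code>m=2</code>, the first keyword is placed before the description and the remaining two after it.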