AI Art Application and Improvements Handbook

This AI Art Application and Improvements Handbook is intended to help people create free, useful media for the public domain using AI art generators, with a focus on getting things done in practice at all skill levels. It describes notable existing and potential applications and equips the reader with information on how best to implement them.

Launching
There are many ways you can use these tools. Main ways include:
 * You can install Stable Diffusion locally if you have a good graphics card. Whether that is a good idea depends on your hardware and needs. If you do so, AUTOMATIC1111 WebUI is probably the most advanced software to use but alternatives are listed here and also have benefits.
 * You can use a web platform like playgroundai.com to use it online (on many sites that is possible for free)
 * You can use an extension for art software such as Krita or Photoshop

Prompts
Which prompts work best differs by AI generator. The promptomania prompt builder is a great place to get started with prompts and to have a cheatsheet of different art styles one could use. It is missing many styles but may become more complete over time and be good enough for learning purposes. Many sites such as openart.ai and playgroundai.com let you see many other filterable/searchable images along with their prompts which you could build upon and learn from.

Here is a further comprehensive resource and here a list of resources for Stable Diffusion. You can use style studies (selected comprehensive ones: 1 2 3 4) to learn which styles are available and how multiple styles can be combined. However, choosing a style is not the tricky part and does not need to be learned: you can simply add phrases like "comic style", "3D render", or "matte painting" to the prompt. When sites offer pre-made styles, they usually just attach several terms to the end of the prompt.
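The way pre-made styles work can be illustrated with a minimal sketch. The function name `apply_style` and the preset terms are hypothetical and not taken from any particular site; the sketch only shows the idea of attaching style terms to the end of a comma-separated prompt.

```python
def apply_style(prompt, style_terms):
    """Attach style terms to the end of a prompt, separated by commas."""
    # Drop surrounding whitespace and a trailing comma from the base prompt.
    base = prompt.strip().rstrip(",").strip()
    # Ignore empty style terms.
    extras = [term.strip() for term in style_terms if term.strip()]
    parts = ([base] if base else []) + extras
    return ", ".join(parts)


# Hypothetical preset, for illustration only.
COMIC_PRESET = ["comic style", "bold outlines"]

styled = apply_style("a lighthouse at dusk, stormy sea", COMIC_PRESET)
print(styled)
```

Keeping presets as plain lists of terms makes it easy to see exactly what a "style" adds to the prompt and to combine several of them.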

Misgenerations and creating improved versions
As you can see below there still are some issues with these images. People who have better AI art skills may be able to generate much better images. Usually one may need to do slight manual editing.

Moreover, over time these images could be improved by their uploaders or other people using, for example:
 * the Clipdrop cleanup tool
 * inpainting (requires some skills)
 * AI art web platforms' face restoration
 * upscaling features
 * manually editing the images in image editors like GIMP or Photoshop
 * recreating the image using the same or similar prompts (example)
 * AI text removal tools (example)

If you can improve an existing image or an image you uploaded earlier on Wikimedia Commons, upload it as a new version, not as a separate new file. If the image has text, it can be removed via the listed ways. However, to keep text out of the image in the first place, it is best to use negative prompts, although that can be problematic, for example when you'd like to generate a street scene with store texts visible in the background. This is a good example of a specific skill to learn when generating AI art: creating text that fits neatly into the image.

You need to keep adjusting the prompt until you get good results. At some point, however, it is sometimes better to generate a new image from the same prompt rather than adjust the prompt further. When doing so, make sure the seed is set to random rather than fixed, unless you want the new image to look like the one just generated.
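The seed advice can be sketched as follows. This is not tied to any particular generator: `pick_seed` and the 32-bit seed range are assumptions made for illustration, and real tools may expose the seed differently.

```python
import random
from typing import Optional

MAX_SEED = 2**32 - 1  # assumed range; real tools may differ


def pick_seed(previous_seed: Optional[int] = None, keep_look: bool = False) -> int:
    """Reuse the previous seed to stay close to the last image, else draw a new one."""
    if keep_look and previous_seed is not None:
        return previous_seed
    return random.randint(0, MAX_SEED)


first = pick_seed()                                        # random seed for a new image
similar = pick_seed(previous_seed=first, keep_look=True)   # same seed, similar image
fresh = pick_seed(previous_seed=first)                     # random again for a new attempt
```

Recording the seed that was actually used alongside the prompt makes it possible to return to a promising image later.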

You can also generate a new image from the image just generated via img2img and then put the original underneath the newly generated image as a layer in GIMP. Then erase parts of the upper layer so that the original shows through in the places where you'd like it to (example).

Negative prompt
If you see things in your generated image that you don't want there, or anticipate that the AI generator may add them or misunderstand your prompt in certain ways, add these as negative prompt terms.

Examples of useful negative prompt terms to use when you generate…
 * humans:  (TBA)
 * rooms:

Add more terms as unwanted things show up during generation, to exclude them from the next images. You can also use a result in img2img and try to remove the unwanted parts, e.g. by using the prior prompt with an additional negative prompt term, if cleanup tools don't remove them well.
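Growing a negative prompt round by round can be sketched like this. The helper `add_negative_terms` and the example terms are hypothetical and only illustrate the bookkeeping; they are not a recommended term list.

```python
def add_negative_terms(current, new_terms):
    """Merge newly observed unwanted terms into a comma-separated negative prompt."""
    terms = [t.strip() for t in current.split(",") if t.strip()]
    seen = {t.lower() for t in terms}
    for term in new_terms:
        cleaned = term.strip()
        # Skip empty terms and case-insensitive duplicates.
        if cleaned and cleaned.lower() not in seen:
            terms.append(cleaned)
            seen.add(cleaned.lower())
    return ", ".join(terms)


negative = ""
# Round 1: the image contained unwanted text and a watermark (illustrative terms).
negative = add_negative_terms(negative, ["text", "watermark"])
# Round 2: another artifact showed up; the duplicate is ignored.
negative = add_negative_terms(negative, ["Watermark", "extra fingers"])
print(negative)
```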

Parameters
Some images have their parameters specified. A step count of around 40 often yields the best results. Setting the prompt strength too high, such as over 10, makes it more difficult to get a good picture.
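These rules of thumb can be written down as a small settings check. This is a sketch under assumptions: it treats "prompt strength" as the setting many tools call guidance scale or CFG scale, and the class name `GenerationSettings` and the default of 7.0 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    steps: int = 40              # starting point suggested in this handbook
    guidance_scale: float = 7.0  # hypothetical default for "prompt strength"

    def warnings(self):
        """Return human-readable notes about settings that may hurt results."""
        notes = []
        if self.steps < 1:
            notes.append("step count must be at least 1")
        if self.guidance_scale > 10:
            notes.append("prompt strength above 10 may make good results harder to get")
        return notes


print(GenerationSettings().warnings())
print(GenerationSettings(guidance_scale=12).warnings())
```

The thresholds are rules of thumb rather than hard limits, so the check only warns and never rejects a setting.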

Differences between generators
Stable Diffusion is open source, so it is the generator recommended and focused on here. However, as of 2023, Midjourney may often generate better images, and DALL-E probably does as well in some or many cases. One difference between SD and DALL-E, for example, is that SD prompts are phrased like tags separated by commas rather than as whole sentences. See these pages for a comparison between software results for the same prompts as well as the style studies linked above.

Paleoart of the ancient past
AI art can be used to create realistic-looking scenes that depict the past as it may have looked to the best of our knowledge, for example including high-resolution depictions of extinct ancient organisms. For accuracy, substantial skills are required. For such images, img2img techniques can be used.

Good anthropological knowledge may be required to be able to create an image that is not clearly inaccurate and likely a realistic depiction. For example, a major flaw is that AI art generators are likely to generate hairstyles that were impossible to highly unlikely in the deep past of pre-humans and ancient humans. See also "Inaccurate paleoart" on WMC. Crowd-reviewing systems and practices may evolve that provide feedback so that AI art engineers can modify their images according to best available scientific knowledge. Future developments may enable combination of paleontological data and tools and paleoart techniques with AI art software to enable more accurate and useful images. For now, if you do not have good anthropological knowledge try to collaborate with somebody who has before putting your image out in the public domain for other people to use.

Caricatures and public characters
In the 2020s it became more easily possible to create artworks using public characters due to the emergence of AI art generators like Stable Diffusion.

This:
 * democratized the creation of caricatures and political art
 * enabled problematic
 * enabled humorous art using known characters, including fictional characters (prime example: 'Harry Spotter')

It works well with some specific public characters without any kind of extra training. Some of these, such as Vladimir Putin, are well known to be easy to generate in realistic-looking ways.

Historical scenes
AI art can be used to create realistic-looking scenes that depict the past either as it may have looked to the best of our knowledge or as stories depict it. The latter may also include images for imaginary stories of the past, illustrating in more visual ways how the imaginaries of past people may have looked.

Whether or not there are still some minor glitches may not matter very much when you're interested in visualizing, for example, what ordinary daily life experienced by average people may have looked like in high resolution, or when creating the first public-domain image of some historical event, rather than one that is locked away.

Using tools like one can train AIs on a set of images based on a historical figure. Below are some examples which may deviate somewhat from how (Ferrandino d'Aragona) looked like at an older age according to the artistic drawing that is the first image here and the second image that was drawn a whole hundred years after he died:

As just explained, AI generators still have problems with generating faces and other issues. Please keep that in mind since correcting that can require significant skills and may limit the usefulness or realism of the images.

Images can also focus on historical events entirely without any kind of historic character, realistic or not, in the foreground.

Educational games
AI art can be used to generate the images for board games, for example for the cards. These can be educational games or otherwise useful. Note that in such cases you should only generate the image, not full cards because the text for example will be gibberish.

Objects and topics for which no free media is available
For example, AI art can show what pulp science fiction comics looked like, what a science fiction subgenre is about, or what its styles and themes are. It can illustrate what a certain style or object looks like, among other things, but it requires a disclaimer that the image is AI-generated. One way this can be useful is in showing people which media are currently missing but would be useful for a given concept.

Illustrating contents of books
For the last image, text was removed with a text removal tool as listed above and then added via GIMP.

Illustrating technologies, ideas and concepts
Especially useful if no other or only low-quality images are available for the concept

Creative children's games and sketches
Children could make drawings, then use these drawings as image input for img2img generation, describing what the image is intended to show. The child's description is then used for the prompt that is added to the sketch input. This may enable children to build up their creativity and imaginative skills.
 * Pro imagination creative AI art kids game

There could be an app for this with voice input, or adults could help: the kid first makes the sketch, then the adult takes a photo and asks what it is supposed to show, so that the AI art generator produces images which the child can refine and use as inspiration for further images, modifications, feedback and so on. This lowers the minimum cognitive and technical skills required for artistic engagement, enabling novel ways of imaginative play, especially for children.

As part of games
Beyond more accessible card art design and similar applications, the AI art generation itself could be part of games. Such games are entertaining and could also build AI art generation skills.

Multiple (e.g. two) players take turns at generating an AI art image by altering the prompt or writing a new prompt. The starting player draws a scene, a being, or something similar. The second tries to generate an image where what is depicted is turned on its head or is changed in another specified way, such as being destroyed or successfully defeated. Players can either take turns, with the first image that achieves the specified intention winning that round, or the second player can have multiple tries, with the best outcome being a successful image at the first try. This works best when the prompt is only changed and not completely replaced, so that the object stays similar; one may also specify that the seed remain the same.
 * Sketch Wars

Similar to the word guessing game Taboo, people must create an image that enables others to quickly correctly guess the concept they are trying to depict. Multiple specified terms can't be used in the prompt. Only e.g. three tries are allowed for the image and the concepts aren't as simple as "tree" but relatively difficult to visualize.
 * Concept Guessing
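The prompt rules of such a guessing game could be checked automatically. The following is a minimal sketch; the function names, the whole-word matching, and the example terms are assumptions for illustration, with the limit of three tries taken from the description above.

```python
import re

MAX_TRIES = 3  # "e.g. three tries" in the rules above


def forbidden_terms_used(prompt, forbidden):
    """Return the forbidden terms that appear as whole words in the prompt."""
    lowered = prompt.lower()
    used = []
    for term in forbidden:
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        if re.search(pattern, lowered):
            used.append(term)
    return used


def is_valid_attempt(prompt, forbidden, tries_so_far):
    """An attempt counts only if tries remain and no forbidden term is used."""
    return tries_so_far < MAX_TRIES and not forbidden_terms_used(prompt, forbidden)


# Illustrative round: the concept is "time", so these words are off limits.
taboo = ["time", "clock"]
print(forbidden_terms_used("a clock melting, surreal", taboo))
print(is_valid_attempt("an hourglass in the desert", taboo, tries_so_far=0))
```

Whole-word matching is a deliberate simplification: it does not catch plurals or synonyms, which players would still need to police themselves.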

Known problems and current state of avoiding them
There are many ways known problems could get identified and fixed or mitigated. These include:


 * Updates to the AI txt2img image generator software
 * Models specifically tuned for specific purposes, especially Stable Diffusion ones since that software is open source; see Civitai Models
 * Manual improvements during prompting or via inpainting, img2img, and image editor software

Whether such problems will persist, and which ones, is unknown and has not yet been thoroughly investigated. At some point it may be possible, for example, to use Wikidata items instead of words. There is also work on user-provided concepts (like an object or a style) that are learned from a few images, so that these concepts can be incorporated into prompts via the newly associated word or words.