IRIS Institutional Research Information System - AIR Archivio Istituzionale della Ricerca

Grimm in Wonderland: Prompt Engineering with Midjourney to Illustrate Fairytales

M. Ruskov (first author)

2023

Abstract

The quality of text-to-image generation is continuously improving, yet the boundaries of its applicability are still unclear. In particular, refinement of the text input with the objective of achieving better results, commonly called prompt engineering, does not so far seem to have been geared towards work with pre-existing texts. We investigate whether text-to-image generation and prompt engineering could be used to generate basic illustrations of popular fairytales. Using Midjourney v4, we engage in action research with a dual aim: to attempt to generate 5 believable illustrations for each of 5 popular fairytales, and to define a prompt engineering process that starts from a pre-existing text and arrives at an illustration of it. We arrive at a tentative 4-stage process: i) initial prompt, ii) composition adjustment, iii) style refinement, and iv) variation selection. We also discuss three reasons why the generation model struggles with certain illustrations: difficulties with counts, bias from stereotypical configurations, and inability to depict overly fantastic situations. Our findings are not limited to the specific generation model and are intended to be generalisable to future ones.

International co-authors: No
Language of the contribution: English
Keywords: text-to-image generation; prompt engineering; action research; fairytales
Scientific-disciplinary sectors of the contribution (display only): Sector INF/01 - Informatica (Computer Science)
Type: Conference contribution
Peer review: Anonymous experts
Classification by type of research: Applied research
Publication classification: Scientific publication
Project title: Values across Space and Time (VAST)
Project acronym: VAST
Funder: European Commission
Funding programme: H2020
Contract no.: 101004949
Volume title: IRCDL 2023 : Information and Research Science Connecting to Digital and Library Science 2023
Volume editors: A. Bardi, A. Falcon, S. Ferilli, S. Marchesin, D. Redavid
First place of publication: Aachen
Publisher: CEUR-WS
Publication date: 28 March 2023
First page: 180
Last page: 191
Number of pages: 12
Series: CEUR WORKSHOP PROCEEDINGS
Volume number: 3365
Volume type: Internationally distributed volume
Contribution published as Gold or Diamond Open Access: Diamond
Conference name: Information and Research Science Connecting to Digital and Library Science
Conference location: Bari
Conference year: 2023
Conference edition number: 19
Conference type: National conference
Section: Submitted contribution
URL: https://ceur-ws.org/Vol-3365/paper6.pdf
Source database: ORCID
Scopus identifier: 2-s2.0-85152035311
Adherence to the University Open Access policy: I adhere
All authors: M. Ruskov
Type: Book Part (author)
Fulltext: open
Faculty site type: 273
Citation: Grimm in Wonderland: Prompt Engineering with Midjourney to Illustrate Fairytales / M. Ruskov (CEUR WORKSHOP PROCEEDINGS). - In: IRCDL 2023 : Information and Research Science Connecting to Digital and Library Science 2023 / [edited by] A. Bardi, A. Falcon, S. Ferilli, S. Marchesin, D. Redavid. - Aachen : CEUR-WS, 2023 Mar 28. - pp. 180-191 (Contribution presented at the 19th conference Information and Research Science Connecting to Digital and Library Science, held in Bari in 2023).
Type: info:eu-repo/semantics/bookPart
Number of authors: 1
Type: Research outputs::03 - Contribution in volume
Appears in types: 03 - Contribution in volume

Files in this item:

File	Size	Format
paper6.pdf (open access; type: Publisher's version/PDF)	19.09 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2434/1013468
