Article "Fuzzy Ingenuity" on AI image generators now published

The current issue of Image, the Journal of Interdisciplinary Image Studies (37/2023), features an article from the RHET AI Center on AI-based image generators. In their paper Fuzzy Ingenuity: Creative Potentials and Mechanics of Fuzziness in Processes of Image Creation with AI-Based Text-to-Image Generators, the authors Dr. Erwin Feyersinger, Lukas Kohmann and Michael Pelzer examine how fuzziness arises when images are created with AI-based text-to-image generators and what creative potential it holds. They discuss different theoretical perspectives to make the mechanisms of this fuzziness tangible.

Image caption: These four images were created with Stable Diffusion, using the title of the article, "Fuzzy Ingenuity: Creative Potentials and Mechanics of Fuzziness in Processes of Image Creation with AI-Based Text-to-Image Generators", as the prompt.

This issue of Image emerged from the workshop Dall‑E, Midjourney, Stable Diffusion: Responses from Media Studies toward a "New Paradigm" of Image Production, and includes other papers dealing with AI image generators such as Dall‑E, Midjourney, and Stable Diffusion.

Abstract: Fuzzy Ingenuity. Creative Potentials and Mechanics of Fuzziness in Processes of Image Creation with AI-Based Text-to-Image Generators

This explorative paper focuses on fuzziness of meaning and visual representation in connection with text prompts, image results, and the mapping between them by discussing the question: How does the fuzziness inherent in artificial intelligence-based text-to-image generators such as DALL·E 2, Midjourney, or Stable Diffusion influence creative processes of image production – and how can we grasp its mechanics from a theoretical perspective? In addressing these questions, we explore three connected interdisciplinary approaches: (1) Text-to-image generators give new relevance to Hegel’s notion of language as ‘the imagination which creates signs’. They reinforce how language itself inevitably acts as a meaning-transforming system and extend the formative dimension of language with a technology-driven facet. (2) From the perspective of speech act theory, we discuss this explorative interaction with an algorithm as performative utterances. (3) In further examining the pragmatic dimension of this interaction, we discuss the creative potential arising from the visual feedback loops it includes. Following this thought, we show that the fuzzy variety of images which DALL·E 2 presents in response to one and the same text prompt contributes to a highly accelerated form of externalized visual thinking.
