What Do AI-Generated Images "Want"? Unpacking the Business Opportunity

Published: 2025/10/23 8:48:47

Title & Ultra-Short Summary: Turning the "wants" of AI-generated images into business! ✨

A gyaru breaks down a dense academic paper so it's easy to follow! 💖 If you work on new business development at an IT company, this one's a must-read 👀

Gyaru-Style Sparkle Points ✨

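● Turns out AI-generated images are surprisingly hungry for "reality"! Meaning photo-like visuals are what land with people 😉
● Tech that can generate specific, concrete "things" is going to matter for business! 🤩
● Mixing different modes of expression has big potential to create new value! 💎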

Detailed Explanation

Background

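With AI image generation, making images from text is the mainstream approach, right? 🤔 But what happens behind the scenes is super complex, so it looks like magic 🧙‍♀️ The IT industry keeps launching services built on this tech, so it's important to actually understand it 💕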

Method

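Building on W.J.T. Mitchell's essay 'What do pictures want?', the paper asks what AI-generated images "want"! 🧐 Earlier thinking focused on how pictures get interpreted and on what their creators intended, but this one puts the spotlight on the AI images themselves, and that's the fresh part 🌟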


What do AI-Generated Images Want?

Amanda Wasielewski

W.J.T. Mitchell's influential essay 'What do pictures want?' shifts the theoretical focus away from the interpretative act of understanding pictures and from the motivations of the humans who create them to the possibility that the picture itself is an entity with agency and wants. In this article, I reframe Mitchell's question in light of contemporary AI image generation tools to ask: what do AI-generated images want? Drawing from art historical discourse on the nature of abstraction, I argue that AI-generated images want specificity and concreteness because they are fundamentally abstract. Multimodal text-to-image models, which are the primary subject of this article, are based on the premise that text and image are interchangeable or exchangeable tokens and that there is a commensurability between them, at least as represented mathematically in data. The user pipeline that sees textual input become visual output, however, obscures this representational regress and makes it seem like one form transforms into the other -- as if by magic.

cs / cs.CY / cs.AI
