Assessing Creativity in Text-to-Image Generation: A Quantitative Analysis using Structured Human Rating Metrics
DOI:
https://doi.org/10.15157/IJITIS.2025.8.2.355-373
Keywords:
Text-to-Image Generation, Creativity Assessment, Human Evaluation, Midjourney, DALL•E 2, Stable Diffusion, Aesthetic Computing, Generative AI, Multimodal Models, Computational Creativity
Abstract
This research examines the creativity of text-to-image (T2I) generation models using a systematic human rating framework that evaluates four dimensions of creativity: originality, relevance, aesthetic appeal, and imaginativeness. The rapid advancement of generative AI tools such as DALL•E 2, Midjourney, and Stable Diffusion makes their creative output difficult to measure, since judgments of creativity are subjective. The evaluation analyses 100 images generated by DALL•E 2, Midjourney, and Stable Diffusion from a wide range of prompts drawn from various artistic domains. The results show that Midjourney produces stronger artistic results than DALL•E 2 and Stable Diffusion. DALL•E 2 stands out for relevance, producing images with very strong semantic alignment to the provided prompts. Stable Diffusion's total creativity score falls below those of its rivals, although the model occasionally shows strengths in originality. High agreement among evaluators supports the quality of the framework. The findings indicate that multiple assessment methods are needed to identify the distinctive creative abilities of T2I models, and they offer guidance for future AI development in creative domains. The research establishes comprehensive evaluation standards that future investigations in creative AI should follow, given the essential need for methodological rigor.
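As a rough illustration of how a structured rating framework of this kind could be tabulated, the following Python sketch averages ratings per dimension, combines them into a single creativity score, and computes a simple agreement figure between raters. The abstract does not state the rating scale, the aggregation rule, or the agreement statistic used in the study, so the unweighted mean, the within-tolerance agreement measure, the function names, and the toy ratings below are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# The four dimensions named in the abstract.
DIMENSIONS = ("originality", "relevance", "aesthetic_appeal", "imaginativeness")


def dimension_means(ratings):
    """Average score per (model, dimension) over all raters and images."""
    buckets = defaultdict(list)
    for r in ratings:
        buckets[(r["model"], r["dimension"])].append(r["score"])
    return {key: mean(vals) for key, vals in buckets.items()}


def creativity_score(ratings, model):
    """Unweighted mean of the four dimension means (an assumed aggregation)."""
    means = dimension_means(ratings)
    return mean(means[(model, d)] for d in DIMENSIONS)


def pairwise_agreement(ratings, tolerance=1):
    """Share of rater pairs whose scores for the same image and dimension
    differ by at most `tolerance` points (an assumed agreement measure)."""
    by_item = defaultdict(dict)
    for r in ratings:
        by_item[(r["image"], r["dimension"])][r["rater"]] = r["score"]
    hits = total = 0
    for scores in by_item.values():
        for a, b in combinations(sorted(scores), 2):
            total += 1
            hits += abs(scores[a] - scores[b]) <= tolerance
    return hits / total if total else float("nan")


# Made-up placeholder ratings for one hypothetical model and image.
# These are NOT data from the study.
toy_scores = {"r1": (4, 5, 3, 4), "r2": (4, 4, 3, 2)}
toy_ratings = [
    {"rater": rater, "image": "img_001", "model": "model_x",
     "dimension": dim, "score": score}
    for rater, scores in toy_scores.items()
    for dim, score in zip(DIMENSIONS, scores)
]

print(creativity_score(toy_ratings, "model_x"))
print(pairwise_agreement(toy_ratings))
```

A chance-corrected statistic would normally be preferred over raw agreement when reporting reliability; the simple ratio here is used only to keep the sketch short.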