Evaluating Effectiveness of AI-Generated Photos: A Pre-Study

This pre-study examines the effectiveness of AI-generated images compared to traditional stock photos within educational contexts. Using AI models such as DALL-E 3 and Stable Diffusion, it evaluates the realism, aesthetic appeal, and educational value of generated images. Findings suggest that AI-generated images could transform educational materials by offering customisable, relevant, and technically precise visual aids that enrich learning experiences.

7 November 2024 5-minute read

Introduction

In the digital era, imagery is a critical component of educational materials, and it has traditionally been dominated by stock photos. AI image generation tools like DALL-E and Stable Diffusion open up a new approach to image creation, bringing more choice and customisation. This pre-study investigates the effectiveness of AI-generated images against traditional stock photos, particularly in an educational context.

Research Context

While AI's role in image production from a technological perspective has already been studied, its practical application in education remains underexplored. This study addresses this gap by assessing the effectiveness of AI-generated images as a possible replacement or complement to traditional stock photos in educational settings.

Methodology

Model and Prompt Selection

We selected prominent AI models, including DALL-E 3 and various configurations of Stable Diffusion, tasked with creating images for ten distinct educational scenes:

  • DALL-E 3 (basic and advanced prompt)
  • Stable Diffusion Juggernaut Lightning (advanced prompt)
  • Stable Diffusion 3.5 Large (advanced prompt)

Example Advanced Prompt

A realistic, high-quality image of a smaller AI literacy workshop setting. The scene is set in a cosy, well-lit room with a contemporary design, featuring a diverse group of four adult participants, two men and two women of different ethnicities, each working intently on their laptops. A middle-aged man, acting as the instructor, stands near a digital display showing engaging but unlabelled AI graphics. The room is equipped with modern tech but feels intimate and less crowded, with natural light streaming in through large windows.
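
The model and prompt selection above can be read as a small generation grid. The following is a minimal sketch, assuming every scene was rendered with each of the four model/prompt combinations; the scene identifiers and the `build_jobs` helper are hypothetical and not taken from the study.

```python
from itertools import product

# Model/prompt combinations as listed in the study.
COMBINATIONS = [
    ("DALL-E 3", "basic"),
    ("DALL-E 3", "advanced"),
    ("Stable Diffusion Juggernaut Lightning", "advanced"),
    ("Stable Diffusion 3.5 Large", "advanced"),
]

# Hypothetical identifiers for the ten educational scenes.
SCENES = [f"scene_{i:02d}" for i in range(1, 11)]


def build_jobs(combinations, scenes):
    """Return one job description per (scene, model, prompt style) triple."""
    return [
        {"scene": scene, "model": model, "prompt_style": style}
        for scene, (model, style) in product(scenes, combinations)
    ]


jobs = build_jobs(COMBINATIONS, SCENES)
print(len(jobs))  # 10 scenes x 4 combinations = 40
```

Listing the jobs explicitly makes it easy to check that no scene or combination is missing before any image is generated.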

Image Evaluation Criteria

Images were evaluated on realism, relevance, aesthetic quality, diversity, potential for educational use, technical accuracy, scalability, customisability, contextual appropriateness, ethical considerations, and legal compliance.

Evaluation Methods

The study employed quantitative scoring, focus groups, and reverse image searches to assess the images' uniqueness and practicality.
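
The study does not describe its scoring scheme in detail. As a rough sketch, assuming each rater gives a numeric score per criterion and per model/prompt combination, the quantitative part could be summarised as mean scores. The `mean_scores` helper, the rating scale, and the example values are illustrative assumptions, not data from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_scores(ratings):
    """Average the scores for each (combination, criterion) pair.

    `ratings` is an iterable of (combination, criterion, score) tuples.
    """
    buckets = defaultdict(list)
    for combination, criterion, score in ratings:
        buckets[(combination, criterion)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


# Placeholder ratings for illustration only; these are NOT study results.
example_ratings = [
    ("model_a", "realism", 4),
    ("model_a", "realism", 5),
    ("model_a", "relevance", 3),
    ("model_b", "realism", 2),
]

summary = mean_scores(example_ratings)
for (combination, criterion), value in sorted(summary.items()):
    print(f"{combination:8s} {criterion:10s} {value:.2f}")
```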

Results

The results of our pre-study show that AI-generated images can provide realistic, relevant, and aesthetically appealing content for illustrating educational situations.

Figure 1. Results for one scene across the four model/prompt combinations.

Realism and Technical Accuracy

AI-generated images produced by DALL-E 3 and especially Stable Diffusion showed a degree of photorealism that often comes close to traditional stock photos. They were particularly effective in depicting complex settings that mimicked real-world environments. Technical accuracy was notable, especially with advanced prompts that required detailed technological setups or scientific equipment. However, several iterations were sometimes needed to correct errors such as floating hands or people typing with no laptop in front of them.

Relevance to the Prompt

The relevance of AI-generated images to the provided prompts was exemplary. Images generated from advanced prompts showed a robust alignment with the intended educational content, effectively illustrating both abstract concepts and specific scenarios detailed in the prompts.

Figure 2. Results for one scene across the four model/prompt combinations.

Differences in Model Capabilities

The choice of model (DALL-E 3 vs. Stable Diffusion versions like Juggernaut Lightning and SD 3.5 Large) plays a significant role in the quality and uniqueness of the generated images. Stable Diffusion's advanced versions, optimised for detail and realism, tend to generate images that are already well-suited for professional use.

Aesthetic Quality and Customisability

Visually, AI-generated images adhered to a high standard of aesthetic quality, with clean, professional compositions that would be at home in textbooks or digital course materials. The ability to customise these images was a significant advantage over traditional stock photos. AI models could adjust elements like ethnicity, setting, and age to better reflect the diversity of the student population and the specifics of the educational content.
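
One way to make this kind of customisation repeatable is to keep the adjustable elements separate from the fixed wording of the prompt. The sketch below is an assumption about how such a template might look; the `SceneSpec` fields and the `build_prompt` function are hypothetical, and the wording is loosely modelled on the example advanced prompt.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SceneSpec:
    """Adjustable elements of a scene (hypothetical field names)."""
    setting: str
    participants: str
    instructor: str


def build_prompt(spec: SceneSpec) -> str:
    """Fill the adjustable elements into a fixed prompt skeleton."""
    return (
        f"A realistic, high-quality image of {spec.setting}, "
        f"featuring {spec.participants}. "
        f"{spec.instructor}, acting as the instructor, stands near a "
        "digital display showing engaging but unlabelled AI graphics."
    )


base = SceneSpec(
    setting="a smaller AI literacy workshop in a cosy, well-lit room",
    participants="a diverse group of four adult participants",
    instructor="A middle-aged man",
)

# Change a single element while keeping the rest of the scene intact.
variant = replace(base, instructor="A young woman")

print(build_prompt(base))
print(build_prompt(variant))
```

Because only one field changes between `base` and `variant`, any difference in the resulting images can be attributed more easily to that element.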

Potential for Educational Use

The images' scalability and customisation potential confirmed their suitability for broad educational use. The ability to generate unique images tailored to course content allows for a more dynamic and engaging learning experience.

Figure 3. Results for one scene across the four model/prompt combinations.

Legal and Licensing Compliance

The AI-generated images and the tools used to create them complied with current copyright and licensing standards, making them a safe choice for educational use. However, it remains vital to ensure that all elements within the images are appropriately licensed and do not infringe on copyright protections, especially when these images are used in publicly available educational materials.

Reverse Image Search Findings

Reverse image search tests showed that while AI-generated images were unique, they occasionally bore similarities to existing images found online, particularly images generated from less detailed prompts and by DALL-E. This highlighted the importance of using detailed and unique prompts to generate truly unique images that do not unintentionally mimic copyrighted visuals.

Figure 4. Reverse image searches of DALL-E images often reveal many similar visuals on the web.
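
The study relied on reverse image search and does not name a similarity measure. As a rough local illustration of what "similar visuals" can mean, the sketch below uses a simple average hash with a Hamming distance. This is a swapped-in technique, not the study's method, and it assumes each image has already been reduced to a small grid of grayscale values; the tiny 2x2 grids are made-up inputs.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    threshold = sum(flat) / len(flat)
    return [1 if value > threshold else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Made-up grayscale grids (0-255); real use would downscale actual images.
image_a = [[10, 200], [220, 30]]
image_b = [[12, 190], [210, 40]]   # nearly the same brightness pattern
image_c = [[200, 10], [30, 220]]   # inverted pattern

print(hamming_distance(average_hash(image_a), average_hash(image_b)))  # 0
print(hamming_distance(average_hash(image_a), average_hash(image_c)))  # 4
```

A small distance suggests two images share a similar brightness layout; it says nothing about copyright status, which still requires human review.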

Discussion

This pre-study indicates that the potential of AI-generated graphics in educational settings primarily lies in their adaptability and relevance, though this relies on carefully crafted prompts. Future research should evaluate a broader range of image generation models and further investigate prompt engineering to enhance the educational value of AI-generated images.

Conclusion

AI-generated images hold considerable promise for enhancing educational materials by providing customisable and relevant content. Implementing them effectively requires careful attention and continuous refinement to avoid legal and bias-related pitfalls.
