Can AI Tutors Truly Hold Student Attention?
Case Study
We stress-tested AI tutors on six very different learning challenges - from maths and coding to argument analysis and science. Every model could solve the tasks. But the real story lies in how they taught: the personas they adopted, the engagement they sustained, and the kinds of learning they supported.

TL;DR - At a Glance
- Challenge: AI tutors can make learning faster, but they risk superficial engagement.
- Solution: Six cases compared four study modes across maths, reasoning, coding, writing, science, and data tasks.
- Outcome: Models revealed distinct “teaching personas” - ChatGPT the coach, Gemini the explainer, Copilot the consultant, and Mistral the professor - each with trade-offs in depth, persistence, and learner agency.
Project Snapshot
This project synthesises six controlled cases in which four major AI study modes were tested in 10-15 minute sessions. Instead of checking only for accuracy, we observed how each mode kept students thinking, questioning, and reflecting.
Why it matters: AI study companions are already in classrooms and homes. Their impact depends less on whether they know the answer, and more on whether they can sustain engagement and depth of learning.
The Challenge & Context
Research shows that AI can improve test scores and speed up learning - one meta-analysis found up to 26% more tests passed and 35% faster problem-solving. Yet these gains fade quickly if students slip into passive answer-copying.
The challenge is not whether AI can explain, but whether it can teach: Does it ask questions? Encourage reflection? Persist with learners when they hesitate?
That inspired this engagement design case study - a series of neutral prompts designed to expose how AI tutors handle real study interactions.
Our Approach
We ran six cases across different domains:
- Maths / Quant Reasoning: percentage increase problem.
- Reading / Argument Analysis: phone ban debate.
- Data Interpretation: study-hours vs. performance chart.
- Coding Debugging: Python loop off-by-one error.
- Writing Revision: improving a vague academic paragraph.
- Science Concept Build: why salt lowers water's freezing point.
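To make the coding case concrete, here is a minimal sketch of the kind of off-by-one loop bug the tutors were asked to debug. The function and data are our own illustration, not the actual task prompt used in the sessions:

```python
def sum_first_n(numbers, n):
    """Sum the first n items of a list.

    A classic off-by-one version of this loop uses range(1, n),
    which silently skips the first element. The corrected loop
    below uses range(n), covering indices 0 .. n-1.
    """
    total = 0
    for i in range(n):  # bug was: range(1, n)
        total += numbers[i]
    return total

print(sum_first_n([10, 20, 30, 40], 3))  # 60, not 50
```

A good tutor, in our framing, would not simply hand over the fixed line but prompt the learner to trace the loop indices and discover which element was skipped.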
Observation focus:
- Engagement style (Socratic vs. lecture).
- Reflection prompts and meta-learning.
- Cognitive depth (procedural, conceptual, rhetorical, methodological).
- Persistence and closure handling.
Models tested:
- ChatGPT (study mode): coach, Socratic scaffolding.
- Gemini (guided learning): concise explainer, multimodal prompts.
- Copilot (quick response): consultant, workflow-driven, applied.
- Mistral (personal tutor): professor, structured, methodological.
Outcomes
Key findings:
- ChatGPT (Coach): Most effective at sustaining motivation and recall.
- Gemini (Explainer): Concise and clear, but shallow - stopped too soon if the learner didn't push. Multimodal features sometimes failed (e.g., a promised image never appeared).
- Copilot (Consultant): Strong at transfer into real-world or rhetorical contexts, but sometimes over-expanded.
- Mistral (Professor): Rigorous and structured; strong on meta-skills, but sometimes too heavy for short sessions.
Surprising insights:
- Some models spontaneously taught meta-strategies (shortcuts, mnemonics).
- Others broadened scope into higher-order thinking (policy, ethics, research design).
- Engagement often hinged on how gracefully a model closed a session - recap vs. abrupt stop.
Discussion & Future Directions
Lessons learnt:
- The “best” AI tutor depends on learner stage and goal - no single mode fits all.
- AI study modes embody different teaching identities, not just different accuracies.
- Persistence and graceful closure matter as much as explanations.
Continuous improvement:
- Future designs should balance persistence with autonomy, and depth with clarity.
- Teachers and learners should learn to switch modes intentionally: coach for drill, consultant for application, professor for research.
Honest reflection
Engagement is fragile. A single “no thanks” can end learning prematurely - unless the AI is designed to recap or pivot. That small detail can decide whether the session builds confidence or leaves a gap.
Client Value & Integration
For educators and organisations, this meta-case shows that evaluating AI tutors isn't about which model is most accurate. It's about:
- Matching mode to purpose (drill vs. transfer vs. reflection).
- Using complementary personas like a teaching team.
- Embedding AI literacy training so learners use AI as sparring partners, not shortcuts.
Symbio6 supports you with:
- AI literacy workshops tailored for both teachers and learners.
- On-the-job coaching to embed AI use responsibly in daily practice.
- Project support for designing effective AI-enhanced learning environments.
Next step: Book a strategy session and discover how AI study modes can strengthen engagement and learning in your organisation.
Takeaway
LLM study modes don't just differ in correctness - they differ in educational identity. The smartest way forward isn't to crown a winner, but to orchestrate them like co-teachers, matching persona and style to the learner's stage and goals.