The Future of the Essay in the Age of AI: A Practical Guide

Generative Artificial Intelligence (GenAI) has fundamentally disrupted one of higher education’s most enduring pedagogical tools: the essay. For centuries, the essay has served as both a means of learning and a method of assessment, asking students to demonstrate research skills, critical thinking, argument construction, and disciplinary knowledge through extended written work. The arrival of tools like Claude, Perplexity, and Gemini has created what many perceive as an existential crisis for the essay. If students can generate coherent, well-structured essays in seconds, what future does this assessment form have?

Sarah Midford (2025), writing from the perspective of the History discipline, poses this question directly: Is the essay history?  Rather than declaring the essay obsolete, Midford argues that what has changed is not the essay’s value but our understanding of what we are assessing. When AI can produce work that meets surface-level criteria of a successful essay—coherent arguments, synthesised information, appropriate citations, historiographical engagement—the essay ceases to measure a student’s individual capability in those specific domains. This is not a reason to abandon essays, but rather a compelling reason to redesign what and how we assess (Midford, 2025).

This requires educators to ask fundamental questions: What are we assessing when we assign an essay? Are we measuring students’ ability to produce polished prose, or their capacity to think critically, synthesise sources, construct arguments, grapple with conflicting evidence, and develop disciplinary expertise? If AI can handle surface-level writing tasks, what distinctly human skills must remain at the centre of essay pedagogy? As Corbin, Dawson, and Liu (2025) note, the fundamental issue is not academic integrity or cheating detection but assessment validity—our confidence that a grade reflects a student’s actual learning. If our assessments do not accurately measure what students know and can do, concerns about AI-generated submissions become secondary to the fundamental failure of the assessment itself.

This guide provides practical strategies for educators to audit, redesign, and implement essay assessments that remain valid, equitable, and pedagogically robust in an AI-abundant environment. Drawing on recent scholarship, institutional guidance from Australian universities, and frameworks such as the AI Assessment Scale (AIAS), the guide presents four actionable topics that support a coherent thesis: essays must evolve from products to processes, centring human judgment, disciplinary thinking, and authentic intellectual insight while integrating AI transparently.

1. Purpose first: Reaffirm the essay as a mode of learning and disciplinary practice

The challenge

Midford (2025) observes that History essays have historically served multiple, interlocking pedagogical functions: assessing students’ ability to construct coherent arguments, synthesise complex information from multiple sources, demonstrate critical thinking, analyse primary sources, contextualise historical events, and engage with historiographical debates. A successful History essay weaves together chronology, causation, and context with theory, methods, and analytical interpretation. This is not unique to History; similar integration of multiple cognitive processes characterises essays across the humanities and social sciences.

The problem is that educators have often defaulted to essays as convenient, scalable assessment instruments without interrogating their purpose. Before AI, essays functioned simultaneously as learning tools (helping students synthesise knowledge, interrogate evidence, and develop arguments) and as summative assessment devices (measuring attainment of learning outcomes). AI has exposed a critical flaw in this dual function: if the essay’s purpose is to generate a written artefact, AI can do that efficiently. If the goal of the essay is to promote intellectual development through working with evidence and constructing arguments, then AI serves little purpose unless we explicitly teach students to use it as a learning tool rather than a learning shortcut (Midford, 2025).

The first step in redesigning essays for the AI age is to reaffirm their pedagogical purpose and locate that purpose in the process of writing them. Why are we assigning essays in the first place? Is the goal to develop research skills, critical analysis, argumentation, synthesis, disciplinary knowledge, the ability to grapple with conflicting evidence, or written communication? Once we clarify the purpose, we can design assessments that measure those specific capacities and make the intellectual journey visible.

How-to: Practical actions

Articulate purpose explicitly—and link it to process

In every subject outline and assessment brief, clearly state why the essay is being assigned and what specific learning outcomes it measures. Crucially, include the “invisible” learning: the struggle with evidence, the evolution of thinking, and the grappling with debates. For example: “This essay assesses your ability to synthesise conflicting scholarly perspectives, construct an evidence-based argument, and apply [specific theory] to a real-world case. You will be graded not only on your final argument but on how your thinking evolved as you encountered challenging sources and refined your position.”

Frame prompts that resist generic AI outputs and demand disciplinary thinking

AI can easily answer generic essay prompts that have minimal contextualisation. Instead, design prompts that require integration of course-specific materials, personal reflection, discipline-specific analysis, or application to unfamiliar contexts. For example: “Using three texts from our course readings and two primary sources from [archive], analyse how [specific theory] applies to [current Australian policy issue]. Which reading initially offered the most useful framework? How did your thinking change as you examined primary evidence? What arguments remain contested among historians/scholars?”

Use formative essays to track thinking over time

Rather than assigning a single summative essay, break the task into stages: initial reflection, annotated bibliography, draft argument, peer feedback, revision, and final submission. This approach makes students’ evolving thinking visible and places the development of their learning, not just the finished artefact, at the centre of the essay’s pedagogical value.

Practical example

A history lecturer redesigns their major essay assignment following this model:

  • Week 4: Submit a 300-word reflection on which primary source from the course materials most challenges your initial assumptions about [historical event]. Explain why it challenges you. (No AI, handwritten in class.)
  • Week 7: Submit an annotated bibliography of five scholarly sources, with a 200-word justification for why these sources matter to your argument. Note where you encountered conflicting historiographical positions.
  • Week 10: Submit a full draft with tracked thinking. Include a reflection on how your argument evolved from Week 4 to Week 10: What evidence changed your mind? What remained contested?

This staged approach centres the essay as a learning process—the practice of thinking through evidence and argument.

2. Process over product: Make the learning visible

The challenge

Midford (2025) identifies a critical distinction that reframes the entire conversation: the real value of essay writing lies not in the final product but in the process—the research quest, the grappling with conflicting evidence, the iterative development of arguments, and the intellectual growth that occurs through sustained engagement with historical (or disciplinary) problems. 

Traditional essays are vulnerable to AI because they are submitted as polished final products with little visibility into the process of their creation. If students submit only a final 2000-word essay, academics have no way of knowing whether it reflects the student’s thinking or AI output. More importantly, even if we could detect AI use, the more fundamental problem remains: the student has missed the intellectual struggle that produces learning.

The solution is to make the thinking process visible and to assess it as a core outcome. If we assess the development of ideas, the struggle with conflicting evidence, the evolution of the argument, and the refinement that comes through feedback, we can ensure that students are genuinely engaging with the intellectual work that essays are meant to develop.

How-to: Practical actions

Scaffold assignments to capture the research process

Break the essay into discrete phases with submission checkpoints:

  • Research question development and initial source identification
  • Annotated bibliography (with reflections on conflicting sources)
  • Outline and argument map (showing how evidence supports claims)
  • First draft with marginal notes on areas of uncertainty
  • Peer feedback and revision notes
  • Final submission with a “thinking evolution” reflection (500 words on how the argument may have changed)

Each phase should be assessed formatively or carry a percentage of the overall grade. This iterative approach makes the essay transparent: educators can see that students engaged with the intellectual work rather than outsourced it.

Require AI use logs—if AI is permitted

If students are permitted to use AI, require them to document how they used it: what prompts they entered, what outputs they received, and what they kept, modified, or discarded, and why. This metacognitive documentation shifts the focus from the essay as artefact to the essay as a record of decision-making, judgment, and growth.

Incorporate oral defences or peer discussion

Pair written essays with oral presentations, viva voces, or peer-led discussions where students explain their research process, defend their thesis, and respond to questions about their sources. Midford (2025) suggests that peer and educator discussion allows educators to “interrogate a student’s understanding in real time” and to determine whether a student comprehends their arguments or is merely reproducing AI-generated content. Oral components are inherently AI-resistant and provide insight into the depth of understanding and disciplinary thinking.

Use in-class writing to establish baselines

Begin the semester with an in-class, handwritten or supervised writing task on a related topic. This establishes a baseline of each student’s independent writing ability, making significant divergences in later submissions more visible.

Practical example

A history lecturer redesigns a 2000-word essay on a historical event:

  • Week 3 (10%): In-class response to a primary source (300 words, handwritten). Establishes the student’s baseline analytical voice.
  • Week 5 (10%): Submit research question, potential sources, and a 300-word explanation of what makes this question historically important.
  • Week 8 (10%): Submit draft with marginal annotations showing areas of uncertainty or historiographical debate.
  • Week 11 (50%): Submit final essay with a 500-word reflection on how thinking evolved.
  • Week 12 (20%): 15-minute oral defence in tutorial. Answer questions about historiographical debates, primary source interpretation, and argument choices.

This structure ensures that the essay reflects a sustained process of learning and struggle, not a one-shot final product.

3. Critical AI literacy: Teach the limits, not just the prompts

The challenge

Many students (and academics) view AI as a neutral tool that simply “helps with writing.” However, AI tools have significant limitations: they can generate fabricated citations, reflect biases present in their training data, produce generic writing that lacks depth in specific disciplines, and fail to engage authentically with primary sources or course-specific evidence unless explicitly prompted. Additionally, while AI outputs can mimic analytical thinking, they often lack depth, specificity, and insightful contextualisation.

If students use AI uncritically, they risk submitting work that is factually inaccurate, shallow, or missing the rigour that makes disciplinary writing valuable. The solution is to teach critical AI literacy as an explicit learning outcome: not just how to use AI effectively, but how to evaluate its outputs critically, recognise its limitations, understand its risks, and use it as a learning tool (Midford, 2025; ASPERA, 2025).

How-to: Practical actions

Embed AI literacy into the curriculum with discipline-specific examples

Dedicate at least one tutorial or lecture to discussing AI’s capabilities and limitations in your discipline. Show students examples of AI-generated essays that contain hallucinated references, generic subject matter, unexamined bias, or a lack of specificity. Discuss why these outputs fail disciplinary standards.

Link AI use to disciplinary standards and methods

Show students how generic AI prose fails to meet the specific expectations of your discipline. AI-generated essays may lack proper engagement with primary source analysis, miss nuances of debate, or apply anachronistic frameworks. In law, they may miss precedent. In literature, they may miss textual interpretation. Teach students to recognise these gaps and fill them with their own disciplinary expertise.

Require transparent AI acknowledgment

Use AI acknowledgment templates that ask students to declare how they used AI, what outputs they received, how they modified those outputs, and why they made those choices. This builds a culture of transparency and ensures students think critically about whether and when AI use serves their learning.

Practical example

A history lecturer includes a two-week AI literacy module:

  • Week 1: Students input the course’s major essay question into AI and submit the output. In class, the group analyses the AI essay: Which primary sources are cited correctly? Which are hallucinated? Does the essay engage with debates or apply generic analytical frameworks? What nuances of the historical period does it miss?
  • Week 2: Students revise the AI essay, correcting errors, adding specific primary source analysis, integrating historiographical nuance, and addressing anachronisms. They submit both the original AI output and their revised version with a 500-word reflection on what they learned about AI’s limitations and disciplinary standards.

4. Assessment redesign and implementation

The challenge

The first three topics address the reorientation of essay pedagogy around purpose, process, and critical AI literacy, but putting them into practice requires systematic guidance. Educators need frameworks for redesigning essay assessments so that they remain valid while integrating AI transparently. Without concrete strategies, essays will continue in traditional forms while institutions default to detection tools and restrictions that address symptoms rather than causes.

Assessment validity must precede security concerns. If an assessment does not accurately measure what students know and can do, worrying about AI-generated submissions becomes secondary to the assessment’s fundamental failure (Corbin, Dawson, & Liu, 2025). The real work lies in redesigning what is assessed and how.

The AI Assessment Scale (AIAS) provides a practical framework and reframes assessment in terms of validity and purposeful design. It comprises five non-hierarchical levels: 

  • Level 1: No AI
  • Level 2: AI Planning
  • Level 3: AI Collaboration
  • Level 4: Full AI
  • Level 5: AI Exploration

Critically, the AIAS is not a labelling system retrofitted onto existing assessments; it requires educators to reconsider what, how, and under what conditions they assess learning (Perkins, Furze, Roe, & MacVaugh, 2024).

How-to: Practical actions

Design multi-layered assessments with clear AIAS Levels

Rather than assigning a single essay with ambiguous AI policy, create assessments with clearly delineated stages at different AIAS levels:

Week 4 (Level 1 – No AI): In-class, supervised analytical response (300 words, handwritten) to a primary source. This establishes baseline independent analytical ability.

Week 8 (Level 2 – AI Planning): Students use AI to brainstorm arguments and outline opposing viewpoints. They submit AI-generated outputs, along with a 200-word reflection explaining which ideas they will pursue and why. The assessment focuses on students’ judgment in evaluating AI suggestions.

Week 11 (Level 3 – AI Collaboration): Submit draft essay incorporating AI-assisted feedback and editing. Include an AI usage log documenting prompts, outputs received, and modifications made with justification. Assessment focuses on synthesis, source engagement, and the student’s own voice—visible in their revisions.

Week 13 (Level 1 – No AI): Oral defence (15–20 minutes) where students explain the research process, defend thesis choices, and respond to questions about historiographical debates and primary sources. This oral component is inherently AI-resistant and provides direct evidence of understanding.

This structure ensures the essay reflects genuine learning: students demonstrate independent ability, exercise judgment evaluating AI, synthesise sources critically, and articulate thinking in real-time discussion.

Develop transparent marking rubrics

Assessment rubrics must clearly articulate what is assessed at each level and make transparent what role AI plays. A rubric for Level 3 essays might include:

Criterion: Disciplinary Thinking
  • Level 1 (Baseline): Identifies key debate; poses informed question
  • Level 3 (AI Collaboration): Synthesises conflicting perspectives; integrates multiple sources
  • Level 1 (Oral Defence): Articulates own position; defends interpretation against alternatives

Criterion: Evidence Integration
  • Level 1 (Baseline): Identifies relevant sources; quotes accurately
  • Level 3 (AI Collaboration): Contextualises evidence; explains significance within framework
  • Level 1 (Oral Defence): Analyses nuance; explains why specific evidence supports the argument

Criterion: Critical Engagement with AI
  • Level 1 (Baseline): N/A
  • Level 3 (AI Collaboration): Explains which AI suggestions were adopted, modified, or rejected; justifies choices with disciplinary standards
  • Level 1 (Oral Defence): Reflects on AI limitations; articulates how thinking evolved beyond AI-generated content

Require transparent AI acknowledgment

If students use AI, require them to provide structured documentation:

AI Acknowledgment Statement:

  • Which AI tools were used, and at which stages (brainstorming, outlining, drafting, editing)?
  • For what specific purposes?

AI Usage Log:

  • Prompt entered → AI output received → Decision made (adopted/modified/discarded, with justification)

Reflection on AI Use (200–300 words):

  • How did AI use change your process compared to baseline work?
  • What did you learn about AI’s capabilities and limitations in your discipline?
  • How did AI assist or limit your thinking?

This documentation shifts the question from “Did you use AI?” (which is largely undetectable and incentivises dishonesty) to “How did you use AI, and what did you learn?” It builds transparency and makes the intellectual work visible.

Gateway assessments in foundational courses that establish prerequisite skills should typically be Level 1 (No AI) to ensure authentic capability assessment. High-stakes summative assessments might employ a hybrid model: a Level 1 component (in-class exam), a Level 3 component (open assignment with disclosed AI use), and a Level 1 component again (oral defence).

Institutions should provide clear, consistent guidance on AI use, reducing student anxiety about disclosure and ensuring faculty interpret policies consistently across disciplines.

Conclusion

Essays should assess deeper skills than writing and citation alone, because generative AI can now handle those surface tasks efficiently. Making essays transparent processes in which students show their reasoning and development turns them into valuable learning tools that AI cannot replace. The strategies in this guide outline how educators can transform essay assessment to emphasise genuine intellectual struggle and growth. Essays remain vital in higher education because they foster critical thinking, analysis, and synthesis, skills that remain distinctly human. While AI may support these capacities, it cannot substitute for the thinking itself. Educators must rethink learning goals and design essay assessments that genuinely reveal student learning.

References

Dawson, P., Bearman, M., Dollinger, M., & Boud, D. (2024). Validity matters more than cheating. Assessment & Evaluation in Higher Education, 49(7), 1005–1016. https://doi.org/10.1080/02602938.2024.2386662

Furze, L., Perkins, M., Roe, J., & MacVaugh, J. (2024). The AI Assessment Scale (AIAS) in action: A pilot implementation of GenAI-supported assessment. Australasian Journal of Educational Technology. https://doi.org/10.14742/ajet.9434

Midford, S. (2025). Is the essay history? Rethinking effective assessment in the age of generative AI. History Australia, 1–5. https://doi.org/10.1080/14490854.2025.2570457

Monash University. (2025, February 26). Introducing AI as a collaborator in the writing process. https://teaching-community.monash.edu/introducing-ai-as-a-collaborator-in-the-writing-process/

Perkins, M., Furze, L., Roe, J., & MacVaugh, J. (2024). The Artificial Intelligence Assessment Scale (AIAS): A framework for ethical integration of generative AI in educational assessment. Journal of University Teaching and Learning Practice, 21(6). https://doi.org/10.53761/q3azde36

Roe, J., Furze, L., & Perkins, M. (2025). Digital plastic: A metaphorical framework for critical AI literacy in the multiliteracies era. Pedagogies: An International Journal, 1–15. https://doi.org/10.1080/1554480X.2025.2557491

TEQSA. (2023, November 22). Assessment reform for the age of artificial intelligence. Tertiary Education Quality and Standards Agency. https://www.teqsa.gov.au/guides-resources/resources/corporate-publications/assessment-reform-age-artificial-intelligence

University of Sydney. (2025, September 9). The futures of open student longform writing at the University of Sydney. https://educational-innovation.sydney.edu.au/teaching@sydney/the-futures-of-open-student-longform-writing-at-the-university-of-sydney/

AI Use Statement

This essay, “The Future of the Essay in the Age of AI: A Practical Guide for Australian Higher Education,” was produced using generative AI tools and research platforms in accordance with the Artificial Intelligence Assessment Scale (AIAS) framework (Perkins et al., 2024).

AI use (AIAS Level 2–3):

  • Elicit (https://elicit.com/) was used to search and retrieve relevant peer-reviewed articles on AI in assessment, essay pedagogy, and higher education policy. Elicit’s AI-powered research capabilities helped identify and organise key findings from academic sources on the AIAS framework and assessment reform.
  • Perplexity AI assisted with semantic search and information synthesis across academic literature, institutional guidance on AI in assessment, and contemporary Australian university resources.
  • AI tools supported content organisation, outline development, and preliminary drafting based on specified requirements and source materials.
  • All AI-generated content was critically reviewed, verified against sources, refined for accuracy and discipline-specific relevance, and adapted to meet pedagogical standards for Australian higher education.

What was not AI-generated:

  • Thesis formulation and overall argument structure
  • Selection and sequencing of four key topics
  • Pedagogical rationale, discipline-specific examples, and practical recommendations
  • Critical evaluation and synthesis of sources
  • Final editorial decisions, tone, and expression

Compliance: This use of AI and research platforms aligns with best practice in transparent research and academic writing, demonstrating the critical AI literacy and intentional AI integration that the essay advocates for in educational assessment.
