Evaluating Generative AI in Historical Research: A Comparative Study on Identifying Primary Source Evidence in Ancient History

Authors

DOI:

https://doi.org/10.64946/aiantiquity.v2i1.003

Keywords:

Primary Sources, Ancient History, Historical Methodology, Generative AI, Humanities

Abstract

This study compares traditional historical methods with generative AI tools in the identification, interpretation, and validation of primary sources in ancient history. Drawing on a dual case study approach (four case studies conducted by human historians and four by AI tools: GPT-4, Claude 2, Gemini, and Perplexity), we evaluate the epistemological strengths and limitations of each method. Using qualitative document analysis, historiographical criteria, and expert review, the study assesses source criticism, genre classification, provenance transparency, and evidentiary value. Results indicate that generative AI excels at broad content discovery and thematic synthesis but struggles with historical genre boundaries, source verification, and manuscript-based scholarship. Human researchers consistently outperform the AI tools in contextual interpretation, critical chronology, and the adjudication of textual authority. We propose a human-in-the-loop framework that combines digital speed with scholarly rigor and advocates model pluralism, temporal prompting, and provenance-first protocols. This integrated methodology ensures that AI contributes meaningfully to digital historiography without compromising historical standards.

Published

2026-02-27

Issue

Section

Articles
