Is AI a Cognitive Trojan Horse?
Could on-demand, seductively responsive and highly fluent AI models bypass our "epistemic vigilance" mechanisms, and present a novel cognitive risk?
Back in December, I asked attendees at the OEB25 conference (a global, cross-sector conference on digital learning) “Is AI a cognitive Trojan Horse?”
The question was meant to be a little playful, and to provoke discussion rather than make a point. But it also reflected growing concerns that the ease, speed and fluidity with which AI models provide us with information potentially circumvents our ability to assess and assimilate that information in critical and healthy ways.
This is the “cognitive Trojan Horse” in the question — the idea that emerging AI models are so appealing to us that it’s hard to resist inviting them into our cognitive lives, even though we still don’t know how they might influence our thinking, our beliefs, our perceptions and understanding, and even how we behave.
It’s certainly an uncomfortable idea, and one that I suspect most people would instinctively push back on — especially as we’re increasingly depending on AI in so many different ways, from how we learn and understand the world to how we make decisions, run organizations, and even find companionship.
Yet this is exactly what we would expect a cognitive Trojan Horse to look like — a gift with so much promise and potential that to question its use would seem churlish and backward.
It’s precisely because of this though that I think we should at least be asking questions about the potential unintended cognitive consequences of ubiquitous AI.
Especially if these tools are able to silently slip past the “epistemic vigilance” mechanisms we’ve evolved to protect us against potentially harmful cognitive influences.
Epistemic vigilance
Epistemic vigilance is the process by which we — or more precisely, our cognition — flag and assess communicated information that may lead to us being misinformed or deceived.
The concept was developed and extensively explored in a seminal paper by Dan Sperber and six colleagues in 2010.1 In the paper they argue that “Humans depend massively on communication with others, but this leaves them open to the risk of being accidentally or intentionally misinformed. We claim that humans have a suite of cognitive mechanisms for epistemic vigilance to ensure that communication remains advantageous despite this risk.”2
At the heart of their work is the idea that human-human communication is vitally important for learning from an evolutionary perspective. And because of this, we have evolved mechanisms that are optimized for learning through communication by keeping cognitive overheads as low as possible while keeping learning efficiency as high as possible.
The result is that we default to trusting what we receive when communicating with others. But if anything feels “off,” our epistemic vigilance mechanisms kick in and we begin to critically assess what we are receiving — and reject it if it doesn’t feel trustworthy.
It’s a model that has a lot in common with our immune system — a system that is always on the lookout for potentially harmful agents, but that only kicks in when it encounters something that looks or feels foreign. And of course, it’s a system that viruses are adept at circumventing by appearing to be “friendly” and “trustworthy” when they are, in fact, not.
There are, not surprisingly, many factors that determine when epistemic vigilance kicks in. But a lot of these revolve around our evolved ability to sense when something doesn’t feel trustworthy — the way something is communicated, the tone and nuance of the communication, the body language and micro expressions of the communicator, contextual information around who the communicator is, what their aims are, past experiences, and so on.
Of course, these feelings are, themselves, untrustworthy, as decades of behavioral science and research on cognitive biases have shown. But within the messiness of human society, epistemic vigilance tends to work.
But what if you throw a technology into the mix that upsets the status quo — a metaphorical brand new virus that we haven’t had the chance to adapt to?
This is where we potentially face what’s often referred to as an evolutionary mismatch — a situation where a new technology transcends our evolved abilities to safely and successfully navigate its potential impacts.
Because we are a technological species, and have been for millennia, such mismatches are actually quite commonplace. Well-known examples include mismatches between our evolved risk responses and how we instinctively respond to technologies such as synthetic chemicals, vaccines, and pretty much anything that’s new and novel.
Yet — and this is part of our superpower as humans — we are remarkably good at using our cognitive abilities and intelligence to compensate and adapt to such mismatches, despite not having evolved alongside the risks associated with many of the technologies we encounter in our lives.
But what if the mismatch impacts the very cognitive abilities we rely on to navigate differences between what we experience, and what we’ve evolved to live with?
In effect, what if a new technology — and AI specifically in this case — does not trigger our epistemic vigilance mechanisms in the same ways that human-human communication does, and as a result has the ability to slip past our defenses undetected?
This is not mere speculation. While new research is absolutely needed into the potential for AI to act as a cognitive Trojan Horse by bypassing our epistemic vigilance mechanisms, there are sufficient indicators from associated areas of research that suggest a number of mechanisms by which this might occur.
These include (but are not limited to) processing fluency (our tendency to trust information that is delivered with a high degree of fluency), the role of “attractiveness” in communication (our willingness to trust a source of information that intrinsically appeals to us on multiple levels), speed and volume of information flow (where excessively high rates of information flow potentially overwhelm our epistemic vigilance mechanisms), and what might be termed the “Intelligent User Trap” (where a smart user “knows” they are clever enough not to be fooled).
Processing fluency
Processing fluency refers to the ease, or the effort, that’s associated with mentally processing information. And when it comes to person-person communication, it affects how the person receiving information from someone else determines whether to trust it or not.
In effect, processing fluency forms part of a suite of epistemic vigilance mechanisms.
As Rolf Reber and Christian Unkelbach described it in a 2010 paper on processing fluency and judgments of truth:
“Processing fluency is defined as the subjective experience of ease with which a stimulus is processed. If a person cannot recognize the statement, this experienced ease is taken as information when judging the truth of a statement. If the statement can be processed easily, the person will conclude that the statement is true; if the statement is difficult to process, she concludes that the statement is not true.”3
In other words, communication that is clear, compelling, and takes little effort to understand, tends to be assumed to be true. It doesn’t trigger epistemic vigilance.
And of course, AI apps like ChatGPT, Claude, Perplexity, and others, are supremely adept at creating responses that are clear, compelling, and take little effort to understand. These are models that distill the very best of highly effective human communication into their core, and reflect it in how they engage with users.
In effect, large language model-based AIs are optimized for processing fluency, and as a result are primed to slip by our epistemic vigilance mechanisms.
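As a rough illustration of how this plays out in practice, here is a minimal Python sketch that uses a standard readability metric as a very crude stand-in for processing fluency. It assumes the third-party textstat package, and the two passages are invented examples; readability is, of course, only a loose proxy for the subjective ease of processing that Reber and Unkelbach describe.

```python
# A rough illustration only: readability metrics are a crude, imperfect proxy
# for the subjective experience of processing fluency described by Reber and
# Unkelbach. Assumes the third-party "textstat" package (pip install textstat).
import textstat

dense_passage = (
    "The epistemological ramifications of algorithmically mediated "
    "information dissemination necessitate a recalibration of heuristic "
    "trust allocation mechanisms."
)

fluent_passage = (
    "AI tools hand us answers that are easy to read. Because they are easy "
    "to read, they feel true."
)

for label, text in [("dense", dense_passage), ("fluent", fluent_passage)]:
    # Flesch Reading Ease: higher scores indicate text that is easier to process.
    score = textstat.flesch_reading_ease(text)
    print(f"{label:>6}: Flesch Reading Ease = {score:.1f}")
```

If fluency feels like truth, text tuned to score well on this kind of ease-of-reading dimension starts with a head start on being believed, whatever its actual accuracy.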
Attractiveness
Beyond processing fluency, we tend to treat received information as more trustworthy if it comes from someone we like, or who we warm to, or who seems friendly toward us. And this extends to how any communication is crafted and delivered.
Here, there is extensive research showing how someone who is perceived to be warm and competent as a communicator is more likely to engender trust.4 And there are emerging indications that this also applies to how we respond to AI apps.5
It turns out we tend to trust people and AI chatbots more — in other words they are less likely to trigger our epistemic vigilance mechanisms — if they are perceived to be warm and competent.
And as most AI platforms are exquisitely good at this as a result of how they work and how they’ve been trained, there is a tendency to trust them — even when we’re warned not to.
But reading across multiple fields of study, my sense is that there’s more to this than just warmth and competence: some form of “attractiveness” that makes us want to trust the AIs we’re using — a combination of how they engage with us, the character they convey, how empathetic and attentive they seem, and probably a lot more.
These are all characteristics and behaviors that contribute to why we find someone attractive and want to spend time with them — and want to trust them. And there’s growing evidence that AI models are very good indeed at emulating these characteristics and behaviors.
You only need to see the growing popularity of AI companions to get a sense of how easy it is for people to form a very human-like attachment to their AI assistants. And it’s quite startling how many users of platforms like ChatGPT develop a personal and trusting relationship with their AI, even to the extent of naming and gendering it (or in some cases respecting the AI’s own choice of name and gender).
If, as I suspect, there is a multidimensional type of “attractiveness” that AI models are exceptionally good at emulating, this may well be another factor that allows them to slip into our cognitive processes without tripping our epistemic defenses.
Speed and volume
And then there’s the speed with which AI models can package and communicate information, and the sheer volume of information they are able to deliver — all with a high degree of fluency.
We’ve evolved as a species to handle a relatively slow rate of information delivery via various forms of communication — not just the speed with which words are delivered to us, but the speed with which ideas, concepts, analysis, and perspectives are delivered.
Modern communication media have, of course, accelerated this a little, although we are still bandwidth-limited by our cognitive ability to absorb information.
But what if we had the means to package new information in such a way that even the most complex of ideas slipped into our minds like a freshly shucked oyster slipping down our throat, bypassing the need to think hard about them?
To an extent, this is what we’re beginning to see with emerging AI apps. And it results from a combination of fluency, attractiveness, and an ability to research and synthesize information at a scale and speed that lies far beyond mere human capabilities.
This is part and parcel of a growing trend in cognitive offloading, where users will literally “offload” thinking and research tasks to AI bots, and then assimilate the resulting compressed information. And it’s easy to see why the trend exists: if you can offload every question, idea, or thought onto a suite of trusted AI bots and then “upload” their fluent and “attractive” summaries, why would you not use this cognitive superpower to your advantage?
And yet, research is already indicating that cognitive offloading can reduce critical thinking.6
To make things more complicated, cognitive offloading is highly scalable. Why use one session with ChatGPT when you can simultaneously be asking questions within multiple sessions? Why just use ChatGPT when you can have an army of AI engines all working for you simultaneously from Anthropic, Google, Meta, and beyond? And why limit yourself to just dipping into your extended AI mind occasionally when you can have these AI analysts and advisors on hand 24/7?
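To make the scalability point concrete, here is a minimal sketch of that kind of fan-out. The ask() function is a hypothetical placeholder, standing in for whatever client library each provider actually offers:

```python
# A minimal sketch of how easily cognitive offloading scales: the same question
# fanned out to several assistants at once. The ask() function is a hypothetical
# placeholder; in practice it would wrap each provider's own client library.
from concurrent.futures import ThreadPoolExecutor


def ask(provider: str, question: str) -> str:
    """Hypothetical stand-in for a call to a hosted AI model."""
    # A real implementation would call the provider's API and return its text.
    return f"[{provider}] fluent, confident answer to: {question!r}"


question = "Summarize the main critiques of my business plan."
providers = ["openai", "anthropic", "google", "meta"]

# Four parallel "analysts", each returning a polished summary in seconds,
# far faster than any of those summaries could be critically evaluated.
with ThreadPoolExecutor(max_workers=len(providers)) as pool:
    answers = list(pool.map(lambda p: ask(p, question), providers))

for answer in answers:
    print(answer)
```

The point is not the code itself, but how little effort it takes to multiply the flow of fluent, confident answers coming back at you.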
In effect, the rate at which we are now able to receive the most informative, attractive, fluent communications from AI is only limited by our choices around when and where we use it. And in a world where we are being told that it’s the AI-augmented that will inherit the earth, the temptation is to go full-on artificial intelligence.
The only problem is that it’s doubtful that our epistemic vigilance mechanisms are up to the task of coping with the resulting flow of information — and this is likely tied to the observed reduction in critical thinking with cognitive offloading.7
Epistemic vigilance is a costly cognitive process. It requires holding information in working memory while evaluating it, generating alternative hypotheses, checking what we’re receiving against what we know (or believe), assessing source characteristics, and much more. And if the flow of incoming information exceeds our capacity to do this, it potentially forces an incredibly tough choice on us: throttle the flow and give up the promised benefits, or go with the flow and give up our cognitive checks and balances.
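A toy back-of-the-envelope calculation, using entirely made-up numbers, shows how quickly that choice can be forced:

```python
# A toy back-of-envelope model of the throttle-or-trust dilemma. All numbers
# are illustrative assumptions, not empirical estimates.
claims_received_per_hour = 120    # distinct claims arriving in AI-assisted work
seconds_to_vigilantly_check = 90  # time to weigh one claim against what we know

checking_capacity_per_hour = 3600 / seconds_to_vigilantly_check  # = 40 claims
unchecked = max(0, claims_received_per_hour - checking_capacity_per_hour)

print(f"Claims we can critically evaluate per hour: {checking_capacity_per_hour:.0f}")
print(f"Claims accepted without evaluation: {unchecked:.0f} "
      f"({unchecked / claims_received_per_hour:.0%} of the incoming flow)")
```

With these (purely illustrative) assumptions, two thirds of the incoming claims would be absorbed without ever being critically checked.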
Of course AI makes the choice easier by making the seeming benefits feel seductively compelling — further fooling our epistemic vigilance defenses.
The intelligent user trap
Finally — at least in this limited list — is the challenge of the “intelligent user trap.”
This is somewhat speculative, although there is evidence to support it — including work from Dan Kahan and colleagues indicating that more numerate and scientifically literate individuals can be even more adept at using their reasoning to justify beliefs that are not supported by evidence.8
The theory goes that more intelligent users tend to be more curious (and so get a bigger “hit” from new information); they tend to process information faster, and so are less attuned to the dangers of speed and volume overload; they trust their judgment, and so are less likely to question it; and they (at least in some cases) value efficiency, and so are less likely to slow the rate of information being received.
They also tend to have an outsized ability to use their intelligence to justify their beliefs and actions — which brings us back to Kahan’s work.
In other words, the very cognitive capacities that make them "smart" also make them better receivers of the AI's output stream — and worse evaluators of it.
Another potential epistemic vigilance suppressor, in other words.
So should we be worried?
So, is AI a cognitive Trojan Horse, or could it turn out to be?
This is an admittedly limited analysis, and there’s clearly a need for a lot more research here. At the same time it’s telling that a search for peer-reviewed papers on epistemic vigilance and AI returns (as of writing) only seven papers in the Scopus database, and a couple more on preprint archives like arXiv. And a similar search on AI and the concept of a cognitive Trojan Horse returns no papers at all.
And yet the science behind factors that may reduce, or even completely bypass, the effectiveness of our epistemic defenses is there. And in many cases, emerging AI tools and platforms are showing capabilities that align with many of these factors.
As a result, there’s a chance that we may be developing technologies that we do not have the cognitive defense mechanisms to resist, and that we are cognitively predisposed to trust.
Of course, there’s also the possibility that we have all of the cognitive abilities we need to use AI wisely and effectively. And I suspect that skeptical readers will already be thinking: "But I know I'm talking to a machine, so my vigilance is already up."
However, research actually suggests the opposite — that anthropomorphic fluency (the ability of AI apps to emulate the best human you’ve ever met!) triggers social cognition circuits regardless of explicit awareness. And the more human-like the interaction feels, the more trust resilience it generates.9
And even if there’s only a small chance that we are encouraging people to incorporate technologies into their lives that could have far-reaching cognitive implications, surely we should be asking critical questions about potential risks, and carrying out research to better understand and navigate these risks.
Unless, that is, the AI cognitive Trojan horse has already delivered its payload, and everyone’s too enamored by the promise of AI as a result to even think about the potential downsides …
1. Sperber, D., F. Clément, C. Heintz, O. Mascaro, H. Mercier, G. Origgi and D. Wilson (2010). “Epistemic vigilance.” Mind and Language 25(4): 359-393. https://dan.sperber.fr/wp-content/uploads/EpistemicVigilance.pdf
2. There’s a small but rapidly growing literature around AI and epistemic vigilance. See for instance Galindez-Acosta, J. S. and J. J. Giraldo-Huertas (2025). Trust in AI emerges from distrust in humans: A machine learning study on decision-making guidance. https://doi.org/10.48550/arXiv.2511.16769
3. Reber, R. and C. Unkelbach (2010). “The Epistemic Status of Processing Fluency as Source for Judgments of Truth.” Review of Philosophy and Psychology 1(4): 563-581. https://doi.org/10.1007/s13164-010-0039-7
4. See for instance Fiske, S. T., A. J. C. Cuddy and P. Glick (2007). “Universal dimensions of social cognition: warmth and competence.” Trends in Cognitive Sciences 11(2): 77-83. https://doi.org/10.1016/j.tics.2006.11.005
5. Here the literature is evolving and a little dispersed, but a useful starting point is Hernandez, I. and A. Chekili (2024). “The silicon service spectrum: warmth and competence explain people’s preferences for AI assistants.” Frontiers in Social Psychology 2. https://doi.org/10.3389/frsps.2024.1396533
6. For instance, see Gerlich, M. (2025). “AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking.” Societies 15(1). https://doi.org/10.3390/soc15010006
7. It’s worth noting here that research does not show a general causative link between cognitive offloading and reduced critical thinking, and it is likely that there are use cases where it’s possible to offload and continue to assess received information critically. But intuitively it’s easy to imagine a tradeoff between volume of information and critical assessment — especially when that information is designed to be consumed easily and fast.
8. See, for instance, Kahan, D. M., E. Peters, E. C. Dawson and P. Slovic (2017). “Motivated numeracy and enlightened self-government.” Behavioural Public Policy 1(1): 54-86. https://doi.org/10.1017/bpp.2016.2 and Kahan, D. M., E. Peters, M. Wittlin, P. Slovic, L. L. Ouellette, D. Braman and G. Mandel (2012). “The polarizing impact of science literacy and numeracy on perceived climate change risks.” Nature Climate Change 2: 732-735. https://doi.org/10.1038/nclimate1547
9. See, for instance, de Visser, E. J., S. S. Monfort, R. McKendrick, M. A. B. Smith, P. E. McKnight, F. Krueger and R. Parasuraman (2016). “Almost human: Anthropomorphism increases trust resilience in cognitive agents.” Journal of Experimental Psychology: Applied 22(3): 331-349. https://doi.org/10.1037/xap0000092


