What we miss when we talk about "AI Harnesses"
AI Harness Engineering is suddenly in vogue. But does the seemingly innocuous "harness" metaphor come with hidden risks?
This past week the idea of an “AI Harness” shifted from being a term used predominantly in AI development circles to something that swept across the web with near-viral intensity.
The concept is relatively intuitive: the term is increasingly used to describe the tools, memory, prompts, guardrails, and other scaffolding that allow ever more powerful AI systems to be “harnessed” and put to good use.
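For readers less familiar with the engineering side of this, the sketch below gives a rough sense of what that scaffolding can look like in code. It is purely illustrative (a hypothetical, stripped-down harness rather than the API of any actual agent SDK), with the model call stubbed out so the example stands on its own:

    # A deliberately minimal, hypothetical sketch of what an agent harness
    # wraps around a model: a prompt, tools, memory, and guardrails.
    # The names are illustrative and the model call is a stub, so the
    # example is self-contained and runnable.

    from dataclasses import dataclass, field
    from typing import Callable

    def stub_model(prompt: str) -> str:
        """Stand-in for a real model API call."""
        return "TOOL:search:harness engineering"

    @dataclass
    class Harness:
        system_prompt: str
        tools: dict[str, Callable[[str], str]]
        memory: list[str] = field(default_factory=list)
        # Guardrails are modeled here as simple predicates over proposed actions.
        guardrails: list[Callable[[str], bool]] = field(default_factory=list)

        def run(self, user_input: str) -> str:
            # Assemble the prompt from system instructions, memory, and input.
            prompt = "\n".join([self.system_prompt, *self.memory, user_input])
            action = stub_model(prompt)

            # Guardrails can veto an action before it is executed.
            if not all(check(action) for check in self.guardrails):
                return "Action blocked by guardrail."

            # Dispatch tool calls of the form "TOOL:<name>:<argument>".
            if action.startswith("TOOL:"):
                _, name, arg = action.split(":", 2)
                result = self.tools[name](arg)
                self.memory.append(f"{name}({arg}) -> {result}")
                return result

            return action

    harness = Harness(
        system_prompt="You are a careful research assistant.",
        tools={"search": lambda q: f"results for '{q}'"},
        guardrails=[lambda action: "delete" not in action],
    )
    print(harness.run("What is harness engineering?"))

Real harnesses are, of course, far more elaborate, but the basic pattern of wrapping prompts, tools, memory, and guardrails around a model is the same.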
The only problem is that words often carry power beyond their intended meaning. And while the idea of harnessing AI makes sense, the speed with which the terminology is being adopted risks locking us into a trajectory with unintended consequences, as it shapes how we think about our relationship with AI, and even its relationship to us.
The AI Harness
The term “harness” had been circulating in one form or another for some time in AI circles. “Test harness” and “evaluation harness” are long-established terms in software engineering, and EleutherAI’s Language Model Evaluation Harness has been a standard tool for testing generative AI models since 2020.
By late 2025, Anthropic was using “harness” to describe agent infrastructure, referring to the Claude Agent Software Development Kit as “a powerful, general-purpose agent harness” in a November 2025 post on effective harnesses for long-running agents.
And in January 2026, Aakash Gupta declared that “2025 was agents. 2026 is agent harnesses,” building on Phil Schmid’s argument that agent harnesses would define the year ahead.
But the crystallizing moment came in early February 2026, when Mitchell Hashimoto — co-founder of HashiCorp and creator of Terraform — published a blog post that gave the practice a name.
He called it “harness engineering.”
Within days, OpenAI published a detailed account of building a million-line codebase with zero manually typed code, titled “Harness engineering: leveraging Codex in an agent-first world.”
And on February 18, Ethan Mollick’s widely read guide to AI both popularized the term and began to normalize it, organizing its entire framework around three concepts: “Models, Apps, and Harnesses.”
What’s in a word?
The speed with which the terms “AI harness” and “harness engineering” have entered the vocabulary of artificial intelligence is perhaps a testament to the need for new ways of describing what’s emerging. And as I said earlier, it makes sense — at least superficially — as a new entry in the evolving lexicon of AI metaphors.
But as with all metaphors, “harness” doesn't just describe something — it also shapes how we think about what's being described. And this one comes with some assumptions that are worth examining.
The term “harnessing” is commonly applied to technologies whose nascent power can be directed toward creating value. But there are dimensions to how the metaphor is applied to frontier AI systems — systems that increasingly display characteristics we associate with understanding, judgment, and even autonomy — that complicate what might appear to be a natural extension of the term.
And, of course, metaphors are never completely neutral.
Metaphors work because they allow us to frame and understand something new in terms we are already familiar with. But as they do, they also constrain and even taint our thinking — enticing us to treat the new as if it were something old, and limiting future possibilities by embedding a priori assumptions into emerging capabilities.
In other words, the words we use reflect how we think about the past, shape how we interpret the present, and influence how we steer and direct the future.
And because of this, it’s worth thinking a little more closely about whether “harness” in the context of AI comes with implications we may want to address sooner rather than later.
What the harness presupposes
I explore this further in a new preprint, which is in a holding pattern with arXiv but can currently be accessed here. It’s worth reading in full, but I did want to pull out some of the main points below.
A harness, in its primary usage, is what you put on a working animal. It directs a powerful entity’s energy toward useful work. It assumes that the entity being harnessed is valuable for its strength but cannot be trusted with its own direction.
The harness is designed by the controller, with the harnessed entity having no say in its design. And critically, a harness is meant to transmit power while preventing unwanted behavior — to deliver capability while maintaining control.
It may be that this framing is irrelevant to the term’s use with respect to AI. At the same time, the term does come with specific embedded assumptions about the relationship between human and AI that are worth making explicit.
First, the harness assumes a clean separation between controller and controlled: the human directs, while the AI executes.
Here, the intelligence that matters — the judgment about what to do and why — resides entirely on the human side. Even in agentic contexts where the AI exercises operational judgment, the harness assumes that the meta-judgment — what the agent should be permitted to decide, and within what bounds — remains firmly human.
In other words, the AI contributes capability, but not understanding.
Second, the harness assumes that capability can be separated from transformation. The goal of the harness is to extract useful work from the model without the user being changed in the process. The user who deploys a well-harnessed AI should, it is assumed, emerge with their task completed and themselves unchanged.
Applying the metaphor here, you’d assume that any alteration to the user is a side effect to be minimized, not a feature of the interaction. And yet, as I am currently exploring in my work (another preprint coming out shortly but available here), we need to be thinking more about the human–AI relationship as one that, by its very nature, influences and changes both the AI and the human in the process.
And third, the harness metaphor reinforces the instrumental framing of AI — a framing whose roots extend to Aristotle’s distinction between physis and techne — and which persists in the contemporary insistence that AI is “just a tool.”
Yet the tool metaphor has been challenged repeatedly as AI systems display increasing autonomy and adaptiveness. Tobias Rees, for instance, characterizes the insistence that AI is “just a tool” as “a nostalgia for human exceptionalism.” And multiple philosophical frameworks — from Verbeek’s technological mediation theory, to Clark and Chalmers’ extended mind thesis — argue that advanced technologies not only serve human purposes but actively reshape the cognitive and experiential landscape within which those purposes are formed.
In other words, as they are “harnessed” they alter the harnesser — a very different dynamic than that presupposed in the early use of the metaphor with AI. And one that, I would argue, is substantially amplified in emerging frontier AI systems.
So where does this leave us?
It may be that the metaphor of the harness is a useful and relatively benign way of wrapping our heads around emerging capabilities.
On the other hand, it may be a metaphor that constrains how our relationship with increasingly powerful AI systems develops, and one that embeds assumptions and biases in our understanding of advanced artificial intelligence that will leave us with serious challenges in the future.
Either way, it seems that some intentionality may be in order before we — to use another metaphor — get stuck in a rut of constrained thinking about AI that will come back to bite us.
At a minimum, I would suggest that an appropriate framing for how we build advanced AI systems should accommodate bidirectionality (the user is also changed), transformation as intrinsic to capability (not a side effect to be prevented), and the possibility that the most consequential effects of human–AI interaction may be invisible from within a paradigm optimized for task performance.
It should also leave room for the possibility that the nature of human–AI relationships may itself evolve in ways that a control-oriented metaphor cannot accommodate. Especially if, as I would argue, we need to be thinking more about working in relationship with emerging AI technologies, rather than approaching them as something to be commanded and controlled.
For more on my exploration of the harness metaphor as applied to AI, check out the preprint here.


