Auto Research
An autonomous agent loop that improves a system without human involvement: define an objective, a metric, and boundaries — then let agents run indefinitely.
Last updated: 2026-04-13
Overview
Auto research is Karpathy’s term for removing the researcher from the research loop entirely. The insight: humans are the bottleneck. Traditional research means a person runs experiments, reads results, adjusts hypotheses, then runs again. Auto research replaces that cycle with an agent that loops on its own — no human in the middle.
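The cycle is simple enough to sketch. A minimal version in Python, where `propose`, `evaluate`, and `commit` are hypothetical stand-ins for an agent's actual tools (editing code, launching a training run, committing the diff); none of these names come from the talk:

```python
def auto_research(propose, evaluate, commit, baseline, budget):
    """Minimal auto-research loop: propose a change, measure the metric,
    keep the change only if it beats the best score so far."""
    best = baseline
    for _ in range(budget):
        change = propose()        # agent drafts a candidate tweak
        score = evaluate(change)  # run the experiment, read the metric
        if score < best:          # lower validation loss is better
            best = score
            commit(change)        # keep it; future proposals build on it
    return best

# Toy usage: "research" one knob by random perturbation.
import random
random.seed(0)
knob = [1.0]
propose = lambda: knob[0] + random.uniform(-0.1, 0.1)
evaluate = lambda x: (x - 0.5) ** 2            # pretend validation loss
commit = lambda x: knob.__setitem__(0, x)
best = auto_research(propose, evaluate, commit, evaluate(knob[0]), 200)
```

The point of the sketch is the shape, not the search strategy: no human reads the results between iterations.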
The proof of concept: Karpathy ran auto research overnight on his nanoGPT project, a repo he’d already tuned manually for years. It came back with improvements he’d missed: weight decay had never been applied to the value embeddings, and the Adam betas were under-tuned. Critically, these interact: fixing one shifts the optimal setting for the other. A human running sequential experiments would likely never have found the interaction.
The principle generalizes beyond model training. Any domain with a verifiable output metric is a candidate: compiler optimization, CUDA kernel performance, infrastructure configuration, search ranking.
The Prerequisite: Verifiable Metrics
Auto research only works when you have objective, easy-to-evaluate metrics.
Writing more efficient CUDA kernels is a perfect fit: you have a slow implementation and want a faster one with identical behavior, a check that is binary, measurable, and automatable. Hyperparameter tuning against validation loss fits just as well.
If you can’t evaluate, you can’t auto research it. Anything softer — nuance of intent, aesthetic quality, judgment without ground truth — is outside the loop. This is the same constraint frontier labs face in RL training: models improve on verifiable domains and stagnate on unverifiable ones.
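For the kernel case, "identical behavior, faster" reduces to a mechanical check. A sketch of such a verifier, assuming pure Python functions as stand-ins for kernels (real kernel benchmarking would need warmup runs, many trials, and device synchronization):

```python
import time

def verify_candidate(baseline_fn, candidate_fn, inputs, tol=1e-6, trials=10):
    """Accept a candidate only if it matches the baseline's outputs on every
    input AND runs faster overall: the binary, automatable check auto
    research needs."""
    # Identical behavior: outputs must agree within tolerance.
    for x in inputs:
        if abs(baseline_fn(x) - candidate_fn(x)) > tol:
            return False
    # Measurable improvement: candidate must beat the baseline's wall time.
    def wall_time(fn):
        start = time.perf_counter()
        for _ in range(trials):
            for x in inputs:
                fn(x)
        return time.perf_counter() - start
    return wall_time(candidate_fn) < wall_time(baseline_fn)

# Toy usage: a Python-level loop as the "slow kernel", builtin sum as the fast one.
def loop_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total

accepted = verify_candidate(loop_sum, sum, [list(range(50_000))])
```

A candidate that is fast but wrong, or correct but slow, is rejected; there is no judgment call anywhere in the function.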
Program.MD: The Research Organization as Markdown
Karpathy’s name for the markdown file that describes how auto research should run is the program.MD. It contains: objectives, what kinds of ideas to explore (architecture, optimizer, data pipeline, etc.), and constraints on what’s allowed.
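No canonical program.MD is shown in the talk. A hypothetical minimal sketch, covering just the three ingredients listed above, might look like:

```markdown
# program.MD — nanoGPT auto research (hypothetical sketch)

## Objective
Minimize validation loss on the nanoGPT training run within a 12-hour budget.

## Idea space
- Optimizer settings (learning rate schedule, Adam betas, weight decay scope)
- Architecture tweaks (embeddings, norms, attention variants)
- Data pipeline (batching, shuffling, tokenization)

## Constraints
- Never modify the evaluation code or the validation split.
- Every change lands as a commit with the measured metric in the message.
- Revert any change that regresses the best known validation loss.
```

The constraint protecting the evaluation code matters most: an agent free to edit its own metric can trivially "improve" it.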
The key framing: “A research organization is a set of markdown files that describe all the roles and how the whole thing connects.” Different program.MDs = different research organizations with different risk tolerances, different stand-up frequencies, different exploration strategies. You can compare their improvement rates empirically.
The meta-layer: optimize the program.MD itself. Run a contest where people write different program.MDs for the same hardware budget — measure which gets the most improvement — then feed the data back to a model to write a better program.MD. Auto-researching the auto-researcher.
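The contest is just another loop one level up. A sketch, where `run_auto_research(program_text, budget)` is a hypothetical stand-in for a full auto-research run driven by the given program.MD and returning the improvement achieved:

```python
def best_program(programs, budget, run_auto_research):
    """Meta-level loop: run each candidate program.MD on the same compute
    budget and keep the one whose research org improved the metric most."""
    scores = {name: run_auto_research(text, budget)
              for name, text in programs.items()}
    winner = max(scores, key=scores.get)
    return winner, scores   # scores are the data you feed back to a model

# Toy usage: stub out the expensive inner run with a fake scorer.
fake_run = lambda text, budget: len(text)   # stand-in improvement score
winner, scores = best_program({"a": "# short", "b": "# longer spec"}, 1, fake_run)
```

In practice each inner run is stochastic, so a fair contest would average several runs per program.MD; the sketch omits that.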
Distributed Auto Research (SETI@home Pattern)
The single-loop version parallelizes naturally. The more interesting extension is accepting contributions from untrusted parties on the internet.
The structural property that makes this viable: cheap to verify, expensive to find. Any contributor can claim their commit improves validation loss — and you can check by just running it. The contributor ran 10,000 experiments; you run one to verify. This is the same property behind SETI@home (candidate signal detections) and Folding@home (protein folding results).
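The asymmetry can be made concrete: acceptance costs one evaluation no matter how much search the contributor spent finding the commit. A sketch, with `run_validation` as a hypothetical stand-in for training plus evaluation on the held-out split:

```python
def verify_claim(repo_state, commit, claimed_loss, run_validation, tol=1e-4):
    """Trusted verifier: re-run ONE experiment to check a contributor's claim.
    The contributor may have burned 10,000 runs finding this commit;
    verification costs a single run."""
    measured = run_validation(repo_state, commit)
    return measured <= claimed_loss + tol   # accept only if the claim holds up

# Toy usage: a stub validation run that always measures loss 1.0.
honest = verify_claim(None, "abc123", claimed_loss=1.0,
                      run_validation=lambda state, commit: 1.0)
inflated = verify_claim(None, "abc123", claimed_loss=0.5,
                        run_validation=lambda state, commit: 1.0)
```

The tolerance term matters in practice: training runs are noisy, so a real verifier would average a few seeds rather than trust one measurement.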
Implications:
- An untrusted pool of workers can safely collaborate with a trusted verification pool
- The structure is blockchain-adjacent: commits build on each other, proof of work = experimentation, leaderboard is the reward
- A swarm of internet agents could potentially run circles around a frontier lab on problems with clear metrics — the Earth has far more untrusted compute than any single lab
- People and orgs can donate compute to causes: join the auto-research forum for cancer drug discovery instead of just donating money
Security caveat: running arbitrary code from untrusted contributors is risky. Systems need sandboxing and careful scope constraints before this can work safely at scale.
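Even a minimal harness should never execute contributed code in-process. A rough isolation-lite sketch using a subprocess with a hard timeout and a stripped environment — emphatically not a real sandbox, which would add containers, syscall filtering, resource limits, and no network:

```python
import os
import subprocess
import sys
import tempfile

def run_untrusted(code: str, timeout_s: int = 10) -> str:
    """Run contributed code in a separate interpreter with a wall-clock
    timeout and no inherited environment. Isolation-lite only: production
    use needs containers, seccomp-style filtering, and network isolation."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, "-I", path],   # -I: isolated mode, ignores env/site
            capture_output=True, text=True,
            timeout=timeout_s, env={},      # no inherited environment variables
        )
        return result.stdout
    finally:
        os.unlink(path)

output = run_untrusted("print(2 + 2)")
```

`subprocess.run` raises `TimeoutExpired` when the budget is exceeded, which the harness would catch and score as a failed contribution.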
The Loopy Era
Karpathy describes AI capability as an onion where each layer gets “taken for granted” before you move up:
- LLM — taken for granted
- Single agent session — taken for granted
- Claw-like persistent loops (running on your behalf when you’re not looking) — currently being taken for granted
- Multiple agents collaborating
- Instructions to agents (program.MD, AGENTS.md)
- Optimization over the instructions (meta-level tuning)
At each layer, when things don’t work, it “feels like skill issue” — bad instructions, missing memory tool, wrong scope. This is empowering because it means improvement is always available. The psychosis comes from realizing the onion has no bottom: “this is like infinite and everything is skill issue.”
Connections
- coding-agent — the harness auto research runs on; macro actions are its interface
- agent-evaluation — verifiable metrics are the prerequisite; model jaggedness determines what can be auto-researched
- thin-harness-fat-skills — program.MD is the research-org equivalent of skill files: policy in markdown, not code
- agentic-engineering — orchestration and guardrails for running agent loops safely
- andrej-karpathy — originated auto research; demonstrated it on nanoGPT
Sources
- Skill Issue: Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI — No Priors podcast, added 2026-04-13