May 29, 2026
Article
HTML vs Markdown for Agent Memory: A Full-Scale Benchmark on Accuracy, Latency and Cost
A few weeks ago, we ran a comparison test between HTML and Markdown on LoCoMo with 603 questions and HTML won.
To check whether that result holds at production scale, we re-ran our agent memory benchmark in full: 1,982 questions across 11 conversations. The results: HTML beat Markdown on cost (50% cheaper), accuracy (+0.26 points), and speed (40% faster on query, 12.5% faster on curate).

The Question
Thariq from the Claude Code team had already argued that HTML is an unreasonably effective output format for AI agents because humans actually read it. If HTML helps when the human is the reader, does it also help when the agent is the reader of its own memory?
The Benchmark
This time we ran LoCoMo at full scale with 1,982 questions across every conversation and category. The setup was the same as the previous run, with two isolated context trees (one Markdown, one HTML) queried by the same agent.


HTML wins almost every category. Overall accuracy is close (+0.26 points in HTML's favor), but the per-category comparison shows where the difference comes from.
Multi-hop questions gain 1.06 points and temporal questions gain 1.25 points. Multi-hop questions require the agent to pull facts from multiple notes to find the answer. Temporal questions require the agent to place facts in the right time order. Both are among the hardest kinds of questions, because the agent has to connect several things together.
Markdown won only on adversarial questions, where HTML trailed by 1.35 points. Adversarial questions are designed to trick the agent with confusing wording that nudges it toward the wrong answer.
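The per-category point differences above can be derived from graded answers roughly as follows. This is a minimal sketch, not the benchmark's actual scoring code: the function names, the `(category, is_correct)` record shape, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_category(results):
    """Return {category: accuracy in percent} from (category, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in results:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {c: 100.0 * correct[c] / totals[c] for c in totals}


def category_deltas(html_results, markdown_results):
    """Return HTML accuracy minus Markdown accuracy, in points, per shared category."""
    html_acc = accuracy_by_category(html_results)
    md_acc = accuracy_by_category(markdown_results)
    return {c: html_acc[c] - md_acc[c] for c in html_acc if c in md_acc}


# Toy graded answers, invented for illustration (not benchmark data).
markdown_results = [
    ("multi_hop", True), ("multi_hop", False),
    ("temporal", True), ("temporal", True),
]
html_results = [
    ("multi_hop", True), ("multi_hop", True),
    ("temporal", True), ("temporal", False),
]

print(category_deltas(html_results, markdown_results))
# {'multi_hop': 50.0, 'temporal': -50.0}
```

A positive delta means HTML scored higher in that category and a negative delta means Markdown did, which is how the adversarial result reads.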
The differences in the cost and token columns are obvious:
Query input tokens drop by 68%, query cost drops by 68%, and total cost drops by 50%.
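Each of these percentages is a relative reduction against the Markdown baseline. A small sketch of that arithmetic follows. The helper name and the input numbers are placeholders chosen so the result lands on 68%, not measured token counts from the benchmark.

```python
def percent_drop(baseline, candidate):
    """Relative reduction of candidate versus baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - candidate) / baseline


# Placeholder values: 1,000 baseline tokens versus 320 candidate tokens.
print(percent_drop(1000, 320))  # 68.0
```

The same formula applies to cost. The total cost falls by less than the query cost because the total also includes curate spending, which the query-side saving does not touch.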
For latency, HTML's slowest queries (p95) are much faster than Markdown's (259 ms versus 4,311 ms).
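The p95 figure is the latency that 95% of queries stay at or below, so it describes the slow tail rather than the typical query. The sketch below uses the nearest-rank definition, which is an assumption since the article does not say which percentile method was used. The function name and the latency samples are invented for illustration.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(samples)
    # Integer ceiling of n * pct / 100, avoiding float rounding.
    rank = (len(ordered) * pct + 99) // 100
    return ordered[rank - 1]


# Invented latencies in milliseconds, with one slow outlier.
latencies_ms = [120, 135, 150, 140, 130, 125, 145, 155, 160, 3000]

print(percentile_nearest_rank(latencies_ms, 50))  # 140
print(percentile_nearest_rank(latencies_ms, 95))  # 3000
```

In this toy sample a single slow query leaves the median untouched but dominates p95, which is why a tail percentile can differ between formats far more than an average does.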
Conclusion
To conclude, HTML is not only more readable for humans but also more effective for agent memory (in accuracy, cost, and latency), as the results of this full-scale benchmark show. As a next step, we are shipping HTML as the default format for ByteRover memory, which will be released soon.