Writing — Gioia Zheng

Published 1 note

Failure analysis in retrieval-augmented generation

2026-05-28

A single aggregate score is the wrong unit for evaluating a RAG pipeline. Reporting per-category failure rates makes regressions visible that aggregates hide.
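
A minimal sketch of the point above, not taken from the note itself: the category names (`lookup`, `multi_hop`), the helper functions, and the pass/fail data are all hypothetical and chosen only to illustrate how two runs can share an aggregate failure rate while one category regresses.

```python
from collections import defaultdict


def failure_rates(results):
    """Map each category to its failure rate.

    `results` is an iterable of (category, passed) pairs.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        if not passed:
            failures[category] += 1
    return {c: failures[c] / totals[c] for c in totals}


def aggregate_failure_rate(results):
    """Fraction of all examples that failed, ignoring category."""
    results = list(results)
    return sum(1 for _, passed in results if not passed) / len(results)


# Hypothetical evaluation data: 10 examples per category, 4 failures per run.
# In the baseline the failures are all in "lookup"; in the candidate they
# have all moved to "multi_hop".
baseline = [("lookup", i >= 4) for i in range(10)] + [("multi_hop", True)] * 10
candidate = [("lookup", True)] * 10 + [("multi_hop", i >= 4) for i in range(10)]

for name, run in (("baseline", baseline), ("candidate", candidate)):
    print(name, aggregate_failure_rate(run), failure_rates(run))
```

In this made-up data both runs fail 4 of 20 examples, so the aggregate does not move, while the per-category view shows `multi_hop` going from no failures to 4 in 10.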
