The Catalog of Catalogs

“This much is known: for every rational line or forthright statement there are leagues of senseless cacophony, verbal nonsense, and incoherency.” — Jorge Luis Borges, The Library of Babel

In 1941, Borges imagined a universe in which the limitation of knowledge was not scarcity but excess. Every possible book existed, yet nothing was meaningfully accessible. In this world, the librarians’ hope rests on a mythical object and its reader: a Catalog of Catalogs, a structure that might impose order on overwhelming noise, and the Book-Man, who has studied it.

Érik Desmazières, 1997, La Bibliothèque de Babel (above and thumbnail) for Le Livre Contemporain and Les Bibliophiles Franco-Suisses.

In 2026, modern scientific organizations increasingly resemble Borges’ library. Data accumulates faster than meaning can be maintained, and organization becomes the difference between insight and noise. We’re not faced with a lack of quantitative or qualitative metrics; we’re faced with a lack of coherence.

Last week, I wrote about how the crucial endeavor in science is getting people with fundamentally different areas of expertise to talk to each other. I focused on ontologies as a mechanism for recording that exchange, and on how both subject-matter and translational governance are underestimated infrastructure. Borges’ allegory of the Catalog of Catalogs and the Book-Man is particularly prescient here: what is being sought is not knowledge itself but structured knowledge. We’re all looking for a template by which to organize the data, and for someone who can make sense of it and distill the infinite into something the finite can comprehend.

This structured knowledge doesn’t exist by accident. It comes from an active process of structure, focus, and then orchestration: a “what”, a “why”, and a “how”.

Structure

The first concern is that data be reasonably segmented and relational. From a practical standpoint, data must be Findable, Accessible, Interoperable, and Reusable (FAIR). If not, data tend to be labyrinthine, bottlenecked, or plainly incoherent.

We’ve all lived in organizations where we’ve had to wade through years of previous colleagues’ Excel files, sandbox analyses, hopes, and dreams. Correctly structured data maintain relevance and a paper trail for a broad swath of the organization without unnecessary bureaucracy.

Much of the attention goes to the first two elements (findable and accessible), but interoperable and reusable data are equally important. It’s crucial that data maintain their relation to other data while functioning broadly across contexts. Structure presents the “what” of coherence: where data live and how they relate to one another.
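To make the four FAIR properties a little more concrete, here is a minimal sketch of a catalog entry and a crude check for which properties it is missing. Everything in it is an assumption made for illustration: the `DatasetRecord` fields, the `fair_gaps` helper, and the example identifier and URL are hypothetical, and the real FAIR principles are far more detailed than six fields can capture.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; field names are illustrative, not a standard."""
    identifier: str = ""   # Findable: a stable, unique ID
    access_url: str = ""   # Accessible: where and how to retrieve the data
    schema: str = ""       # Interoperable: a shared vocabulary or format
    related: list = field(default_factory=list)  # Interoperable: links to other records
    license: str = ""      # Reusable: terms of use
    provenance: str = ""   # Reusable: who produced the data and how


def fair_gaps(record: DatasetRecord) -> list:
    """Return the FAIR properties this record fails to cover (a rough proxy only)."""
    checks = {
        "findable": bool(record.identifier),
        "accessible": bool(record.access_url),
        "interoperable": bool(record.schema) and bool(record.related),
        "reusable": bool(record.license) and bool(record.provenance),
    }
    return [name for name, ok in checks.items() if not ok]


# A record that can be found and fetched, but not yet related or reused.
record = DatasetRecord(
    identifier="assay-2024-017",                      # made-up ID
    access_url="s3://example-bucket/assay-2024-017",  # made-up location
)
print(fair_gaps(record))
```

The example record covers only the first two properties, which is the pattern described above: easy to locate and download, but with no schema, no links to other data, and no terms for reuse.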

Focus

Structure frames and orients knowledge, but it does not direct it. Most organizations do not lack data; they lack agreement about which questions matter. Focus provides the bridge between what is technically possible and what is operationally relevant.

In the absence of focus, data systems become encyclopedic rather than instrumental: impressive in scope, but disconnected from decision-making. Everything is recorded, nothing is prioritized, and insight is deferred indefinitely.

Focus requires constraint. It means explicitly defining:

  • which problems are in scope,
  • which audiences the data serve,
  • and which uses are not supported.

Focus is an acknowledgment that not all pursuits are equally valuable and that excluding irrelevance is a prerequisite for action. Focus is where knowledge ceases being merely reflective and starts becoming useful.
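One way to keep those three definitions from staying implicit is to write them down in a form that can be checked. The sketch below is only an illustration under stated assumptions: the `Scope` class, the `triage` function, and the example problems, audiences, and uses are all hypothetical, not a description of any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope statement for a data system; all names are illustrative."""
    problems: frozenset          # which problems are in scope
    audiences: frozenset         # which audiences the data serve
    unsupported_uses: frozenset  # which uses are explicitly not supported


def triage(scope: Scope, problem: str, audience: str, use: str) -> str:
    """Return 'accepted', or the first reason a request falls outside the scope."""
    if use in scope.unsupported_uses:
        return "declined: unsupported use"
    if problem not in scope.problems:
        return "declined: problem out of scope"
    if audience not in scope.audiences:
        return "declined: audience not served"
    return "accepted"


# Made-up example values, chosen only to exercise each branch.
scope = Scope(
    problems=frozenset({"batch-effect correction"}),
    audiences=frozenset({"assay development"}),
    unsupported_uses=frozenset({"regulatory submission"}),
)

print(triage(scope, "batch-effect correction", "assay development", "exploratory analysis"))
print(triage(scope, "market sizing", "assay development", "exploratory analysis"))
```

The design choice worth noting is that unsupported uses are checked first: saying no explicitly is treated as part of the scope, not as an afterthought.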

Orchestration

Where structured, focused data enable valuable prototypes, orchestration is what allows those systems to scale. It is no accident that modern computational frameworks (whether data architecture, DevOps, or agentic systems) borrow musical terminology: orchestration is the work of getting multiple competent functions to operate coherently through change. The way we use our data must mirror what we expect from it.

Orchestration requires stewardship and governance. It is the ongoing maintenance of alignment between structure, focus, and reality as systems grow. Initial models of data flow are necessary, but insufficient; organizations must adapt as methods change, incentives shift, and markets evolve. That adaptability does not emerge spontaneously. It requires deliberate ownership, revision authority, and sustained care.

Calvin and Hobbes: Man of Action by Bill Watterson. September 21, 1993

When Borges wrote about his library, he was appealing to the philosophical arrangement of knowledge and to how we make sense of the world. Even in this perfect and infinite arrangement of everything that could be described, the best hope was still a digested, comprehensible version of it.

In scientific systems, we have to take pragmatic approaches to our expanding means of measuring and describing the world. Whether we’re collecting data and making decisions in the current AI revolution, the “Big Data” boom of the last decade, or Borges’ 1941, we’re still restricted by the same conditions.

No amount of technological advancement can replace our core need for coherence, not completeness, in data. The limiting factor has never been measurement, storage, or computation. It has always been our willingness to decide what matters, who decides it, and who is accountable when meaning breaks.



