Beyond the Tree

Discourse parsing and the eRST framework

  1. The Fragmentation Problem
  2. Anatomy of an eRST Analysis
  3. From Theory to Practice

Natural language documents are more than the sum of their sentences. They form complex structures in which sections are subordinate to larger sections, claims are supported by evidence, and narratives unfold in sequence, giving rise to meanings that cannot be localised to any individual proposition. Identifying these structures and the relations they encode is the task of discourse parsing.

For over three decades, Rhetorical Structure Theory (RST) has been the dominant framework for this task, representing documents as hierarchical trees with labelled relations and nucleus-satellite distinctions. RST's elegance comes with constraints, however, and alternative frameworks (SDRT, PDTB, CCR) have each addressed different limitations while introducing their own.

This series examines Enhanced Rhetorical Structure Theory (eRST), a recent framework that synthesises insights from multiple traditions. Drawing on a 2025 paper by Zeldes et al. in Computational Linguistics, we trace the fragmentation problem that motivated eRST, the formal machinery it introduces, and the practical infrastructure that accompanied it (corpus, metrics, and baseline parser).