Exhibition Program

Science of Communication and Computation


Analyzing the discourse structure behind the text

- Hierarchical top-down RST parsing based on neural networks -


Analyzing the discourse structure behind a document is crucial for context-aware Natural Language Processing (NLP) tasks such as machine translation and automatic summarization. We propose a neural discourse parsing method based on Rhetorical Structure Theory (RST), which represents a document as a constituent tree. Our parser builds RST trees at different levels of granularity in a document and then replaces the leaves of upper-level RST trees with the lower-level RST trees already constructed. At each granularity level, parsing proceeds top-down by recursively splitting a larger text span into two smaller ones while predicting nuclearity labels and rhetorical relations. Unlike previous discourse parsers, our parser can be fully parallelized at each granularity level and does not require handcrafted features such as syntactic features obtained from full parse trees of sentences.
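The control flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the neural scorers are replaced by hypothetical placeholders (choose_split picks the midpoint, label returns a fixed nuclearity and relation), and the nesting document > paragraph > sentence > elementary unit is an assumed choice of granularity levels. Only the recursion is meant to be representative: each level is split top-down into two spans, and the leaves of the upper-level tree are then replaced with the lower-level trees.

```python
from typing import Callable, List, Tuple, Union

# A tree is a leaf (str) or a tuple (left, right, nuclearity, relation).
Tree = Union[str, tuple]

def split_span(units: list, i: int, j: int,
               choose_split: Callable, label: Callable):
    """Build a binary tree over units[i:j] top-down.
    Leaves are integer indices into `units` (placeholders)."""
    if j - i < 1:
        raise ValueError("cannot parse an empty span")
    if j - i == 1:
        return i
    k = choose_split(units, i, j)          # split point, i < k < j
    nuclearity, relation = label(units, i, k, j)
    return (split_span(units, i, k, choose_split, label),
            split_span(units, k, j, choose_split, label),
            nuclearity, relation)

def substitute(tree, subtrees: List[Tree]) -> Tree:
    """Replace each placeholder leaf with the lower-level tree it stands for."""
    if isinstance(tree, int):
        return subtrees[tree]
    left, right, nuclearity, relation = tree
    return (substitute(left, subtrees), substitute(right, subtrees),
            nuclearity, relation)

def parse(unit, choose_split: Callable, label: Callable) -> Tree:
    """Parse a nested unit: a str is an elementary unit, a list is a
    coarser unit made of finer ones."""
    if isinstance(unit, str):
        return unit
    # The children are independent of each other, so this loop is the
    # part that could be run in parallel at each granularity level.
    subtrees = [parse(child, choose_split, label) for child in unit]
    upper = split_span(unit, 0, len(unit), choose_split, label)
    return substitute(upper, subtrees)

# Hypothetical stand-ins for the neural models (assumptions, not from the source).
def midpoint_split(units: list, i: int, j: int) -> int:
    return (i + j) // 2

def fixed_label(units: list, i: int, k: int, j: int) -> Tuple[str, str]:
    return ("NS", "Elaboration")

def leaves(tree: Tree) -> List[str]:
    """Collect the elementary units of a tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

# Two paragraphs; the first has two sentences, the second has one.
doc = [[["e1", "e2"], ["e3"]], [["e4", "e5", "e6"]]]
tree = parse(doc, midpoint_split, fixed_label)
print(leaves(tree))  # ['e1', 'e2', 'e3', 'e4', 'e5', 'e6']
```

In an actual parser, choose_split and label would be predictions of trained models over span representations; the sketch fixes them only so that the recursion can be run and inspected.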


  • [1] N. Kobayashi, T. Hirao, M. Okumura, M. Nagata, “Top-down RST Parsing Utilizing Granularity Levels in Documents,” in Proc. of 25th Annual Meeting of Natural Language Processing, pp. 1002-1005, 2019.




Tsutomu Hirao, Linguistic Intelligence Research Group, Innovative Communication Laboratory