Stylistic Variation Corpus for NLG

If you use this data in your research, please cite: Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators. S. Oraby, L. Reed, S. Tandon, S. TS, S. Lukin, and M. Walker. SIGDIAL 2018. Melbourne, Australia.

Overview: This dataset provides training data for several kinds of natural language generation, covering both personality-based stylistic variation and a range of sentence planning operations.

PERSONALITY: a set of 88,000 pairs of meaning representations and natural language utterances in the restaurant domain, varying in style according to psycholinguistic models of personality.

Data: The data available for download is a single CSV file containing meaning representations and their corresponding natural language utterances for a subset of the Big Five personalities (agreeable, disagreeable, conscientious, unconscientious, and extravert), generated using the Personage statistical generator (Mairesse and Walker, 2010).
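
Example: Once the CSV is downloaded, a quick sanity check is to count how many utterances it holds per personality. The sketch below is only an illustration under stated assumptions: the column name "personality" is hypothetical (this page does not document the CSV header), so pass the actual column name from the file you receive.

```python
import csv
from collections import Counter
from typing import TextIO


def count_by_personality(handle: TextIO, personality_col: str = "personality") -> Counter:
    """Count rows per personality label in an open CSV file.

    ASSUMPTION: `personality_col` is a hypothetical column name; check the
    header row of the downloaded file and pass the real name instead.
    """
    reader = csv.DictReader(handle)
    # Fail early with a readable message if the assumed column is absent.
    if reader.fieldnames is None or personality_col not in reader.fieldnames:
        raise KeyError(
            f"column {personality_col!r} not found; header is {reader.fieldnames}"
        )
    return Counter(row[personality_col] for row in reader)
```

To use it, open the downloaded file with open(path, newline="", encoding="utf-8") and pass the handle to count_by_personality. If the file matches the description above, the result should contain five labels, one per personality.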

Download: Fill out the following form to download the Stylistic Variation Corpus for NLG.

