PersonageNLG: Style in NLG

If you use this data in your research, please cite: Controlling Personality-Based Stylistic Variation with Neural Natural Language Generators. S. Oraby, L. Reed, S. Tandon, S. TS, S. Lukin, and M. Walker. SIGDIAL 2018. Melbourne, Australia.

Overview: The PersonageNLG corpus contains 88,000 pairs of meaning representations and natural language utterances in the restaurant domain for training, plus 1,390 pairs for testing, based on the E2E Challenge dataset. The utterances vary in style according to psycholinguistic models of personality, providing a resource for natural language generation of restaurant descriptions in different Big-Five personalities.

Data: The data available for download is a zip of two CSV files (one for train, one for test) containing meaning representations and their corresponding natural language utterances for a subset of the Big-Five personalities (agreeable, disagreeable, conscientious, unconscientious, and extravert), generated using the Personage statistical generator (Mairesse and Walker, 2010).
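
Example: As a rough sketch of how the CSVs might be read, the Python snippet below loads one file into a list of rows and groups the rows by personality. The column name "personality" and the helper names load_pairs and group_by_personality are assumptions made for illustration and are not documented here; check the header row of the actual CSV files and adjust accordingly.

```python
import csv
from collections import defaultdict


def load_pairs(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def group_by_personality(rows, personality_col="personality"):
    """Group rows by personality label.

    personality_col is a hypothetical column name; replace it with
    whatever the header of the downloaded CSV actually uses.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[personality_col]].append(row)
    return dict(groups)
```

Calling group_by_personality(load_pairs(path_to_train_csv)) would then return a dictionary mapping each personality label found in the file to its rows, which is one simple way to inspect or sample the utterances for a single style.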

