Argument Facet Similarity Corpus

If you use this data in your research, please refer to and cite: Amita Misra, Brian Ecker, Marilyn Walker. "Measuring the Similarity of Sentential Arguments in Dialogue". In The 17th Annual SIGdial Meeting on Discourse and Dialogue (SIGDIAL), Los Angeles, California, USA, 2016.

Overview: When people converse about social or political topics, similar arguments are often paraphrased by different speakers across many different conversations. Debate websites produce curated summaries of arguments on such topics; these summaries typically consist of lists of sentences that represent frequently paraphrased propositions, or labels capturing the essence of one particular aspect of an argument, e.g. Morality or Second Amendment. We call these frequently paraphrased propositions argument facets. Like these curated sites, we aim to induce and identify argument facets across multiple conversations and to produce summaries, but we aim to do so automatically. We frame the problem as consisting of two steps: we first extract sentences that express an argument from raw social media dialogs, and then rank the extracted arguments in terms of their similarity to one another. Sets of similar arguments are used to represent argument facets.
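
Example: The ranking step in the overview can be illustrated with a minimal Python sketch. This is not the method from the paper. It assumes a simple token-overlap (Jaccard) score as a stand-in for a real similarity model, and the function names and example sentences are hypothetical, not drawn from the corpus.

```python
import re
from itertools import combinations


def tokens(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap between the token sets of two sentences (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0  # two empty sentences: define similarity as zero
    return len(ta & tb) / len(ta | tb)


def rank_pairs(sentences):
    """Score every unordered sentence pair and sort from most to least similar."""
    scored = [(jaccard(a, b), a, b) for a, b in combinations(sentences, 2)]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Made-up example sentences (assumption: NOT taken from the corpus).
examples = [
    "Gun control laws reduce violent crime.",
    "Strict gun control laws reduce crime.",
    "The death penalty does not deter murder.",
]

for score, a, b in rank_pairs(examples):
    print(f"{score:.2f}  {a}  |  {b}")
```

Pairs that share most of their wording rank at the top, so sets of high-scoring pairs would approximate an argument facet. Token overlap misses paraphrases that use different vocabulary, so it should be read only as a baseline illustration.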

The Data: This corpus contains annotations for argument quality and argument similarity for high-quality argument pairs on three debate topics (gun control, death penalty, and gay marriage), as described in the paper cited above.

Related work: These sentences were extracted from posts in the Internet Argument Corpus, as described in the following paper.

  • Marilyn A. Walker, Pranav Anand, Jean E. Fox Tree, Rob Abbott, Joseph King. "A Corpus for Research on Deliberation and Debate." In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.

Contact: Please direct questions to Amita Misra: amisra2 [at] ucsc [dot] edu