Internet Argument Corpus v2

If you use this data in your research, please refer to and cite:

  • Marilyn A. Walker, Pranav Anand, Jean E. Fox Tree, Rob Abbott, Joseph King. "A Corpus for Research on Deliberation and Debate." In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.

Overview: The Internet Argument Corpus (IAC) version 2 is a collection of corpora for research on political debate in internet forums. It consists of three datasets: 4forums (414K posts), ConvinceMe (65K posts), and a sample from CreateDebate (3K posts). It includes topic annotations, response characterizations (4forums only), and stance annotations.

The Data: The data is stored in MySQL databases. The supporting code uses Python3 and SQLAlchemy.
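
Since the data lives in MySQL and the supporting code uses SQLAlchemy, a first connection might look roughly like the sketch below. It is not taken from the supporting code. The user name, password, host, and the database name "fourforums" are all placeholder assumptions, as is the choice of the PyMySQL driver; substitute whatever your local MySQL installation uses. The helper names build_mysql_url and list_tables are hypothetical.

```python
from urllib.parse import quote


def build_mysql_url(user, password, host, database, port=3306):
    """Build a SQLAlchemy-style MySQL URL, percent-encoding the credentials."""
    # safe="" makes quote() escape every reserved character, including "@" and "/".
    return (
        f"mysql+pymysql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}?charset=utf8mb4"
    )


def list_tables(url):
    """Return the table names found in the database at `url`."""
    # Imported lazily so the URL helper above works without SQLAlchemy installed.
    from sqlalchemy import create_engine, inspect

    engine = create_engine(url)
    try:
        return inspect(engine).get_table_names()
    finally:
        engine.dispose()


if __name__ == "__main__":
    # All values here are placeholders (assumptions), not real credentials.
    url = build_mysql_url("iac_user", "p@ss word", "localhost", "fourforums")
    print(url)
    # With a running MySQL server and the data imported, one could then call:
    # print(list_tables(url))
```

Listing the tables first is a low-risk way to confirm that the import succeeded before writing queries against specific tables.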

Download: Fill out the following form to download the datasets. Code can be found at our GitHub repository.

GitHub: https://github.com/sl-m-lab/Internet-Argument-Corpus/

Contact: Please direct questions to Rob Abbott: abbott [at] soe [dot] ucsc [dot] edu

Website last updated June 21, 2024.
