If you use this data in your research, please refer to and cite:
- Marilyn A. Walker, Pranav Anand, Jean E. Fox Tree, Rob Abbott, Joseph King. "A Corpus for Research on Deliberation and Debate." In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.
Overview: The Internet Argument Corpus (IAC) version 2 is a collection of corpora for research on political debate in internet forums. It consists of three datasets: 4forums (414K posts), ConvinceMe (65K posts), and a sample from CreateDebate (3K posts). It includes topic annotations, response characterizations (4forums), and stance.
The Data: The data is stored in MySQL databases. The supporting code uses Python 3 and SQLAlchemy.
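As a rough illustration of what working with the data might look like, the sketch below builds a MySQL connection URL and lists the tables in one database through SQLAlchemy. It is not taken from the supporting code: the helper names `build_mysql_url` and `list_tables`, the database name `fourforums`, the credentials, and the choice of the PyMySQL driver are all assumptions made for the example. Listing tables is used here because it requires no knowledge of the actual schema.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host="localhost", port=3306,
                    database="fourforums"):
    """Build an SQLAlchemy connection URL for a MySQL database.

    The database name is a placeholder (assumption), not a documented name.
    The PyMySQL driver is also an assumption; any MySQL driver that
    SQLAlchemy supports could be substituted in the URL scheme.
    """
    # Escape characters such as '@' or spaces that would break the URL.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?charset=utf8mb4"
    )


def list_tables(url):
    """Return the table names of the database behind `url`."""
    # Imported lazily so the URL helper works without SQLAlchemy installed.
    from sqlalchemy import create_engine, inspect

    engine = create_engine(url)
    return inspect(engine).get_table_names()


if __name__ == "__main__":
    # Hypothetical credentials; replace them with those of your local import.
    url = build_mysql_url("iac_user", "p@ss word")
    print(url)
    # Requires a running MySQL server with the data loaded:
    # print(list_tables(url))
```

Inspecting the table names first is a cautious way to explore an unfamiliar database before writing queries against it.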
Works that use this corpus:
- Rob Abbott, Marilyn Walker, Pranav Anand, Jean E. Fox Tree, Robeson Bowmani, and Joseph King. "How can you say such things?!?: Recognizing Disagreement in Informal Political Argument." In Proceedings of the Workshop on Language in Social Media (LSM), Portland, Oregon, USA, 2011.
- Marilyn A. Walker, Pranav Anand, Jean E. Fox Tree, Rob Abbott, Joseph King. "A Corpus for Research on Deliberation and Debate." In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.
Download: Fill out the following form to download the datasets. Code can be found at our BitBucket repository.
GitHub: https://github.com/sl-m-lab/Internet-Argument-Corpus/
Contact: Please direct questions to Rob Abbott: abbott [at] soe [dot] ucsc [dot] edu
Website last updated June 21, 2024.