[CaCL] CaCL 9/17: A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation

Oh, Byung-Doh oh.531 at buckeyemail.osu.edu
Thu Sep 10 14:11:18 EDT 2020


Hi everyone,

Willy will be leading the discussion of the following paper next Thursday via Zoom (https://osu.zoom.us/j/94712017994?pwd=cVdNUGRSMkMyU2R0dTZaZ2xnSmxjUT09):


A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation

https://www.aclweb.org/anthology/N19-1176.pdf


We present a corpus of anaphoric information (coreference) crowdsourced through a game-with-a-purpose. The corpus, containing annotations for about 108,000 markables, is one of the largest corpora for coreference for English, and one of the largest crowdsourced NLP corpora, but its main feature is the large number of judgments per markable: 20 on average, and over 2.2M in total. This characteristic makes the corpus a unique resource for the study of disagreements on anaphoric interpretation. A second distinctive feature is its rich annotation scheme, covering singletons, expletives, and split-antecedent plurals. Finally, the corpus also comes with labels inferred using a recently proposed probabilistic model of annotation for coreference. The labels are of high quality and make it possible to successfully train a state of the art coreference resolver, including training on singletons and non-referring expressions. The annotation model can also result in more than one label, or no label, being proposed for a markable, thus serving as a baseline method for automatically identifying ambiguous markables. A preliminary analysis of the results is presented.

=================
Byung-Doh Oh (he/him/his)
Ph.D. Student
Department of Linguistics
The Ohio State University
https://byungdoh.github.io

