MCLC: weibo corpus (1)

Denton, Kirk denton.2 at osu.edu
Tue Aug 14 09:36:40 EDT 2012


MCLC LIST
From: Daan van Esch <daanvanesch at gmail.com>
Subject: weibo corpus (1)
***********************************************************

You can access my linguistically annotated corpus of Sina Weibo messages
at http://www.leidenweibocorpus.nl/, where you can also download the full
data set (5.1 million messages). I'm going to add another 15 million
messages to the corpus in the coming week or so, all in the same format,
so if you're interested in the larger data set, check back on the open
access page sometime next week.

I may also be able to provide another 25 million messages (give or take a
few) on top of that, but that particular data set contains little
metadata and few linguistic annotations. If you're interested in that
data set, please contact me off-list.

Best wishes,

Daan van Esch
Leiden University



