Collection, storage and analysis of online teenage talk: assets and challenges
I will address a range of issues based on ten years of experience with sociolinguistic research on informal computer-mediated communication (CMC) produced by youngsters. Starting from the two main datasets we are currently working with (corpus 2007-2013 and corpus 2015-2016), I'll discuss some challenges in gathering data on the social profile of the informants, along with some ethical issues. Next, I will turn to the consequences that the size and (often imbalanced) composition of CMC corpora have for data processing. To illustrate the challenges of the genre, I'll briefly deal with a specific methodological issue: whether to operationalize the occurrence of CMC features as binary or ordinal variables. Finally, while large corpora generally trigger (and necessitate) quantitative data processing, I want to stress that supplementary qualitative research may be indispensable if we do not want to lose touch with CMC pragmatics.
Download slides: clarinplusworkshop2017_vandekerckhove_storage_01.pdf (314.7 KB)