Annotation of the Corpus of the Saeima with Multilingual Standards

Published on 2018-05-30

Roberts Darģis

This paper describes a release of the Corpus of the Saeima (the parliament of Latvia) as an open data resource for multidisciplinary research. The corpus consists of the transcription of Latvian parliamentary debat

ParlaCLARIN Workshop 2018 - Miyazaki

Presentation

ParlaCLARIN Workshop: Creating and Using Parliamentary Corpora, Miyazaki 2018 (00:00)

Annotation of the Corpus of the Saeima with Multilingual Standards (00:10)

Motivation (00:17)

The Corpus of the Saeima (00:47)

Morphological and Syntactical Annotations (01:27)

Bonito corpus browser (NoSketch engine) (01:44)

Universal Dependencies (CoNLL-U) (02:08)

Machine Translation to English (02:19)

Named Entities (02:42)

LinkedSaeima I – structure (03:18)

LinkedSaeima II – interfaces (03:49)

LinkedSaeima III – innovation (04:40)

Conclusions (05:27)

Thank you! (06:19)