Published: 2021-05-21
ISBN: 978-91-7929-614-8
ISSN: 1650-3686 (print), 1650-3740 (online)
We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training. This paper introduces the first large-scale monolingual language models for Norwegian, based on both the ELMo and BERT frameworks. In addition to detailing the training process, we present contrastive benchmark results on a suite of NLP tasks for Norwegian. For additional background and access to the data, models, and software, please see: http://norlm.nlpl.eu
Keywords: ELMo, BERT, Norwegian, pre-trained models, contextualized embeddings, Nordic language models