Bolette S. Pedersen
University of Copenhagen, Copenhagen, Denmark
Lars Borin
University of Gothenburg, Gothenburg, Sweden
Markus Forsberg
University of Gothenburg, Gothenburg, Sweden
Neeme Kahusk
University of Tartu, Tartu, Estonia
Krister Lindén
University of Helsinki, Finland
Jyrki Niemi
University of Helsinki, Finland
Niklas Nisbeth
University of Copenhagen, Copenhagen, Denmark
Lars Nygaard
Kaldera Language Technology, Oslo, Norway
Heili Orav
University of Tartu, Tartu, Estonia
Eiríkur Rögnvaldsson
University of Iceland, Iceland
Mitchel Seaton
University of Copenhagen, Copenhagen, Denmark
Kadri Vider
University of Tartu, Tartu, Estonia
Kaarlo Voionmaa
University of Gothenburg, Gothenburg, Sweden
Published in: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), May 22-24, 2013, Oslo University, Norway. NEALT Proceedings Series 16
Linköping Electronic Conference Proceedings 85:16, p. 147-162
NEALT Proceedings Series 16:16, p. 147-162
Published: 2013-05-17
ISBN: 978-91-7519-589-6
ISSN: 1650-3686 (print), 1650-3740 (online)
During the last few years, extensive wordnets have been built locally for the Nordic and Baltic languages, applying very different compilation strategies. The aim of the present investigation is to consolidate and examine these wordnets through an alignment via Princeton Core WordNet and thereby compare them along the measures of taxonomical structure, synonym structure, and assigned relations, in order to approximate a best practice. A common web interface and visualizer, “WordTies”, has been developed to facilitate this purpose. Four bilingual wordnets are automatically processed and evaluated, exposing interesting differences between the wordnets. Even if the alignments are judged to be of good quality, the precision of the translations varies due to considerable differences in hyponymy depth and interpretation of the synset. All seven monolingual and four bilingual wordnets, as well as WordTies, have been made available via META-SHARE through the META-NORD project.
Wordnets; multilingual links; wordnet web interface; Nordic and Baltic languages; META-NORD.