Unsupervised Learning of Morphology and the Languages of the World

Files in This Item:

File | Description | Size | Format
gupea_2077_21418_1.pdf | PhD, Fulltext | 2406 Kb | Adobe PDF
gupea_2077_21418_2.pdf | Abstract | 97 Kb | Adobe PDF
gupea_2077_21418_16.pdf | Thesis | 78685 Kb | Adobe PDF
Title: Unsupervised Learning of Morphology and the Languages of the World
Authors: Hammarström, Harald
Issue Date: 16-Nov-2009
University: Göteborgs universitet. IT-fakulteten
Institution: Department of Computer Science and Engineering ; Institutionen för data- och informationsteknik
Parts of work: Hammarström, H. (2005). A New Algorithm for Unsupervised Induction of Concatenative Morphology. In Yli-Jyrä, A., Karttunen, L., and Karhumäki, J., editors, Finite State Methods in Natural Language Processing: 5th International Workshop, FSMNLP 2005, Helsinki, Finland, September 1-2, 2005. Revised Papers, volume 4002 of Lecture Notes in Computer Science, pages 288–289. Springer-Verlag, Berlin.

Hammarström, H. (2006a). A naive theory of morphology and an algorithm for extraction. In Wicentowski, R. and Kondrak, G., editors, SIGPHON 2006: Proceedings of the Eighth Meeting of the ACL Special Interest Group on Computational Phonology, 8 June 2006, New York City, USA, pages 79–88. Association for Computational Linguistics.

Hammarström, H. (2006b). Poor man’s stemming: Unsupervised recognition of same-stem words. In Ng, H. T., Leong, M.-K., Kan, M.-Y., and Ji, D., editors, Information Retrieval Technology: Proceedings of the Third Asia Information Retrieval Symposium, AIRS 2006, Singapore, October 2006, volume 4182 of Lecture Notes in Computer Science, pages 323–337. Springer-Verlag, Berlin.

Hammarström, H. (2007a). A fine-grained model for language identification. In Proceedings of iNEWS-07 Workshop at SIGIR 2007, 23-27 July 2007, Amsterdam, pages 14–20. ACM.

Hammarström, H. (2007b). A survey and classification of methods for (mostly) unsupervised learning of morphology. In NODALIDA 2007, the 16th Nordic Conference of Computational Linguistics, Tartu, Estonia, 25-26 May 2007. NEALT.

Hammarström, H., Thornell, C., Petzell, M., and Westerlund, T. (2008). Bootstrapping language description: The case of Mpiemo (Bantu A, Central African Republic). In Proceedings of LREC-2008, pages 3350–3354. European Language Resources Association (ELRA).

Hammarström, H. (2009a). Poor man’s word-segmentation: Unsupervised morphological analysis for Indonesian. In Proceedings of the Third International Workshop on Malay and Indonesian Language Engineering (MALINDO). Singapore: ACL.

Hammarström, H. (2009b). A Survey of Computational Morphological Resources for Low-Density Languages. Submitted.

Forsberg, M., Hammarström, H., and Ranta, A. (2006). Lexicon extraction from raw text data. In Salakoski, T., Ginter, F., Pyysalo, S., and Pahikkala, T., editors, Advances in Natural Language Processing: Proceedings of the 5th International Conference, FinTAL 2006, Turku, Finland, August 23-25, 2006, volume 4139 of Lecture Notes in Computer Science, pages 488–499. Springer-Verlag, Berlin.

Hammarström, H. (2008a). Automatic annotation of bibliographical references with target language. In Proceedings of MMIES-2: Workshop on Multi-source, Multilingual Information Extraction and Summarization, pages 57–64. ACL.

Hammarström, H. (2008b). Counting languages in dialect continua using the criterion of mutual intelligibility. Journal of Quantitative Linguistics, 15(1):34–45.

Hammarström, H. (2009c). Whence the Kanum base-6 numeral system? Linguistic Typology, 13(2):305–319.

Hammarström, H. (2009d [to appear]). Rarities in numeral systems. In Wohlgemuth, J. and Cysouw, M., editors, Rara & Rarissima: Collecting and interpreting unusual characteristics of human languages, Empirical Approaches to Language Typology, pages 7–55. Mouton de Gruyter.

Hammarström, H. (2009e). The Status of the Least Documented Language Families in the World. Submitted.
Date of Defence: 2009-12-11
Disputation: 10:15 in room HB1, Hörsalsvägen 8
Degree: Doctor of Engineering
Publication type: Doctoral thesis
Keywords: Computational Linguistics
Language typology
Abstract: This thesis presents work in two areas: Language Technology and Linguistic Typology. In the field of Language Technology, a specific problem is addressed: Can a computer extract a description of word conjugation in a natural language using only written text in the language? The problem is often referred to as Unsupervised Learning of Morphology and has a variety of applications, including Machine Translation, Document Categorization and Information Retrieval. The problem is also relevant ...
ISBN: 978-91-628-7942-6
Appears in Collections: Doctoral Theses / Doktorsavhandlingar Institutionen för data- och informationsteknik
Doctoral Theses from University of Gothenburg / Doktorsavhandlingar från Göteborgs universitet


