Initial mutations as low-entropy features in neural language modeling


Seminar talk.

Abstract: The state of the art for a number of important English-language NLP tasks has improved rapidly over the last 5-10 years with the introduction of neural network methods. Although these approaches have been applied successfully to many other languages, progress in the field as a whole has been measured by advances in the state of the art for English. This has led to models that require huge amounts of training data to achieve reasonable performance, which can present difficulties for languages that have limited resources or that are typologically very different from English. This talk will cover recent developments in language modeling for Irish and Scottish Gaelic that make use of special linguistic properties of these languages.