Code-switching in Irish tweets: A preliminary analysis

Published in Proceedings of the 3rd Celtic Language Technology Workshop at MT Summit XVII, 2019

Recommended citation: Teresa Lynn and Kevin Scannell. Code-switching in Irish tweets: A preliminary analysis. In Proceedings of the Third Celtic Language Technology Workshop, pages 32–40, Dublin, Ireland, 2019. European Association for Machine Translation. https://kevinscannell.com/files/codeswitch.pdf

Abstract: As is the case with many languages, research into code-switching in Modern Irish has, until recently, mainly been focused on the spoken language. Online user-generated content (UGC) is less restrictive than traditional written text, allowing for code-switching, and, as such, provides a new platform for text-based research in this field of study. This paper reports on the annotation of (English) code-switching in a corpus of 1496 Irish tweets and provides a computational analysis of the nature of code-switching amongst Irish-speaking Twitter users, with a view to providing a basis for future linguistic and socio-linguistic studies.