Seminar talk, via Zoom.
Abstract: Grammatical error correction is an important end-user application of natural language processing. In recent years, approaches using large language models have led to improved performance on this task, at least for English and a few other well-resourced languages. Nevertheless, it remains challenging to build systems that (1) provide results that are sufficiently reliable for end-users and (2) explain the errors they detect, for the benefit of language learners. I will discuss recent progress on this problem for the Irish language, focusing on an important subset of errors involving the so-called “initial mutations” found in Irish and the other Celtic languages. The primary challenge is assembling a large enough dataset for training: we use both synthetic data, produced with the help of an Irish dependency parser, and error examples mined from Wikipedia edit logs.