Pivot Pre-finetuning for Low Resource MT: A Case Study in Kikamba

Abstract

Current approaches to performant machine translation often require large amounts of data (Koehn et al., 2022). However, the majority of the world's 7,000+ languages lack substantial digitized, organized text and are considered low-resource. In practical terms, this often means a substantial gap in translation quality between high- and low-resource language pairs. We explore the intersection of rapid NMT adaptation techniques and pre-trained sequence-to-sequence models to better leverage multilingual models, presenting a case study on Kikamba.
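To make the two-stage idea concrete, here is a minimal sketch of pivot pre-finetuning with a pretrained multilingual sequence-to-sequence checkpoint: first fine-tune on a higher-resource, related pivot pair (Swahili-English here), then continue fine-tuning on the low-resource Kikamba-English pair. The base model, language tags, file paths, column names, and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: (1) pre-finetune a pretrained multilingual seq2seq model
# on a higher-resource pivot pair, then (2) continue fine-tuning on the
# low-resource Kikamba pair. All names below are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/m2m100_418M"  # assumed base model, not the paper's
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Kikamba has no dedicated tag in this checkpoint, so we reuse the Swahili
# tag for both stages -- a pragmatic assumption, not the paper's method.
tokenizer.src_lang, tokenizer.tgt_lang = "sw", "en"

def preprocess(batch):
    # The "src"/"tgt" column names are assumptions about the corpus format.
    inputs = tokenizer(batch["src"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["tgt"], truncation=True, max_length=128)
    inputs["labels"] = labels["input_ids"]
    return inputs

def finetune(data_file, output_dir, epochs):
    # Each stage tokenizes its parallel corpus and updates the same model.
    ds = load_dataset("json", data_files=data_file)["train"].map(
        preprocess, batched=True, remove_columns=["src", "tgt"]
    )
    args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=16,
        num_train_epochs=epochs,
        learning_rate=5e-5,
    )
    Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=ds,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
        tokenizer=tokenizer,
    ).train()

# Stage 1: pivot pre-finetuning on a related, higher-resource pair.
finetune("swa_eng_parallel.json", "ckpt_pivot", epochs=3)
# Stage 2: adapt to Kikamba-English, warm-started from stage 1.
finetune("kam_eng_parallel.json", "ckpt_kikamba", epochs=10)
```

The design choice worth noting is that both stages share one set of weights: stage 2 starts from the pivot-adapted parameters rather than the raw checkpoint, which is what lets the related pivot language transfer to Kikamba.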

Publication
ICLR 2023 Tiny Papers
Stephen Kiilu
NLP Researcher

My research interests include machine learning, multilingual NLP, and low-resource NLP.