How AI is helping preserve Indigenous languages

Australian researchers have partnered with Google to preserve endangered Indigenous languages.

A robot known as Opie helps teach children in the Ngukurr community the local languages. Source: Supplied.

Australia's Indigenous population is rich in linguistic diversity, with over 300 languages spoken across different communities.

Some of the languages are as distinct from one another as Japanese is from German.

But many are at risk of becoming extinct because they are not widely accessible and have little presence in the digital space.

Professor Janet Wiles is a researcher with the ARC Centre of Excellence for the Dynamics of Language, known as CoEDL, which has been working to transcribe and preserve endangered languages.

She says one of the biggest barriers to documenting languages is transcription.

"How transcription is done at the moment is linguists select small parts of the audio that might be unique words, unique situations or interesting parts of grammar, and they listen to the audio and they transcribe it," she told SBS News.

The CoEDL has been researching 130 languages spoken across Australia and neighbouring countries like Indonesia.

Their work involves going into communities and documenting huge amounts of audio. So far, they have recorded almost 50,000 hours.

Transcribing the audio using traditional methods is estimated to take two million hours, or about 40 hours of work for every hour of recording, making it a painstaking and near-impossible task.

Knowing time is against them, Professor Wiles and her colleague Ben Foley turned to artificial intelligence.

Crossing the digital divide

Professor Wiles, who specialises in information technology and electrical engineering, was confident AI could enable much faster transcription.

She says that last year they partnered with technology giant Google to develop "machine learning" technology that can process audio recordings.

"What machine learning can do is actually look for those patterns over vast amounts of data. So what we have to do is get data into a form that the machine can read," she said. 

"What artificial intelligence provides is the learning step from audio to the text. It sounds like a really mundane thing, but that's the key step for language to cross the digital divide, so to go from spoken language to a language where people can text on their phone and use it with digital tools."

Mr Foley says they have been able to build models for 12 Indigenous languages, including Kunwok, Kriol, Mangarayi, Nakkara, Pitjantjatjara, Warlpiri and Wubuy, as well as Indigenous languages from elsewhere in the region.

He says these include Abui, spoken in Indonesia, and Cook Islands Maori, an East Polynesian language.

"In the 12 months we've been working with Google, we've run two workshops. The first workshop took three days to set up the system," he said. 

"The second workshop, in the afternoon, 20 linguists were able to work on building language models for 12 languages. Six of those were Australian Indigenous languages."

Robots teaching languages

As part of the ongoing project, the team has worked closely with community members in the Northern Territory's Ngukurr region, where more than 10 languages are spoken by 1000 people.

Professor Wiles says, with the help of the Ngukurr Language Centre, they have developed a robot, known as Opie, that aids in teaching children the local languages.

"A robot can't teach a language, but what it can do is support the language workers in what they already do. It has three different kinds of activities: stories, which gives children exposure to the complex grammar of their language; memory games, which is practice with the words; and then it has pronunciation practice."

Professor Wiles says, at the moment, due to the lack of resources, children get only about an hour a week with language workers.

She says the robot enables the children to continue learning even after their lessons have ended.

"The idea of the robot is to amplify their efforts, so everything on the robot has been developed with and chosen by the community as material there. The voices are the voices of people they know in the community, and the children love to hear the voices of their teachers and the community elders," she said.

Published 31 May 2018 at 6:15pm
By Abbie O'Brien
Source: SBS