The EU’s rich linguistic landscape consists of 24 official languages and over 60 regional and minority languages. But when it comes to technology support and digital services, a big imbalance persists between the five most spoken EU languages and the remaining ones.
Why is digital inequality of smaller languages such a big problem exactly?
Andy Way: In the 21st century, language cannot be a barrier to accessing information. Digital language services should be available in all languages to ensure a level playing field and full access to digital services for all. This is currently the case only for a few obvious EU languages, so most EU citizens have to access services in a language other than their first choice. That is inherently unfair. The EU has flagged this.
To reach digital language equality, artificial intelligence (AI) plays an important role. Actually, one of the core application areas within AI is language technology (LT): developing AI tools that can process and understand human languages and facilitate human-computer interaction. Current LT allows us to build many advanced applications, which were unthinkable only a few years ago, and we will see lots of even more exciting results in the near future.
In fact, LT – also called language-centric AI – is one of the most relevant technologies for society, and it has a fast-growing economic impact. However, LT is not equally developed for all European languages, and there are very few dedicated nationally financed programmes addressing LT. Other European LT professionals in academia and industry and I agree that this needs to change.
Your project proposes a roadmap towards achieving full digital language equality by 2030. What are the main points of this roadmap?
Andy Way: The main parts of the roadmap are to provide the path and means needed to implement the strategic research, innovation and implementation agenda (SRIIA), which has two main goals:
- Societal and economic goal: digital language equality (DLE) in Europe in 2030;
- Scientific goal: reach deep natural language understanding (DNLU) via state-of-the-art AI techniques in 2030.
Accompanying these goals is a set of timelines, actions, and priorities.
The roadmap’s recommendations include EU-level legal protection for over 60 regional and minority languages and a virtual centre for language diversity, coordinated by ELE and made up of leading LT/AI centres across Europe.
It also recommends promoting a pan-European network of research centres, and encouraging all EU-funded projects to have a language diversity plan and to develop better benchmarks and datasets for all languages.
The roadmap also proposes to extend LT beyond language as such, focussing on language- and culture-specific technologies (and not just transferring them from English), and enforcing open ecosystems, open source, open access, open standards, interoperability, etc.
Another recommendation of the EP your project has been working on is that “Europe has to secure its leadership in language-centric AI”. Could you please explain this? And why is this so important?
Andy Way: The European Language Technology community is world-class but highly fragmented. Together with recently launched infrastructures such as our sister project, the European Language Grid, the ELE programme can help coordinate this community. The ELE project will hopefully lead to the ELE programme, the funding programme that takes on board (most of) our recommendations and implements them via the roadmap. Many synergies can be identified and exploited when it comes to the systematic development of tools and resources for all European languages. We have to capitalise on this if the scientific goal of achieving deep natural language understanding by 2030 is to be achieved.
In doing so, Europe will be better placed to stand on its own two feet and not be overly reliant on tools and services produced by multinational corporations located outside Europe – these companies, even though they are located outside of Europe, actually have many high-profile Europeans leading their scientific programmes.
Overall, EU researchers from both academia and industry would need access to:
- sufficient public sector data, data from broadcasters, social media, publishers, etc.;
- flexible access to, and support for, sufficient high-performance computing facilities based on graphics processing units (GPUs);
- sufficient international LT experts in EU research centres and companies.
With these tools and resources, they can build the LT needed to ensure full digital language equality for all EU languages by 2030.
How do you see the future?
Andy Way: The ELE project has set out how we can achieve digital language equality for all EU languages by 2030. We have established a strategic research, innovation and implementation agenda and a roadmap as to how this can happen. We acknowledge that there are many topics competing for funding, but if Europe really does value its cultural heritage and linguistic diversity, and really wants equal access to digital services for all its citizens, then this roadmap has to be followed.
With all this prepared by the ELE project, we recommend initiating the ELE programme: a long-term, large-scale funding programme for the development of technologies and resources for all European languages so that all speakers of all languages can benefit from these digital technologies when using modern information technologies.
What more do you think Europe could do?
Andy Way: If the EU does not address the matter properly, the worst-case scenario is that some or maybe even most of these languages will eventually suffer from digital language extinction.
To achieve full digital language equality, the ELE programme has to be supported by national, regional and central funding agencies – while these are all EU languages, national and regional funding agencies need to play their part as well; we cannot rely solely on the Commission to plug the gaps.