This project is funded by the European Union.

Good practices


This section will be finalized with the feedback collected in the workshops organized during the project.

While many of the world's languages clearly require intensive investment in resource creation before language technology can be built for them, it is highly unlikely that such investment can be delivered quickly or easily. Because resources are this limited, language communities should be empowered to determine the future of their languages. In this document, we have presented how digital representation is part of this process.

To decide where to start, we suggest adopting the 4-D design thinking methodology (Discover, Design, Develop and Deploy) introduced by Bali et al. in their ELLORA initiative. This user-centric approach is as follows:

  1. Discover what is most needed by the language community,

  2. Design for the users and their language, paying attention to the diversity of the language and avoiding an approach that starts from a majority language,

  3. Develop and deploy frequently in an iterative manner, constantly improving and detecting failures from the beginning.

Even when the community hasn’t yet developed a perspective on language technology development, it is good practice to keep the value of data in mind when carrying out language preservation activities. Some examples are:

  • Organize events and create content to raise data-awareness in the language community,

  • Introduce languages to crowdsourcing platforms,

  • Organize datathons for language data collection,

  • Translate folk tales and children’s stories that are in the public domain,

  • Store plain text or document versions of published material in order to create text corpora,

  • Save and openly share translation memories to help other translators and to create parallel data,

  • Store recordings of broadcast material (e.g. radio shows) and transcribe them if possible so that they can be converted into speech data,

  • Save content published on social media to a permanent place so that it doesn’t get lost in timelines.
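
As an illustration of the point about storing plain text to create text corpora, the following is a minimal sketch in Python, not a prescribed tool. It assumes the published material has already been saved as UTF-8 .txt files in one folder. The function name build_corpus, the JSON Lines output format, the record fields and the placeholder language code "xx" are all our own assumptions for this example.

```python
import hashlib
import json
import unicodedata
from pathlib import Path


def build_corpus(src_dir, out_path, lang):
    """Collect .txt files under src_dir into one JSON Lines corpus file.

    Returns the number of documents written.
    """
    seen = set()  # hashes of texts already written, used to skip exact duplicates
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(Path(src_dir).rglob("*.txt")):
            raw = path.read_text(encoding="utf-8")
            # NFC normalization makes visually identical strings byte-identical
            text = unicodedata.normalize("NFC", raw).strip()
            if not text:
                continue  # skip empty files
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if digest in seen:
                continue  # same text already stored under another file name
            seen.add(digest)
            record = {
                "id": digest[:16],
                "lang": lang,
                "source": path.relative_to(src_dir).as_posix(),
                "text": text,
            }
            # ensure_ascii=False keeps the original script readable in the file
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

A call such as build_corpus("published_texts", "corpus.jsonl", "xx") would then produce a single file with one document per line. Keeping the source file name next to each text makes it possible to trace every entry back to the original publication and its license.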

Do you have suggestions or questions? Write to us at info-at-collectivat.cat


This document was created with the financial support of the European Union. The content of this website is the sole responsibility of Col·lectivaT and SKAD and does not necessarily reflect the views of the European Union.