2025

Using AI and data sovereignty to preserve Indigenous culture and languages

Overview

The Ngalia people of Australia face the daunting challenge of preserving their cultural heritage in a rapidly changing world. A collaboration between Kiwa Digital, the Ngalia Heritage Research Council, and AWS is using AI to help preserve this invaluable cultural knowledge. Kiwa Digital has produced CultureQ™ software, built on AWS and underpinned by generative AI, to securely store, manage, interact with, and publish cultural assets while maintaining Indigenous data sovereignty. This digital archival system offers Indigenous communities worldwide a powerful tool for preserving and promoting their cultural heritage, knowledge, and cultural sustainability for future generations. “Our mission is really simple. It’s building technology that creates identity and intergenerational learning on infrastructure that our communities control and direct,” said Steven Renata, Managing Director, Kiwa Digital.

About Kiwa Digital

Kiwa Digital was initially established as a post-production software company. In 2009, founder Rhonda Kite pioneered a text-to-voice user experience for children’s apps, specializing in Te Reo Māori, the Indigenous language of New Zealand, and inclusive needs. Over a decade, Kiwa Digital’s business and technology evolved toward a stronger focus on building cultural literacy in adults. The company’s unique cultural perspective and advanced technologies mean it does more than build apps: it uplifts and empowers Indigenous voices, people, and cultures around the world.

Challenge | Limited ways to preserve language

Languages worldwide are disappearing at an alarming rate, with one language becoming extinct approximately every 40 days. Of the world's estimated 7,200 languages, many face extinction due to several critical factors: younger generations abandoning their ancestral tongues, cultural assimilation, and displacement of Indigenous communities. This rapid loss of linguistic diversity represents not just the disappearance of words and grammar, but the erosion of unique cultural perspectives and traditional knowledge. “A global language goes silent. And it’s that urgency that really drives our work. We’re trying to move from beyond translation into living knowledge systems that enable that sovereignty and enable communities to control the future destiny and sustainability of that language,” said Renata.

There is no single standard method for preserving endangered languages. “Until now, Indigenous people had limited ways to preserve language and culture, with many unwilling or unaware of how to leverage technology to help,” said Renata. “Users could only access static content.” Traditionally, data sovereignty is the concept that data is subject to the laws of the country where it is physically stored, and global cloud computing sometimes makes data residency a complex issue. For Kiwa Digital, data sovereignty has nuances: “First of all, it’s about understanding that data is a taonga, a treasure. And for the Indigenous groups it means what is that data, where it came from, who it is sourced to, acknowledged to, and how it will be used. This understanding of data as part of data sovereignty is critical now and in a hundred years from now. Having that contextual understanding of data as part of data sovereignty is crucial to what CultureQ can actually do,” said Renata. A key part of the challenge was weaving data governance, security, and usability into one sovereign platform that would scale across very diverse Indigenous communities.

Opportunity | Using generative AI to interact with cultural data

Kiwa Digital’s mission is to help Indigenous people protect, govern, and activate their knowledge and learning systems so that they are safe, sovereign, and future-proofed. To build CultureQ, Kiwa Digital partnered with AWS and Custom D, creators of the Caitlyn generative AI platform. “Because generative AI is quite new, data governance and security is hard, but it presented a new opportunity,” said Sam Sehnert, CTO at Custom D. “We did a lot of work in the fintech and insurtech space. Through that work we gained a really good understanding of data governance, rules, regulations, and compliance, which we brought to the collaboration.”

With generative AI integrated into CultureQ, users can now have conversations and interact with their cultural data in a secure and managed environment. The first opportunity the company identified for CultureQ came from talking with elders and Indigenous people all over the world to find out where their records and data were dispersed. “In fact, they were everywhere! They were on USB sticks, hard drives, Māori cupboards, under Uncle Steve’s bed. And this created a massive problem. The opportunity was how to bring all of those records together in a place that’s safe and secure for now and future generations,” said Renata. The second opportunity was the ability to weave everything into a single platform that can be controlled by a variety of different groups. “We looked at the tools out there and it was clear that many of them dealt with one slice of the opportunity, often translation. But there was really nothing out there that delivered an end-to-end solution for cultural governance and the ability to use private AI built by Indigenous people for Indigenous people,” said Renata.

Solution | Running RAG requests to identify relationships and distill accurate information from multiple sources

Kiwa Digital aimed to create a guided, controlled interface that would respect Indigenous group protocols and rules by exclusively using approved, verified knowledge sources and properly citing them. This approach ensured both academic integrity and cultural sensitivity in accessing Indigenous information. “Our approach to generative AI was to treat it like a kaitiaki, a guardian. And to never scrape the open internet. This is the absolute key role of what we call guardian AI within CultureQ,” said Renata.

Custom D, through its Caitlyn platform, brought a secure AI architecture on Amazon Bedrock, private VPC patterns, and Zero Trust design, a cybersecurity strategy based on the core principle of ‘never trust, always verify.’ Zero Trust design means that no user, device, or application is automatically trusted, regardless of whether they’re inside or outside of an organization's network perimeter. Custom D helped Kiwa Digital embed governance and appropriate metadata to ensure that the AI answers are culturally sensitive and accurate. The Caitlyn platform, built on Amazon Bedrock, provides the secure, private generative AI foundation used in CultureQ. Custom D also provides wraparound services to help onboard customers to generative AI, identify opportunities, and spot use cases within their organizations. “Some of the data governance, data sovereignty, and security aspects of what we’ve built with Caitlyn on the data management side were really crucial to Kiwa Digital’s use case,” said Sehnert. “One of the key reasons we chose AWS as the platform for Caitlyn is that Amazon Bedrock works in a way that protects your data. You’re not sending data out to third party model providers, and you’re in complete control of how that data is used and when it is used.”
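The ‘never trust, always verify’ principle can be illustrated with a minimal Python sketch. This is not how Caitlyn implements Zero Trust; the secret, the token format, and the function names are assumptions made up for illustration. The point it shows is that every request is checked for a valid signature and expiry, and the caller's network location is never used to grant trust.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical shared secret; a real deployment would use a managed secret store.
SECRET_KEY = b"example-secret-not-for-production"


def _signature(payload: str) -> str:
    """HMAC-SHA256 signature of the payload, hex encoded."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def sign_token(user_id: str, expires_at: int) -> str:
    """Issue a token of the form 'user_id:expires_at:signature'."""
    payload = f"{user_id}:{expires_at}"
    return f"{payload}:{_signature(payload)}"


def verify_request(token: str, source_network: str, now: Optional[int] = None) -> bool:
    """Verify every request the same way, whatever network it comes from."""
    # source_network is accepted but intentionally never used to grant trust.
    del source_network
    now = int(time.time()) if now is None else now
    try:
        user_id, expires_raw, signature = token.rsplit(":", 2)
        expires_at = int(expires_raw)
    except ValueError:
        return False  # malformed token
    expected = _signature(f"{user_id}:{expires_raw}")
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode(), expected.encode()) and now < expires_at
```

A request from an "internal" network with a tampered or expired token is rejected exactly like one from outside, which is the behavior the principle describes.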

Caitlyn’s architecture uses Amazon Bedrock knowledge bases to take advantage of Retrieval Augmented Generation (RAG), a technique that draws information from a data store to augment the responses generated by Large Language Models (LLMs). Once a knowledge base is set up with a specific data source, the application can query it to return information either as direct quotations from sources or as natural responses generated from the query results. The company used Amazon Bedrock knowledge bases to abstract away the heavy lifting of building its own RAG pipeline and reduce application build time, and it uses Amazon OpenSearch Service to sync with that data. Knowledge bases reduce operational costs by avoiding the need to continually train models to leverage private unstructured data.
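The retrieval step of RAG can be sketched in plain Python. This is a toy stand-in, not the Amazon Bedrock API or Caitlyn's pipeline: the passages, the source labels, and the word-overlap scoring are all assumptions chosen so the example runs without any external service. It shows the two return styles described above (quoted passages with citations, and a prompt that a model would answer from) and the rule that only approved sources are consulted.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str  # citation label for an approved source (hypothetical)
    text: str


# Hypothetical approved knowledge base; nothing is fetched from the open internet.
KNOWLEDGE_BASE = [
    Passage("Community glossary, entry 12", "A taonga is a treasure."),
    Passage("Elder interview, tape 3", "A kaitiaki is a guardian."),
]


def tokenize(text: str) -> set:
    """Lowercase word set; \\w+ keeps letters with macrons such as 'ā'."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, top_k: int = 1) -> list:
    """Rank approved passages by word overlap with the query (a stand-in for vector search)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in KNOWLEDGE_BASE]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [passage for _, passage in ranked[:top_k]]


def build_prompt(query: str) -> str:
    """Augment the question with retrieved, cited passages before it reaches a model."""
    passages = retrieve(query)
    if not passages:
        # With no approved source, the sketch declines rather than guessing.
        return "No approved source covers this question."
    context = "\n".join(f'[{p.source}] "{p.text}"' for p in passages)
    return f"Answer using only these sources and cite them:\n{context}\n\nQuestion: {query}"
```

Calling `retrieve("What does kaitiaki mean?")` returns the quoted passage with its source label, while `build_prompt` wraps the same passage into the text a language model would receive. A managed knowledge base replaces the overlap scoring with embedding-based search.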

Caitlyn’s ingestion pipeline uses AWS Step Functions to create state machines: series of event-driven steps used to build distributed applications, automate processes, orchestrate microservices, and create data and machine learning pipelines. “It’s a really great way of managing and getting visibility into really complex processes that are running,” said Sehnert. The platform also uses Amazon DynamoDB, a serverless, fully managed NoSQL database, as its operational database for single-digit millisecond access latency at any scale. Caitlyn also pulls data from the lakehouse architecture of Amazon SageMaker, which lets it query data in place with Apache Iceberg–compatible tools and engines, and then pushes it through Amazon Athena to run interactive analytics with Apache Spark directly on Amazon Simple Storage Service (Amazon S3) Tables. Amazon S3 Tables are used to store private tabular data at scale for Apache Iceberg queries. The company uses Amazon Quick Sight to generate BI dashboards, using built-in agents for research and automation in Quick Suite while maintaining enterprise-grade security and governance.
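To make the idea of a state machine concrete, here is a small Python sketch of a workflow definition in Amazon States Language, the JSON format Step Functions uses. The state names, the three-step shape, and the placeholder Lambda ARNs are assumptions for illustration; the source does not describe the actual steps in Caitlyn's ingestion pipeline.

```python
import json

# Hypothetical ingestion workflow in Amazon States Language (ASL).
# State names and Lambda ARNs are placeholders, not Caitlyn's real pipeline.
INGESTION_STATE_MACHINE = {
    "Comment": "Illustrative ingestion pipeline",
    "StartAt": "ExtractText",
    "States": {
        "ExtractText": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:extract-text",
            "Next": "ApplyMetadata",
        },
        "ApplyMetadata": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:apply-metadata",
            "Next": "IndexDocument",
        },
        "IndexDocument": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:index-document",
            "End": True,
        },
    },
}


def execution_path(definition: dict) -> list:
    """Walk a linear chain of states from StartAt to the state marked End."""
    path = []
    current = definition["StartAt"]
    while True:
        if current in path:
            raise ValueError(f"cycle detected at state {current!r}")
        path.append(current)
        state = definition["States"][current]
        if state.get("End"):
            return path
        current = state["Next"]


# The JSON text is what would be supplied as the state machine definition.
definition_json = json.dumps(INGESTION_STATE_MACHINE, indent=2)
```

`execution_path` only handles a straight chain of Task states. It is a local sanity check on the definition, not a substitute for the service, which also supports branching, retries, and parallel steps.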

“One of the key things we’ve built with Caitlyn is automated guardrails designed around the data ingested into Caitlyn and around the metadata that is then applied to that data. When an organization designs their metadata scheme, they can put various rules around how that data is used within the AI, who can access it, and what it can be used for,” said Sehnert.
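A metadata-driven guardrail of this kind might look like the following Python sketch. The field names, group names, and use labels are hypothetical, since the source does not describe Caitlyn's actual metadata scheme. The sketch shows rules about who can access an asset and what it can be used for being checked before the asset is handed to the AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetMetadata:
    title: str
    allowed_groups: frozenset  # who can access the asset
    allowed_uses: frozenset    # what the asset can be used for


def is_permitted(meta: AssetMetadata, user_groups: set, intended_use: str) -> bool:
    """Allow access only if the user shares a group with the asset AND the use is approved."""
    return bool(meta.allowed_groups & user_groups) and intended_use in meta.allowed_uses


def filter_for_ai(assets: list, user_groups: set, intended_use: str) -> list:
    """Return only the assets the AI may draw on for this user and purpose."""
    return [a for a in assets if is_permitted(a, user_groups, intended_use)]


# Hypothetical assets and rules for illustration.
ASSETS = [
    AssetMetadata("Public word list", frozenset({"community", "public"}), frozenset({"search", "teaching"})),
    AssetMetadata("Sacred map image", frozenset({"elders"}), frozenset({"teaching"})),
]
```

With these rules, `filter_for_ai(ASSETS, {"public"}, "search")` returns only the public word list, and the sacred map is available only to a user in the `elders` group asking for a `teaching` use. Filtering before retrieval, not after generation, means restricted material never reaches the model for an unauthorized request.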

Outcome | CultureQ used to build global Indigenous cultural literacy

CultureQ is paying dividends in the preservation of Indigenous cultures, delivering on Kiwa Digital’s dream. “Our Tongan community is using CultureQ for a Tongan medical dictionary, bringing together phrases and terms, some of which have never been used before, and enabling that for a modern day audience,” said Renata. “We’re working with the Ngalia people of Western Australia to look at ancient maps and take those images, some of which are very sacred, and use generative AI to create modern day resources that are useful for the communities.” Kiwa Digital is also working very closely with the Cherokee Nation Language Department, helping it build the first-ever immersive Indigenous dictionary, with advanced search functions that help people understand the context of words and phrases. “We’ve leveraged over 50 years of work creating that dictionary in a non-digital manner, and now we’re bringing it into the modern society context,” said Renata.

“When I reflect on our culture and the trials and tribulations that we’ve gone through, to be given this challenge to go back and help is something that you pinch yourself for every day. With it comes great responsibility, so working with my team and Custom D and AWS we’re really trying to change the world in a way that’s meaningful, sustainable, and that we stay true to the kaupapa, the purpose, which is the people that it serves,” concluded Renata.
