AWS Public Sector Blog

Indiana University streamlines social science research with secure automated transcription on AWS


Recorded interviews are a cornerstone of social science research, offering rich insight into human behavior, beliefs, and lived experiences. But before that data can be analyzed, it typically must be transcribed—a process that can often consume hundreds of hours and significant portions of research budgets. At Indiana University (IU), researchers and graduate assistants were spending hundreds of hours a year transcribing every detail from recorded audio, creating a major bottleneck for projects that rely on qualitative data.

To help researchers reclaim their time and focus on discovery, IU developed a secure, scalable, and cost-effective Automated Transcription Service (ATS) built on Amazon Web Services (AWS). Since launching, the service has supported nearly 60 projects across 16 departments and has become a model for how universities can use cloud-native tools to affordably streamline research workflows and protect sensitive data.

Simplifying transcription for busy research teams

Indiana University is home to a vibrant community of social science researchers across multiple campuses. The scope of social inquiry extends beyond traditional disciplines to include areas like education, public health, and even music and dentistry. Supporting this diverse research ecosystem is the Social Science Research Commons (SSRC), part of IU’s Institute for Social and Behavioral Research.

For many of these researchers, recorded conversations are essential to capturing firsthand insight. But handling and transcribing that kind of sensitive audio has long been a challenge. Commercial tools often lack adequate privacy protections, and manual transcription quickly drains time and budgets. These challenges led Emily Meanwell, director of IU’s SSRC, to spearhead the search for a solution.

From the outset, it was important to the IU team that whatever they developed prioritize usability, affordability, and security. “Voice recordings are considered personally identifiable information,” said Meanwell. “And as social science researchers, we take the privacy and confidentiality of our research subjects very seriously. We wanted a tool our researchers could use that would meet all of our security requirements.”

Turning a pilot project into a university-wide tool

The idea to create a university-wide transcription platform came from a collaboration between research leaders, cybersecurity professionals, and technologists across IU. “We had a researcher whose transcription needs were consuming the entire budget for their research assistant,” said Will Drake, a security analyst at IU’s Center for Applied Cybersecurity Research. “That assistant didn’t even have time to analyze the data. We realized this was a common issue, and that pointed us toward an automated approach.”

AWS stood out early, in part because the university already had a HIPAA business associate agreement in place, a key requirement for securely handling sensitive or health-related data. The team then searched the AWS Solutions Library and found a sample solution that used Amazon Transcribe in a call center scenario. “This got us 80 percent of the way there,” said Alan Walsh, an IU technologist. With some tweaking, including a custom Python script to convert JSON output into usable Word documents, the team had a working proof of concept in just a few days.

IU leadership quickly backed the project. “IU Research was happy to support this innovative project to meet the needs of social science researchers with a secure, university-wide solution,” said Brea Perry, associate vice president for research and vice provost for research at IU Bloomington. With institutional support, the team forged ahead on development.

Building an automated transcription service on AWS

The mandate was a system that is both researcher-friendly and highly secure, and the ATS came together in just two months from concept to first deployment. From the user side, the application is frictionless. Once a research project is approved by the SSRC, researchers securely transfer their files, and the ATS manager uploads the audio files to a secure cloud location. From there, the system takes over automatically: it processes the audio, transcribes the content, and notifies the ATS manager when the final document is ready. Transcripts are delivered as clean Word files with any low-confidence sections highlighted for streamlined review, making it faster and simpler for researchers to analyze their data.
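To make the low-confidence highlighting concrete, here is a minimal sketch of how such sections could be flagged from Amazon Transcribe's JSON output. It is not IU's script. The 0.80 cutoff, the `[[ ]]` markers (standing in for Word highlighting), and the function name are all assumptions for illustration. The only thing it relies on from the service is the documented output shape, in which each entry under `results.items` carries a `type` and a list of `alternatives` with `content` and a string `confidence`.

```python
import json

# Hypothetical cutoff; the article does not say what IU treats as low confidence.
LOW_CONFIDENCE = 0.80


def mark_low_confidence(transcribe_json: str, threshold: float = LOW_CONFIDENCE) -> str:
    """Return transcript text with low-confidence words wrapped in [[ ]]."""
    items = json.loads(transcribe_json)["results"]["items"]
    parts = []
    for item in items:
        best = item["alternatives"][0]
        if item["type"] == "punctuation":
            # Punctuation attaches to the previous word with no space.
            if parts:
                parts[-1] += best["content"]
            else:
                parts.append(best["content"])
            continue
        # Transcribe reports confidence as a string such as "0.97".
        confidence = float(best.get("confidence", "0.0"))
        word = best["content"]
        parts.append(f"[[{word}]]" if confidence < threshold else word)
    return " ".join(parts)


# A tiny hand-written sample in the shape of a Transcribe result.
sample = json.dumps({
    "results": {
        "items": [
            {"type": "pronunciation", "alternatives": [{"confidence": "0.99", "content": "We"}]},
            {"type": "pronunciation", "alternatives": [{"confidence": "0.42", "content": "interviewed"}]},
            {"type": "pronunciation", "alternatives": [{"confidence": "0.95", "content": "teachers"}]},
            {"type": "punctuation", "alternatives": [{"confidence": "0.0", "content": "."}]},
        ]
    }
})

print(mark_low_confidence(sample))  # We [[interviewed]] teachers.
```

In a real pipeline the bracket markers would likely be replaced by highlighted runs in the Word document, so reviewers can jump straight to the words worth checking against the audio.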

Behind the scenes, the entire solution runs on AWS serverless services to support scalability, cost-efficiency, and security. Amazon Simple Storage Service (Amazon S3) handles file storage and enforces lifecycle policies to make sure data isn’t retained longer than necessary. Amazon Transcribe handles voice-to-text conversion, automatically detecting language and speakers. Because Amazon Transcribe supports over 100 languages, researchers can also work with multilingual qualitative data. AWS Lambda and AWS Step Functions orchestrate the workflow and convert raw JSON output into user-friendly Word documents. Job details and reporting data are stored in Amazon DynamoDB, allowing for simple tracking and analysis. Finally, Amazon Simple Notification Service (Amazon SNS) and Amazon EventBridge handle real-time notifications and delivery, integrating with tools like Microsoft Teams and Slack to alert the team when a transcribed file is ready.
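As a rough illustration of two pieces of this architecture, the sketch below builds the request parameters that a Lambda function might pass to Amazon Transcribe and Amazon S3 through boto3. It is an assumption-laden example, not IU's configuration: the bucket name, key layout, `transcripts/` prefix, speaker count, and 30-day retention are hypothetical. The parameter names themselves (`IdentifyLanguage`, `Settings.ShowSpeakerLabels`, `Expiration.Days`, and so on) are the ones accepted by `start_transcription_job` and `put_bucket_lifecycle_configuration`. The code only constructs dictionaries, so it runs without AWS credentials.

```python
import re


def transcription_job_params(bucket: str, key: str, max_speakers: int = 2) -> dict:
    """Build keyword arguments for boto3's transcribe.start_transcription_job."""
    # Job names may only contain letters, digits, '.', '_' and '-' (max 200 chars).
    job_name = re.sub(r"[^0-9a-zA-Z._-]", "-", key)[:200]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        # Let the service detect the spoken language instead of fixing one.
        "IdentifyLanguage": True,
        "OutputBucketName": bucket,
        "OutputKey": f"transcripts/{job_name}.json",
        # Speaker labels require an upper bound on the number of speakers.
        "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers},
    }


def expiry_lifecycle(prefix: str, days: int) -> dict:
    """Build a LifecycleConfiguration for s3.put_bucket_lifecycle_configuration."""
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Objects under the prefix are deleted after this many days.
                "Expiration": {"Days": days},
            }
        ]
    }


# Hypothetical bucket and object names, for illustration only.
params = transcription_job_params("example-ats-bucket", "audio/interview_01.mp3")
lifecycle = expiry_lifecycle("audio/", days=30)
print(params["TranscriptionJobName"])
print(lifecycle["Rules"][0]["ID"])
```

With boto3 available, these dictionaries could be passed as `transcribe.start_transcription_job(**params)` and `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle)`. Keeping the parameter construction separate from the API calls makes the retention and speaker settings easy to review and test.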

Walsh explains that going serverless was a core principle for the project: “By using nothing but serverless services, there isn’t anything that needs to be maintained, patched, updated, or taken care of. AWS is doing all the heavy lifting.”

Delivering faster time to secure research discovery

Since launching university-wide in late 2023, the ATS has supported 57 projects from 16 different departments, transcribing 677 files in 2024 alone. Researchers have adopted the tool across multiple disciplines, helping them reclaim valuable time in various research modalities.

Meanwell says the feedback has been overwhelmingly positive. “One Ph.D. candidate even sent a handwritten thank-you note for helping with their dissertation. That was a first for me,” she said.

Beyond efficiency, the ATS has had a broader impact. “It’s changed how we support researchers institutionally,” said Drake. “Before, we had to help researchers shop around for vendors and walk them through security requirements. Now, we have a vetted, internal tool that meets those needs.”

What other research institutions can learn from IU’s approach

Indiana University’s success with the ATS offers a roadmap for other institutions looking to support researchers with secure, cloud-based tools. Centralizing services through a body like the SSRC supports consistent security, streamlined workflows, and ease of access. With sensitive data like voice recordings, prioritizing privacy and treating personally identifiable information with care is essential. Usability is also key; many researchers aren’t technical experts, so tools must be intuitive and accessible. IU’s team started with resources from the AWS Solutions Library and built from there, customizing the solution to fit the university’s specific needs.

IU has open-sourced the ATS and hopes other institutions will adopt and expand on the work. “There’s a lot of interest from peers, and we’re excited about future collaborations,” said Walsh.

Scaling research support with the AWS Cloud

With input from researchers, strong institutional backing, and the flexibility of AWS, IU built a solution that empowers academics to do what they do best: ask bold questions, gather rich data, and drive discovery.

As the university looks ahead, the ATS stands as a model for how AI-powered, cloud-native solutions can support academic research while protecting participants and data.

Learn how AWS can help your institution build, deploy, and scale innovative solutions, and contact us today.


Brian DeKemper

Brian is an enterprise account executive at AWS, supporting research university customers in the Great Lakes region. He’s spent over two decades working in technology and services companies that focus on the higher education market. Outside of work, he enjoys skiing and traveling with his family.

Luke Coady

Luke is a solutions architect at AWS with over 23 years of combined experience in technology. His experience spans both traditional on-premises infrastructure and modern cloud architectures, making him particularly adept at guiding educational institutions through their digital transformation journeys.