AWS Architecture Blog
Category: Amazon EMR
How Nielsen uses serverless concepts on Amazon EKS for big data processing with Spark workloads
In this post, we follow Nielsen’s journey to build a robust and scalable architecture while enjoying linear scaling. We start by examining the initial challenges Nielsen faced and the root causes behind these issues. Then, we explore Nielsen’s solution: running Spark on Amazon Elastic Kubernetes Service (Amazon EKS) while adopting serverless concepts.
Insights for CTOs: Part 3 – Growing your business with modern data capabilities
This post was co-written with Jonathan Hwang, head of Foundation Data Analytics at Zendesk. In my role as a Senior Solutions Architect, I have spoken to chief technology officers (CTOs) and executive leadership of large enterprises like big banks, software as a service (SaaS) businesses, mid-sized enterprises, and startups. In this 6-part series, I share […]
How Parametric Built Audit Surveillance using AWS Data Lake Architecture
Parametric Portfolio Associates (Parametric), a wholly owned subsidiary of Morgan Stanley, is a registered investment adviser. Parametric provides investment advisory services to individual and institutional investors around the world. Parametric manages over 100,000 client portfolios with assets under management exceeding $400B (as of 9/30/21). As a registered investment adviser, Parametric is subject to numerous regulatory […]
How to Accelerate Building a Lake House Architecture with AWS Glue
Customers are building databases, data warehouses, and data lake solutions in isolation from each other, each having its own separate data ingestion, storage, management, and governance layers. Often these disjointed efforts to build separate data stores end up creating data silos, data integration complexities, excessive data movement, and data consistency issues. These issues are preventing […]
Field Notes: Building an automated scene detection pipeline for Autonomous Driving – ADAS Workflow
This Field Notes blog post in 2020 explains how to build an Autonomous Driving Data Lake using this Reference Architecture. Many organizations face the challenge of ingesting, transforming, labeling, and cataloging massive amounts of data to develop automated driving systems. In this re:Invent session, we explored an architecture to solve this problem using Amazon EMR, Amazon […]
ERGO Breaks New Frontiers for Insurance with AI Factory on AWS
This post is co-authored with Piotr Klesta, Robert Meisner and Lukasz Luszczynski of ERGO. Artificial intelligence (AI) and related technologies are already finding applications in our homes, cars, industries, and offices. The insurance business is no exception to this. When AI is implemented correctly, it adds a major competitive advantage. It enhances the decision-making process, […]
Architecting Persona-centric Data Platform with On-premises Data Sources
Many organizations are moving their data from silos and aggregating it in one location. Collecting this data in a data lake enables you to perform analytics and machine learning on that data. You can store your data in purpose-built data stores, like a data warehouse, to get quick results for complex queries on structured data. […]
Field Notes: Launch Amazon EMR with a Static Private IP in a Private Subnet
Organizations across every industry and sector are looking to easily and cost-effectively process vast amounts of data. Amazon EMR offers a way to instantly provision as much or as little capacity as needed to perform data-intensive tasks. When launching Amazon EMR, the IPs of the primary (master) and core node are automatically assigned at […]
Amazon MSK Backup for Archival, Replay, or Analytics
Amazon MSK is a fully managed service that helps you build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes. You can also stream changes to […]
AWS Architecture Monthly Magazine: Education
One of the missions of the education industry is to educate the next generation of the industry-ready workforce. Whether K-12, higher education, or continuing education, enabling teachers and professors to effectively deliver curriculum and improve student performance is a goal of Education Technology (EdTech) and learning companies. Two trends for AWS use cases in education […]