HBase | AWS Big Data Blog

Amazon EMR HBase on Amazon S3 transitioning to EMR S3A with comparable EMRFS performance

Starting with version 7.10, Amazon EMR is transitioning from EMR File System (EMRFS) to EMR S3A as the default file system connector for Amazon S3 access. This transition brings HBase on Amazon S3 to a new level, offering performance parity with EMRFS while delivering substantial improvements, including better standardization, improved portability, stronger community support, improved performance through non-blocking I/O, asynchronous clients, and better credential management with AWS SDK V2 integration. In this post, we discuss this transition and its benefits.

Improve Amazon EMR HBase availability and tail latency using generational ZGC

Large-scale HBase deployments on Amazon EMR suffer from unpredictable garbage collection behavior that creates performance bottlenecks for business-critical applications. To solve this problem, Amazon EMR leverages Oracle’s generational ZGC technology from JDK 21 to deliver predictable, sub-millisecond pause times. This post shows you how to configure generational ZGC in Amazon EMR 7.10.0, apply performance tuning methods, and optimize HBase RegionServer garbage collection settings.

Combine NoSQL and Massively Parallel Analytics Using Apache HBase and Apache Hive on Amazon EMR

Ben Snively is a Solutions Architect with AWS Jon Fritz, a Senior Product Manager for Amazon EMR, co-authored this post With today’s launch of Amazon EMR release 4.6, you can now quickly and easily provision a cluster with Apache HBase 1.2. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is […]

AWS Big Data Blog

Tag: HBase

Amazon EMR HBase on Amazon S3 transitioning to EMR S3A with comparable EMRFS performance

Improve Amazon EMR HBase availability and tail latency using generational ZGC

Combine NoSQL and Massively Parallel Analytics Using Apache HBase and Apache Hive on Amazon EMR

Learn

Resources

Developers

Help