AWS Storage Blog

Cloud-powered tick data: revolutionizing financial data storage with Amazon S3 and LSEG

Data has become the lifeblood of modern financial markets, driving everything from investment decisions to regulatory compliance. Nowhere is this more evident than in electronic trading, where the ability to efficiently store, process, and analyze historical market data can make the difference between success and failure. Market participants are witnessing an unprecedented surge in tick data volumes, with annual growth rates of 20-40%, while simultaneously needing ever-greater precision. This evolution has put significant pressure on financial institutions, many of which find their traditional on-premises infrastructure struggling to keep pace with both the technical demands and associated costs of managing these expanding datasets.

As financial institutions grapple with these escalating data challenges, cloud-based solutions offer a transformative path forward. AWS provides the ideal foundation for handling the massive scale and performance requirements of financial market data. The virtually unlimited storage capacity, durability, and flexible access patterns of Amazon S3 make it particularly well-suited for tick data repositories. AWS Partner LSEG’s Tick History Data and Tick History PCAP (TH Data and TH PCAP) – S3 Direct powered by Amazon S3 – offers a comprehensive solution for financial market data storage and access. This collaborative solution combines LSEG’s market data expertise with the AWS cloud infrastructure to deliver:

  • Petabyte-scale financial data storage with enterprise-grade security
  • Optimized cost management through S3 Intelligent-Tiering storage class
  • Global, high-performance data access through the AWS worldwide network
  • Seamless integration with analytics tools for deeper market insights
  • Elimination of on-premises infrastructure management overhead

In this post, we examine the growing challenges of financial market data management and how cloud solutions can address these needs. We explore the architecture and workflow of LSEG’s TH Data and TH PCAP solution powered by Amazon S3 and detail the AWS services that power this solution. Finally, we summarize the comprehensive benefits that make this approach compelling for financial institutions seeking to modernize their market data operations.

How cloud solutions address financial data challenges

Financial institutions typically face multiple challenges with their market data infrastructure:

  • Managing hundreds of terabytes of monthly data transfers
  • Maintaining expensive on-premises infrastructure requiring significant capital investment
  • Unpredictable costs as data volumes grow
  • Complex scaling challenges, particularly when managing data from multiple global exchanges
  • Specialized hardware, connectivity, and skills requirements that divert resources from core business activities

Cloud-based solutions offer transformative approaches to these market data challenges. Using cloud storage services such as Amazon S3 allows financial institutions to access virtually unlimited storage capacity without the burden of managing physical infrastructure. The pay-as-you-go pricing model provides cost predictability that traditional infrastructure can’t match.

When combined with specialized financial data services, such as LSEG’s TH Data and TH PCAP, these cloud solutions create powerful ecosystems that allow organizations to focus on analysis rather than infrastructure management. The integration of high-quality financial data with scalable cloud storage eliminates technical barriers that previously hindered financial institutions from achieving their full analytical potential: infrastructure capacity constraints, data transfer bottlenecks, complex scaling requirements across global exchanges, specialized hardware dependencies, and the need for dedicated skills to maintain on-premises systems.

Implementing this solution allows financial institutions to significantly reduce total cost of ownership while gaining improved data quality, operational clarity, and the ability to scale dynamically as data requirements grow.

The following diagram showcases the user interface for querying LSEG’s TH Data and TH PCAP data through Amazon Athena. The interface provides a full view of all Tick History and PCAP data, allowing customers to see all venues, fields, and descriptors. The query functionality lets users combine, code, and transform TH Data and TH PCAP data alongside their own proprietary data through a unified AWS interface. AWS Glue serves as the data catalog in this solution, maintaining metadata about the tick data stored in Amazon S3 and making it discoverable for Athena queries. While ETL transformation with AWS Glue is optional, it’s available for customers who need additional data processing beyond what’s possible with direct queries.

Solution overview

LSEG’s TH Data and TH PCAP solutions are powered by Amazon S3 and operate through a four-step process that takes data from capture to customer access:

  1. Data capture: LSEG captures traditional market data directly from exchange data centers using LSEG Elektron Network for TH Data, while TH PCAP data is independently collected through the LSEG Low Latency Group’s capture suite technology deployed in exchange colocation facilities.
  2. Processing and normalization: Data is processed, normalized, and quality-assured.
  3. Cloud storage: Processed data is stored in a dedicated LSEG S3 bucket in multiple formats.
  4. Customer access: LSEG’s S3 Direct customers can query this data directly or download it on-demand over a congestion-free global network.

The preceding diagram shows the global network architecture of LSEG’s TH Data and TH PCAP – S3 Direct solution. The LSEG AWS account (containing LSEG’s TH Data and TH PCAP S3 bucket and S3 Access Points) connects through the AWS global network, which features multiple AWS PoPs and S3 Transfer Acceleration. This network infrastructure enables data to flow efficiently both to cloud-based LSEG S3 Direct clients (in AWS Regions A, B, and C) and to on-premises data centers.

LSEG’s TH Data and TH PCAP solution uses the Amazon S3 object storage architecture in several sophisticated ways:

  • Data organization: Financial tick data is stored using a partition-based structure, meaning the data is physically divided into separate segments or “partitions” based on logical categories such as exchange, instrument type, date, and time intervals. This partitioning strategy distributes data across storage in a hierarchical manner (for example /exchange/instrumenttype/date/hour/), allowing the system to quickly access only the relevant data segments when querying specific market segments and timeframes, rather than scanning the entire dataset.
  • Storage classes optimization: The solution implements automatic S3 Lifecycle policies that transition historical tick data through the Amazon S3 storage classes (Standard → Intelligent-Tiering), optimizing costs while maintaining accessibility based on data age and query patterns. S3 Intelligent-Tiering automatically moves data between access tiers when access patterns change, and can archive objects that become infrequently accessed, delivering automatic storage cost savings.
  • Parallel request processing: The solution uses S3’s ability to handle massive concurrent read requests, enabling hundreds of financial analysts and algorithms to simultaneously access different segments of tick data without performance degradation. This is essential during high-volume market analysis periods.
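To make the partition-based layout concrete, the following is a minimal sketch of how a client might turn a time range into the hour-level prefixes it needs to read. It assumes the illustrative /exchange/instrumenttype/date/hour/ layout mentioned above; the exact key format, the exchange code `XNAS`, and the instrument type `equities` are hypothetical and are not taken from LSEG's actual bucket structure.

```python
from datetime import datetime, timedelta, timezone


def partition_prefix(exchange: str, instrument_type: str, ts: datetime) -> str:
    """Return the hour-level key prefix for a timezone-aware timestamp.

    The exchange/instrumenttype/date/hour/ layout is an assumption based on
    the example path in the text, not a documented LSEG key format.
    """
    ts = ts.astimezone(timezone.utc)
    return f"{exchange}/{instrument_type}/{ts:%Y-%m-%d}/{ts:%H}/"


def prefixes_for_range(exchange: str, instrument_type: str,
                       start: datetime, end: datetime) -> list:
    """List every hour-level prefix overlapping the half-open range [start, end)."""
    if end <= start:
        return []
    # Round the start down to the top of the hour so partial hours are included.
    cursor = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end = end.astimezone(timezone.utc)
    prefixes = []
    while cursor < end:
        prefixes.append(partition_prefix(exchange, instrument_type, cursor))
        cursor += timedelta(hours=1)
    return prefixes


# Hypothetical example: 90 minutes of data touches only two hourly partitions.
start = datetime(2024, 3, 1, 14, 30, tzinfo=timezone.utc)
end = datetime(2024, 3, 1, 16, 0, tzinfo=timezone.utc)
hour_prefixes = prefixes_for_range("XNAS", "equities", start, end)
```

Because each prefix is independent, a reader could list or fetch the objects under each one concurrently, which is where the parallel request handling described above becomes useful.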

The implementation uses several key AWS services, with Amazon S3 at its core:

  • Amazon S3: Provides the foundation for secure, durable, and scalable storage. S3 Intelligent-Tiering automatically optimizes storage costs based on access patterns.
  • Amazon S3 Transfer Acceleration: Available for optimized data transfers when needed.
  • S3 Access Points: Provide customized access control for different customers, with permissions tailored to specific tick datasets based on client subscriptions and entitlements.
  • AWS Global Backbone: Provides high-speed, reliable data transfer between LSEG’s centralized storage and customer access points worldwide through the AWS private global network infrastructure, offering consistent performance and minimizing public internet congestion. Amazon CloudFront Points of Presence (PoPs) deliver low-latency access to market data through the AWS global edge network locations, reducing data access times for geographically distributed users.
  • AWS Direct Connect: Provides a dedicated network connection from LSEG’s on-premises environments to AWS, facilitating secure and reliable data movement into LSEG’s Amazon S3 bucket.
  • Amazon Athena: Runs SQL queries directly against data stored in Amazon S3, allowing for interactive analysis of market data without having to move the data into a separate analytics system.
  • AWS Glue: Serves as a centralized metadata catalog for Amazon Athena to efficiently query tick data stored in Amazon S3. AWS Glue maintains the schema definition and table structure, enabling users to discover and query financial market data without having to manually define table structures. It also offers ETL (Extract, Transform, Load) capabilities for preparing and transforming tick data for downstream analytics.
  • AWS Identity and Access Management (IAM): Provides fine-grained access control with least-privilege principles.
  • AWS Key Management Service (AWS KMS): Manages the encryption keys used to protect data at rest (encryption in transit is handled by TLS rather than by AWS KMS keys).
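As an illustration of how Athena and the Glue catalog fit together, the following sketch builds a SQL query that filters on partition columns so that only the relevant segments of the dataset are scanned. The table name `tick_history` and the columns `exchange`, `trade_date`, `symbol`, `event_time`, `price`, and `quantity` are hypothetical placeholders; the actual schema is whatever the Glue catalog for TH Data and TH PCAP exposes.

```python
import re
from datetime import date

# Conservative pattern for a bare table name (no database qualifier).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_tick_query(table: str, exchange: str, trade_date: date,
                     symbol: str, limit: int = 100) -> str:
    """Build a query that filters on the (assumed) partition columns first.

    All column names here are hypothetical; adjust them to the real schema.
    """
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT event_time, symbol, price, quantity\n"
        f"FROM {table}\n"
        f"WHERE exchange = {sql_literal(exchange)}\n"
        f"  AND trade_date = DATE '{trade_date.isoformat()}'\n"
        f"  AND symbol = {sql_literal(symbol)}\n"
        "ORDER BY event_time\n"
        f"LIMIT {int(limit)}"
    )


# Hypothetical example values.
query = build_tick_query("tick_history", "XNAS", date(2024, 3, 1), "AAPL")
```

The resulting string could then be run in the Athena console or submitted through the Athena API. The sketch stops at building the query so that it stays runnable without AWS credentials.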

This solution uses the inherent scale and redundancy of Amazon S3 while eliminating the need to provision and maintain expensive on-premises infrastructure. The tick data is stored in multiple formats to support different access patterns:

  • Parquet files: for columnar compression and analytics efficiency, reducing storage costs by 40-60% while dramatically improving query performance.
  • CSV formats: for broad compatibility with legacy systems.

Benefits of cloud-powered market data solutions

Cloud-powered market data solutions offer comprehensive benefits for financial institutions:

  • Significant cost reduction: Move from capital expenditure to predictable operational costs with potential infrastructure savings of up to 80%
  • Streamlined operations: Eliminate infrastructure management headaches
  • Enhanced market data quality: Access enterprise-grade market data with consistent quality across asset classes and global markets
  • Future-proof scalability: Never worry about outgrowing your infrastructure
  • Global accessibility: Access your data from anywhere in the world
  • Focus on core business: Redirect technical resources to value-adding activities

Amazon S3 is the optimal solution for tick data, providing the following advantages:

  • Durability and redundancy: Amazon S3 offers highly durable object storage. Based on its unique architecture, Amazon S3 is designed to deliver 99.999999999% (11 nines) data durability. Furthermore, Amazon S3 stores data redundantly across a minimum of three Availability Zones (AZs) by default, providing built-in resilience against widespread disaster.
  • Access pattern flexibility: Tick data exhibits varied access patterns where recent data might be frequently queried while historical data becomes less accessed over time. For a small monthly object monitoring and automation charge, S3 Intelligent-Tiering monitors access patterns and automatically moves objects that have not been accessed to lower-cost access tiers. S3 Intelligent-Tiering automatically stores objects in three access tiers: one tier that is optimized for frequent access, a 40% lower-cost tier that is optimized for infrequent access, and a 68% lower-cost tier optimized for rarely accessed data. S3 Intelligent-Tiering monitors access patterns and moves objects that have not been accessed for 30 consecutive days to the Infrequent Access tier and after 90 days of no access to the Archive Instant Access tier. For data that doesn’t need immediate retrieval, you can set up S3 Intelligent-Tiering to monitor and automatically move objects that aren’t accessed for 180 days or more to the Deep Archive Access tier to realize up to 95% in storage cost savings.
  • Query-in-place capabilities: Amazon S3 integrates seamlessly with Athena, allowing organizations to run standard SQL queries directly against their market data stored in Amazon S3. This eliminates the need to transfer massive datasets to separate analytical systems, reducing complexity and enabling faster time-to-insight on petabyte-scale tick data repositories.
  • Performance scaling: Amazon S3 is built to support high-request rates. Amazon S3 request rate performance allows your application to achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per partitioned prefix. You can increase your read or write performance by using parallelization.
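The tiering thresholds and request rates above lend themselves to a quick back-of-the-envelope calculation. The following sketch maps days since last access to an S3 Intelligent-Tiering access tier, estimates a blended storage cost relative to keeping everything in the Frequent Access tier, and computes a baseline aggregate read rate across prefixes. It uses only the figures quoted in this post, ignores the per-object monitoring charge, and treats the "up to 95%" Deep Archive Access saving as a best case; the function names and sample dataset are hypothetical.

```python
# Relative storage cost per tier, derived from the savings quoted in the text
# (40% lower, 68% lower, and up to 95% lower than the Frequent Access tier).
RELATIVE_COST = {
    "Frequent Access": 1.00,
    "Infrequent Access": 0.60,
    "Archive Instant Access": 0.32,
    "Deep Archive Access": 0.05,  # best case: the text says "up to 95%"
}


def access_tier(days_since_last_access: int, deep_archive_enabled: bool = False) -> str:
    """Map days without access to the tier described in the text."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if deep_archive_enabled and days_since_last_access >= 180:
        return "Deep Archive Access"
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


def blended_relative_cost(objects, deep_archive_enabled: bool = False) -> float:
    """Size-weighted storage cost relative to all-Frequent-Access storage.

    `objects` is an iterable of (size_gb, days_since_last_access) pairs.
    The monitoring and automation charge is deliberately left out.
    """
    objects = list(objects)
    total_gb = sum(size for size, _ in objects)
    if total_gb == 0:
        return 0.0
    weighted = sum(
        size * RELATIVE_COST[access_tier(days, deep_archive_enabled)]
        for size, days in objects
    )
    return weighted / total_gb


def baseline_get_rate(prefix_count: int, per_prefix: int = 5500) -> int:
    """Baseline GET/HEAD requests per second when reads spread across prefixes."""
    return prefix_count * per_prefix


# Hypothetical dataset: (size in GB, days since last access).
sample = [(100, 5), (100, 45), (200, 120)]
ratio = blended_relative_cost(sample)   # (100*1.0 + 100*0.6 + 200*0.32) / 400
hourly_rate = baseline_get_rate(24)     # one prefix per hour of a trading day
```

With this sample, roughly half of the storage spend remains relative to keeping everything in the Frequent Access tier, and reading 24 hourly prefixes in parallel gives a baseline of 132,000 GET/HEAD requests per second.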

Conclusion

In this post, we explored how the combination of Amazon S3 and LSEG’s TH Data and TH PCAP has revolutionized financial market data storage and accessibility. We examined how this cloud-based solution addresses the challenges of managing explosive growth in tick data volumes, demonstrating how a global investment bank achieved 80% TCO reduction while improving data quality and operational efficiency.

The key takeaways from this cloud-powered market data approach are compelling for financial services institutions of all sizes. Shifting from capital-intensive on-premises infrastructure to cloud-based storage allows organizations to achieve streamlined operations, enhanced data quality, future-proof scalability, and predictable costs. Most importantly, this transformation allows financial institutions to refocus their technical resources on core business activities and innovation rather than infrastructure management.

Ready to transform your market data operations? We encourage you to explore how Amazon S3 and specialized financial data services can modernize your approach to market data. Use our Calculator to estimate your potential savings, schedule a free assessment, or contact the AWS Financial Services team to learn more about implementing a similar solution for your organization.

Rohit Singh

Rohit Singh is an enterprise solutions architect with 8 years in the Amazon Web Services (AWS) financial services vertical. In his current role, he uses his deep knowledge of AWS services and the financial industry to design and implement solutions for enterprise clients, with a focus on security, scalability, and cost optimization. Rohit helps companies migrate legacy systems to the cloud and build innovative cloud-native applications using containers and microservices.

Feng Xia

Feng Xia is a Director of Software Engineering currently leading the development of the Tick History platform. He has over 18 years of experience in both real-time and historical market data applications, and his background spans the design, implementation, and optimization of high-performance data systems that support critical financial services.

Laszlo Dus

Laszlo Dus is a Technical Director of Vehicle Mastering, Time Series and Tick History at LSEG. He has over 20 years’ experience working with real-time market data and associated technologies, building teams and products.