AWS for Industries

How the NYSE cut download times and built a cloud-native platform for our historical data products

by Anthony Zawadzki, Anand Pradhan, Vinil Bhandari, Mike Perna, and Alex Mirarchi

The New York Stock Exchange Group (NYSE Group) operates five equities exchanges and two options exchanges. These are purpose-built to meet the needs of corporate and ETF issuers and provide investors with greater choice in how they trade. The NYSE Group owns and operates the largest US equities exchange group by market share, The New York Stock Exchange, a symbol of capitalism at its best and the belief that free markets offer every individual the chance to benefit from success. The NYSE Group trades on average over 2.4 billion shares and processes over 330 billion messages a day.

A little over a year ago, we embarked on our data access and distribution cloud journey with the goal of making all of our non-real time data available to customers natively in the cloud. We saw that the data industry’s center of gravity – distribution, storage, and analysis – was moving to the cloud, and we felt that current and future customers would value a cloud-native way of accessing our historical data. To validate our thinking, we surveyed a diverse cross-section of customers representing the buy side, sell side, proprietary trading firms, and market data vendors. Approximately half were already using the cloud in their technology stack and had plans to expand usage. Of these firms, a large number were already AWS customers. Furthermore, 40% were either beginning to implement a cloud strategy or were strongly considering it for the near future.

[Figure: NYSE customer survey results]

This feedback, in which approximately 90% of the firms we surveyed were either using the cloud or about to, made it clear that there was demand for non-real time data products in the cloud, including our flagship Daily TAQ. In addition, we saw a cloud-based delivery platform as a solution to a critical pain point for customers: the long download times that many faced during high-volatility trading sessions. On these days, larger-than-usual files combined with large numbers of customers pulling files concurrently often created bottlenecks, causing average download times to nearly double. These delays were frustrating and slowed our customers’ time-to-value. Moreover, with trading volumes continuing to surge and high-volatility days becoming more frequent, we wanted to solve this problem.

Having made our decision to build a cloud-native platform, we needed the right cloud partner. Given AWS’ significant presence with our existing customer base, continuous innovation, and deep industry experience, we believed that collaborating with AWS would help meet our customers’ demands.

How we built the platform

In the summer of 2021, we started building. One of the first design considerations was choosing the region to host our new platform. Noting that nearly all of our cloud-enabled customers already had a presence in US-East-1, we decided to set up our infrastructure there and connect our AWS environment to our on-premises infrastructure through an Amazon Virtual Private Cloud (Amazon VPC) and AWS Direct Connect. Being in the same region as most of our customers would let us provide a higher-performing platform while enabling customers to take advantage of no-cost data transfers within the region.

For the data layer, we decided to use Amazon Simple Storage Service (Amazon S3). There were several reasons for this choice, including the high availability and easy data management that Amazon S3 provides in the form of versioning, encryption, and lifecycle management. All entitlement data from on-premises flows into the Amazon S3 bucket every 15 minutes. This triggers an AWS Lambda function, which reads the information and performs create, read, update, and delete (CRUD) operations to entitle users to their specific data packages. Based on each user’s data subscription, we use AWS Identity and Access Management (IAM) roles and policies to grant the appropriate access. This framework integrates with the NYSE online order process and billing systems, so we can flexibly permission content based on whatever product(s) and date range(s) a customer chooses to license, and quickly enable access to new datasets.
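To make the permissioning idea concrete, here is a minimal sketch, not NYSE's actual implementation, of how a customer's licensed products and date ranges could be turned into an IAM policy document. The bucket name, the `PRODUCT/YYYY/MM/` key layout, and the month-level granularity are all assumptions made for illustration; the post does not describe how the data is organized.

```python
from datetime import date

# Hypothetical bucket name, used only for this illustration.
BUCKET = "example-nyse-historical-data"


def month_prefixes(product, start, end):
    """Return one key prefix per calendar month between start and end (inclusive).

    Assumes objects are stored under PRODUCT/YYYY/MM/ (an invented layout).
    """
    prefixes = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        prefixes.append(f"{product}/{year:04d}/{month:02d}/")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return prefixes


def build_policy(entitlements):
    """Build an IAM policy document (as a dict) from a list of entitlements.

    Each entitlement is a dict with 'product', 'start', and 'end' keys,
    where 'start' and 'end' are datetime.date values.
    """
    prefixes = []
    for ent in entitlements:
        prefixes.extend(month_prefixes(ent["product"], ent["start"], ent["end"]))
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing only under the licensed prefixes.
                "Sid": "ListLicensedPrefixes",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{BUCKET}",
                "Condition": {
                    "StringLike": {"s3:prefix": [p + "*" for p in prefixes]}
                },
            },
            {
                # Allow downloading objects under the licensed prefixes.
                "Sid": "ReadLicensedObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": [f"arn:aws:s3:::{BUCKET}/{p}*" for p in prefixes],
            },
        ],
    }
```

For example, an entitlement for a product named `DAILY_TAQ` from November 2021 through January 2022 would yield three prefixes, and serializing the returned dict with `json.dumps` gives a document that could be attached to the customer's role. In this sketch, a license that starts or ends mid-month would be rounded out to whole months, so a real system would need a finer-grained layout or an extra check.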

On top of the Amazon S3-based access, we also offer data access over SFTP on AWS. To enable SFTP access, we use AWS Transfer Family, with AWS Secrets Manager storing the SSH keys provided by users. A Lambda function acting as the identity provider validates these secrets on each user access. Thanks to the scalability, security, and monitoring capabilities of AWS Transfer Family, download speeds were much faster than with our on-premises solution, while still providing the highest level of security.
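The identity-provider pattern described above can be sketched as follows. This is an illustrative outline rather than NYSE's code: the role ARN, bucket name, secret naming scheme (`sftp/<username>`), and the fields stored in each secret (`public_keys`, `home_prefix`) are assumptions. The sketch relies on the AWS Transfer Family convention that a Lambda identity provider returns the user's role, home directory, and public keys on success, and an empty response to deny access.

```python
import json

# Hypothetical values, for illustration only.
ROLE_ARN = "arn:aws:iam::123456789012:role/example-sftp-read-role"
BUCKET = "example-nyse-historical-data"


def fetch_secret(username):
    """Load a user's record from AWS Secrets Manager.

    The secret name 'sftp/<username>' and its JSON fields are assumptions.
    boto3 is imported lazily so the rest of the sketch runs without AWS access.
    """
    import boto3

    client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=f"sftp/{username}")
    return json.loads(response["SecretString"])


def lambda_handler(event, context, secret_loader=fetch_secret):
    """Custom identity provider for key-based SFTP logins.

    Returns an empty dict to deny access; otherwise returns the role,
    home directory, and public keys the service should check the login against.
    """
    username = event.get("username", "")
    if not username or event.get("password"):
        # This sketch supports key-based authentication only.
        return {}
    try:
        record = secret_loader(username)
    except Exception:
        # Unknown user or lookup failure: deny rather than leak details.
        return {}
    public_keys = record.get("public_keys", [])
    home_prefix = record.get("home_prefix")
    if not public_keys or not home_prefix:
        return {}
    return {
        "Role": ROLE_ARN,
        "HomeDirectory": f"/{BUCKET}/{home_prefix}",
        "PublicKeys": public_keys,
    }
```

Passing the secret lookup in as `secret_loader` keeps the handler easy to exercise locally with a stub, which is a design choice of this sketch rather than anything stated in the post.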

We decided to build out this SFTP layer because customer feedback indicated that some market participants have not yet fully embraced the cloud. With the flexibility of building in AWS, we could modernize our non-real time data storage and distribution while also meeting customer needs, regardless of the stage of their cloud journey. By building on AWS, we unified both Amazon S3 and SFTP distribution in one easy-to-use, cloud-based platform.

Lastly, to make sure that our platform is secure, all of the data is encrypted using AWS Key Management Service (AWS KMS). In addition, everything inside the VPC is locked down with private subnets, and none of the assets in the VPC have direct public access. To further mitigate the risk of malicious attempts to access the data, we apply rate limiting and IP allowlisting to data access.
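The post does not say where or how the rate limiting and IP allowlisting are enforced, so the following is only a generic sketch of the two checks. The CIDR ranges (taken from the reserved documentation address blocks), the limits, and the class and function names are all hypothetical.

```python
import ipaddress
from collections import deque

# Hypothetical allowlist using reserved documentation ranges.
ALLOWED_CIDRS = ["203.0.113.0/24", "198.51.100.17/32"]


def is_allowed(source_ip, cidrs=ALLOWED_CIDRS):
    """Return True if source_ip falls inside any allowlisted CIDR block."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)


class SlidingWindowLimiter:
    """Allow at most max_requests per key within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = {}  # key -> deque of timestamps of accepted requests

    def allow(self, key, now):
        """Record and accept a request at time `now`, or reject it if over the limit."""
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A caller would typically run `is_allowed` first and consult the limiter only for permitted addresses, passing a clock value such as `time.monotonic()` as `now`. The in-memory dictionary is enough to show the logic but would not be shared across separate processes or function invocations.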

[Figure: NYSE data center]

Since the rollout of Amazon S3 access in November 2021 and SFTP access in May 2022, customer adoption and feedback have been very positive. In just seven months, over 20% of NYSE Group’s historical data customers have adopted Amazon S3 access, surpassing our goal of 20% uptake in the first 12 months. In addition, over 25% of our Daily TAQ customers have made the switch from SFTP consumption to Amazon S3-based access in less than one year. These customers have also reported significantly faster and more consistent download times. In fact, with Amazon S3 access, data download times were cut by approximately 75% on average. The scalability of Amazon S3 transfers truly shines on high-volatility trading days, when files can be several standard deviations larger than average and downloads via the legacy platform could take up to eight hours. With Amazon S3, we saw an approximately 90% reduction in download times on high-volatility days. No matter the nature of the day’s trading activity, Amazon S3 access has proven to be consistent and fast. This ability to process raw NYSE data more quickly, and to apply a suite of AWS analytics tools to it, enables our customers to create value for their business and discover new insights faster.

Another benefit of our cloud-based distribution platform is that it gives our customers’ data managers a clearer line of sight for administering user access and tracking license compliance. Compared to legacy internet-based username and password access, the cloud enables data managers to centralize data storage and track who is accessing what content, as well as what they are doing with it. Greater internal transparency around use translates directly into better data access controls and a lower risk of being out of compliance.

In addition to addressing our customers’ data download and access management challenges, moving to the cloud has provided cost and time-to-market benefits for the NYSE. Deploying our non-real time product distribution in the cloud meant that we could move all of our historical and reference data products, which span 30 products going back over 30 years, in just two weeks. Furthermore, we’re realizing cost savings of nearly 70% by unifying NYSE market data storage and delivery on AWS.

We’re excited to build on this success and work to meet customer demands in the cloud by launching new and innovative offerings. In fact, we recently launched our first cloud-only data product, The NYSE Options Open-Close Volume Summary, and we’re assessing which services to build next. We’ve noticed a growing appetite for more flexible, on-demand access to content, which we’re actively considering. Looking forward, we’re interested in building out tools such as APIs, data lake access, or analytic capabilities that are easy to deploy on our data. And we’re looking forward to continued collaboration with AWS as we continue our cloud journey.

Anthony Zawadzki

Anthony Zawadzki is a Director of Product Management on the New York Stock Exchange’s Proprietary Data Products team. Anthony originally joined the NYSE in 2012 as a summer intern and has since held various product management roles at the firm, most recently overseeing cloud strategy and other strategic initiatives. Anthony is a graduate of Lafayette College where he studied Economics and Religious Studies.

Anand Pradhan

Anand Pradhan is a Senior Director of Technology at NYSE and heads two diverse technology departments that provide technology for NYSE’s Regulatory technology and the National Market System, which runs the market’s largest real-time market data consolidation system (OPRA). He also leads cloud computing initiatives for NYSE. Prior to his current tenure at NYSE, Anand was the CTO of a FinTech startup, held several previous roles at NYSE, and was a senior software specialist at Patni Computer Systems. Anand holds a degree in Engineering with honors from Sambalpur University.

Vinil Bhandari

Vinil Bhandari is the Director of Engineering at NYSE. In his current role, Vinil has spearheaded the migration of strategic systems and services to the cloud, including the launch of TAQ data delivery on AWS. Vinil is a software engineering leader with close to 20 years of experience building and leading highly effective engineering teams in diverse industries such as Healthcare, Banking, Finance, edTech, and eCommerce. He started his cloud journey by migrating Kaplan Test Prep’s on-prem systems to AWS in 2010 and has been using AWS for over a decade. Vinil completed his Bachelor of Science at NIT Durgapur, India.

Mike Perna

Mike Perna is a Principal Solutions Architect at Amazon Web Services (AWS), specializing in the Financial Services Industry. During his two years at AWS, Mike has worked with a range of Capital Markets participants, including traditional exchanges, broker-dealers, crypto firms, market data providers, hedge funds, and other fintech companies, helping them solve technology challenges. Mike brings 20 years of diverse experience working in finance and technology, including trading futures and options, running ATS and market-making platforms, and managing ultra-low latency colocation trading infrastructure.

Alex Mirarchi

Alex Mirarchi is a Capital Markets Industry Specialist at Amazon Web Services (AWS). Alex’s core focus is helping Global Exchanges and Infrastructure providers transform their businesses with AWS products and services. Alex joined AWS from Oracle Cloud Infrastructure (OCI), where he was responsible for strategy and structuring programs and strategic partnerships with software vendors. Prior to Oracle, Alex worked in capital markets, selling Asian Equities to Hedge Funds in London with HSBC and in New York with Macquarie Group. Alex also held front office roles in Fixed Income and Foreign Exchange, and started his career in HSBC Global Asset Management’s alternatives division in London. Alex holds an MBA from Columbia Business School.