For data and analytics providers who deliver high-performance data streaming to their clients, consistent, fast data is key to the business. For providers still running legacy systems, however, high ingest rates paired with the ongoing push toward near real-time processing and AI-driven analysis can lead to higher costs, slower processing, and even system failures that lose data, any of which can be detrimental to the business.
Over the past two years, our Element 84 team has worked hand in hand with a customer doing large-scale data analysis to modernize their system and protect them from this business risk. We transformed a largely manual analysis process into an AI-first, agentic solution built on an ingest pipeline of 40+ data sources with advanced analytics. This overarching implementation has been accompanied by a variety of infrastructure improvements, including shifting the client’s legacy data streaming system to a serverless, cloud-native implementation and integrating Element 84’s Natural Language Geocoding and geospatial data lake capabilities. The result has been transformative for the business: faster data processing, an agentic-first analytics system, and unlocked potential revenue.
In this case study, we share our experience responding to a data outage incident and how that response enabled a broader system shift that, through new AWS-native infrastructure improvements, ultimately increased available data and decreased latency for our client.
Understanding and responding to data ingest needs and errors
Earlier this year, the team uncovered a data interruption incident involving the client’s legacy system. During this incident, almost 40% of the potential data from a key data source was not processed by the system.
Ultimately, this interruption was triggered by a combination of the following variables (a minimal mitigation sketch follows the list):
- Message format changes from data providers
- Architectural misunderstandings leading to duplicate processing
- A Python bottleneck limiting processing throughput
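To make the first two failure modes concrete, here is a minimal sketch of the kind of defensive handling that tolerates message format drift and suppresses duplicate processing. The field names, the `seen_ids` store, and the `process` function are hypothetical placeholders, not the production schema or deduplication mechanism.

```python
import json

# Hypothetical sketch: tolerate provider format drift and suppress duplicates.
# Field names and the dedupe store are illustrative, not the production schema.

seen_ids: set[str] = set()  # in production this would be a durable store


def extract_message_id(raw: str) -> tuple[str, dict]:
    """Parse a provider message, accepting known field-name variants."""
    body = json.loads(raw)
    # Providers occasionally rename fields; check known variants rather
    # than assuming a single fixed schema.
    message_id = body.get("messageId") or body.get("id")
    if message_id is None:
        raise ValueError("message missing a recognizable ID field")
    return message_id, body


def handle(raw: str) -> None:
    message_id, body = extract_message_id(raw)
    if message_id in seen_ids:
        return  # duplicate delivery: skip instead of reprocessing
    seen_ids.add(message_id)
    process(body)  # downstream ETL, not shown


def process(body: dict) -> None:
    ...
```

Treating duplicate delivery as the normal case rather than the exception is what makes a pipeline like this safe to retry and scale out.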
Upon learning about the incident, the team immediately triaged the problem and deployed an emergency fix before proceeding with a comprehensive architecture review. We established three potential courses of action and settled on the most comprehensive option before developing an implementation plan.
Restoring data access was the immediate priority for all interested parties. We responded by implementing a new architecture that allowed the extract, transform, load (ETL) of data from the needed satellite sources to execute at scale. This solution worked well as a stopgap, but despite its functionality, its high operating cost meant that a longer-term solution was still necessary.
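As a rough illustration of what a scalable, queue-driven ETL stage can look like on AWS, the sketch below uses an SQS-triggered Lambda handler; the bucket name, key scheme, and `transform` logic are placeholder assumptions rather than the client’s actual pipeline.

```python
import json

import boto3

s3 = boto3.client("s3")
BUCKET = "example-etl-output"  # placeholder bucket name


def transform(record: dict) -> dict:
    """Placeholder transform: normalize a raw satellite message."""
    return {"source": record.get("source", "unknown"), "payload": record}


def handler(event, context):
    """AWS Lambda entry point for an SQS-triggered ETL stage.

    Each invocation receives a batch of queue records; because
    invocations run in parallel, throughput scales with queue depth
    instead of being capped by a single Python process.
    """
    for record in event["Records"]:
        item = transform(json.loads(record["body"]))
        s3.put_object(
            Bucket=BUCKET,
            Key=f"processed/{record['messageId']}.json",
            Body=json.dumps(item).encode("utf-8"),
        )
```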
Implementing a more comprehensive and responsive system
Because we reacted quickly, the Element 84 team was able to work alongside our client to implement a multi-stage response.
After data access was restored, we turned our attention to implementing a comprehensive cloud-native solution. Prioritizing cloud-native technologies let us work toward a lower-maintenance system that would also be scalable and durable over time. With the initial stopgap in place, we were able to pursue the option that provided maximum parallel processing capability, at a lower cost than operations had previously required.
Our final solution represents a true data lake architecture that can handle over 10,000 messages per second and easily accommodates multiple data providers. In addition to being cheaper than the temporary fix we implemented first, it provides a more scalable and maintainable long-term solution.
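To make the data lake idea concrete, the sketch below buffers incoming messages and writes them to S3 under provider- and date-partitioned prefixes, one common way to keep a multi-provider lake queryable at high message rates; the prefix scheme and bucket are illustrative assumptions, not the production layout.

```python
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "example-data-lake"  # placeholder bucket name


def partition_key(provider: str, ts: datetime, batch_id: str) -> str:
    """Build a provider/date-partitioned S3 key so downstream query
    engines can prune by provider and day."""
    return (
        f"raw/provider={provider}/"
        f"year={ts.year}/month={ts.month:02d}/day={ts.day:02d}/"
        f"{batch_id}.ndjson"
    )


def write_batch(provider: str, messages: list[dict], batch_id: str) -> None:
    """Write a batch of messages as newline-delimited JSON.

    Batching many messages into each object keeps S3 request counts
    (and cost) low even at tens of thousands of messages per second.
    """
    body = "\n".join(json.dumps(m) for m in messages)
    key = partition_key(provider, datetime.now(timezone.utc), batch_id)
    s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))
```

Partitioned layouts like this are one reason a data lake can stay cheap and queryable as providers are added: query engines can skip irrelevant prefixes entirely.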
Results: Near real-time data processing
After deploying the new architecture and Lambda solution, the system could process over twice as many messages per second as before, in a more cost-effective manner. This allows agents acting on behalf of an LLM to operate as close to real time as possible.
Through this work, we substantially boosted our client’s data throughput while saving them significant money annually in operational costs and unlocking potential revenue.
Beyond operational efficiency and cost savings, the new solution opens the door to a more diverse set of vendors, including those supplying data outside the SpatioTemporal Asset Catalog (STAC) specification. With the new system, additional data vendors can be onboarded while maintaining the same high throughput.
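To illustrate what onboarding a non-STAC vendor can involve, the sketch below maps a hypothetical vendor metadata record into a STAC Item using the open source pystac library; the vendor field names (`scene_id`, `footprint`, `captured_at`) are invented for this example.

```python
from datetime import datetime, timezone

import pystac  # pip install pystac


def vendor_record_to_stac(record: dict) -> pystac.Item:
    """Map a hypothetical non-STAC vendor record into a STAC Item.

    Assumes the vendor supplies a GeoJSON Polygon footprint and an
    ISO 8601 capture timestamp; each real vendor needs its own small
    mapping like this one.
    """
    geometry = record["footprint"]  # GeoJSON Polygon from the vendor
    lons = [c[0] for ring in geometry["coordinates"] for c in ring]
    lats = [c[1] for ring in geometry["coordinates"] for c in ring]

    captured = datetime.fromisoformat(record["captured_at"])
    if captured.tzinfo is None:
        captured = captured.replace(tzinfo=timezone.utc)  # assume UTC if unlabeled

    return pystac.Item(
        id=record["scene_id"],
        geometry=geometry,
        bbox=[min(lons), min(lats), max(lons), max(lats)],
        datetime=captured,
        properties={"vendor": record.get("vendor", "unknown")},
    )
```

Once vendor records are normalized into STAC Items, the rest of the pipeline can treat every provider identically, which is what keeps adding a new vendor cheap.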
This case study reflects an in-progress Element 84 project, and we look forward to sharing more about this work as it becomes available for publication. In the meantime, if you have questions or would like to chat with our team about a similar project at your organization, find us any time on our contact page.
