Building an Open Data Lake House Using Trino and Apache Iceberg

Presented by

Matt Fuller, Co-Founder & VP, Product | Starburst
Tom Nats, Director, Customer Solutions | Starburst

About this talk

As companies build out their data analytics practice, they quickly outgrow running analytics on the operational store that powers their applications. Adding a read replica only buys time until growing internal and customer demand pushes it to its scalability limits. At that point they reach a crossroads: go all in on a cloud data warehouse, or choose an open data lake house approach that future-proofs them for scale, performance, and cost efficiency.

In this workshop, Matt Fuller and Tom Nats walk you through building and managing an open data lake house architecture with open-source technologies such as Trino and Apache Iceberg to support your growing analytics. Trino is an open-source, highly parallel, distributed query engine, originally created at Facebook as Presto and built from the ground up for efficient, low-latency analytics. Iceberg is an open-source, high-performance table format that enables an engine like Trino to run data warehousing SQL operations such as UPDATE, DELETE, and MERGE on the data lake house. Matt and Tom will also show how to combine these technologies to perform near-real-time analytics, pairing streaming ingestion with database functionality on the lake house.

This workshop uses the Starburst Galaxy SaaS product, which makes it simple to use these technologies for your modern data lake house without having to worry about the operational aspects of running Trino and other software.
Starburst’s mission is to free our customers to see the invisible and achieve the impossible. Join us for high value content, insightful conversations, and the constant opportunity to learn.