Building an Open Data Lake House Using Trino and Apache Iceberg
Matt Fuller, Co-Founder & VP, Product | Starburst & Tom Nats, Director, Customer Solutions | Starburst
About this talk
As companies build out their data analytics practice, they quickly outgrow running analytics off the operational store that powers their applications. Building a read replica only buys time until growing internal and customer demand pushes it to its scalability limits. At that point they reach a crossroads: go all in on a cloud data warehouse, or choose an open data lake house approach that future-proofs them for scale, performance, and cost efficiency. In this workshop, Matt Fuller and Tom Nats show how to build and manage an open data lake house architecture using open-source technologies such as Trino and Apache Iceberg to support growing analytics needs. Trino is an open-source, highly parallel, distributed query engine, originally created at Facebook (as Presto) and built from the ground up for efficient, low-latency analytics. Iceberg is an open-source, high-performance table format that enables an engine like Trino to run data warehousing SQL operations such as UPDATE, DELETE, and MERGE on the data lake house. Matt and Tom also show how to combine these technologies to perform near real-time analytics, pairing streaming ingestion with database functionality on the lake house. The workshop uses the Starburst Galaxy SaaS product, which makes it simple to use these technologies for a modern data lake house without having to worry about the operational aspects of running Trino and other software.