Modernizing Data Lakes with Apache Iceberg on Backblaze B2

Presented by

Pat Patterson, Chief Technical Evangelist, Backblaze

About this talk

In the evolving landscape of big data, efficient and scalable data lake management is paramount. Traditional table formats like Apache Hive have served well but come with limitations in schema evolution, partitioning flexibility, and transaction support. Apache Iceberg has emerged as a robust solution, addressing these challenges with features like ACID transactions, hidden partitioning, and time travel.

This webinar delves into the integration of Apache Iceberg with Backblaze B2 Cloud Storage. You will learn how Iceberg's architecture optimizes query performance and data management when paired with Backblaze B2. The session will include practical demonstrations using the Drive Stats data set, showcasing query executions through platforms such as Snowflake, Trino, and DuckDB.

Key Takeaways:

* Understand the limitations of traditional table formats and how Apache Iceberg overcomes them.
* Learn how to leverage Backblaze B2's scalable storage for efficient data lake operations.
* Gain insights into real-world applications and query optimizations using Iceberg with Backblaze B2.

Join us to discover how combining Apache Iceberg with Backblaze B2 can revolutionize your data lake strategy, offering enhanced performance, flexibility, and cost savings.

By registering for this event, you agree to receive more information about Backblaze and possibly other partnership products and services. If you decide you are not interested in receiving these communications, you can opt out at any time by clicking the “Unsubscribe” link provided in the email.
Backblaze

20,787 subscribers, 122 talks
Cloud Storage and Cloud Backup
Backblaze is a leading independent cloud provider that makes it astonishingly easy for people to store, protect, and use their data. Its B2 Cloud Storage platform offers always-hot, S3-compatible object storage that is readily available through APIs, a CLI, a web UI, and third-party integrations, supporting a wide range of workflows and fitting hybrid cloud, multi-cloud, and other IaaS strategies. Its computer backup solutions offer automatic data protection for business fleets and home Macs and PCs. We're here on BrightTalk to unpack product and solution updates, discuss best practices around data storage and protection, and openly share hard drive stats based on our experience spinning more than 200,000 drives. Thanks for joining us.