Best of 2019 - Mastering Data Governance on Cloud Data Lakes

Presented by

Dhiraj Sehgal, Director of Product Marketing & Akil Murali, Director of Product Management, Security and Governance at Qubole

About this talk

As more organizations run ETL workloads, analytics, and machine learning on data residing in data lakes, there are inherent privacy and integrity risks that must be addressed. How, then, should organizations preserve privacy and control access to this data as required by regulations such as GDPR and CCPA? While most organizations have put some data governance measures in place for their data lakes, current high-level, file-level security measures and accepted best practices are not sufficient to meet data privacy and integrity requirements.

In this webinar, Qubole data privacy and integrity experts will cover:
- Maintaining data integrity and keeping sensitive information safe irrespective of the open-source engine in use
- Providing granular data access controls and the ability to mask data with Apache Ranger
- Avoiding lost updates, dirty reads, and stale reads, and enforcing app-specific integrity constraints
- Complying with the “right to be forgotten” and “right to be erased” by ensuring that data in the data lake is current and deleted when necessary
- A demo of Qubole’s built-in Apache Ranger and ACID support for data privacy and integrity

More from this channel

Tune in to hear open data lake platform leaders and engineers discuss continuous data engineering on data lakes for machine learning, streaming analytics, ad hoc analytics, and data exploration in the cloud. The interactive talks are designed for data engineers, data analysts, and data scientists who want to learn about the challenges and solutions for use cases seen in data-driven organizations. Learn more about Qubole: http://bit.ly/AboutQubole