Hadoop has the ability to store and provide unlimited data to the business to uncover new insight. But as data in Hadoop grows, so does the potential for information risk. Waterline Data Inventory allows you to automatically discover and tag fields, curate those tags, and inspect data quality at a glance in a scalable, agile, and automated way.
Hadoop has the ability to store and provide unlimited data to the business to uncover new insight. However, making it easy for the business to quickly find the data and use it securely and in compliance is hard. Waterline Data Inventory turns data in Hadoop into a shared service for the enterprise, letting you profile the entire data lake automatically, discover and manage metadata, and find, understand, and provision that data securely.
The current process to inventory data in Hadoop is slow and unreliable. But with Waterline Data Inventory, you can automate the process leading up to data cleansing and formatting. Automating the foundational steps of the process improves time-to-value by weeks or even months while keeping your data above the waterline.
Ingesting raw data into Hadoop is easy, but extracting business value with exploration tools is not. Hadoop is a file system without a data model, data quality, or data governance, making it difficult to find, understand, and govern data.
In this webinar, Tony Baer, Principal Analyst of Ovum Research, will address the gaps and offer best practices in the end-to-end process of discovering, wrangling, and governing data in a data lake. Tony Baer will be followed by Oliver Claude, who will explain how Waterline Data Inventory automates the discovery of technical, business, and compliance metadata, and provides a solution to find, understand, and govern data.
Attend this webinar if you are:
--A big data architect who wants to inventory all data assets at the field level automatically while providing secure self-service to business users
--A data engineer or data scientist who wants to accelerate data prep by finding and understanding the best suited and most trusted data
--A Chief Data Officer or data steward who wants to be able to audit data lineage, protect sensitive data, and identify compliance issues
Sunil Soares, Information Asset LLC; Joe DosSantos, EMC Consulting; Jay Zaidi, Fannie Mae
In this panel discussion, moderated by Dr. Barry Devlin of 9sight Consulting and presented at Strata New York 2014, Sunil Soares (formerly of IBM and now principal of Information Asset, LLC), Joseph DosSantos of EMC Consulting, and Jay Zaidi of Fannie Mae discuss the following:
-- How does Big Data governance differ from traditional data governance?
-- What are the key methodologies and processes that apply to Big Data governance?
-- How ready are the Big Data vendors and markets for Big Data governance?
They also share examples of worthwhile and successful Big Data governance projects.
Suresh Srinivas, Hortonworks; Mike Sutten, Kaiser Permanente; Clark Farrey, Capital One; Sunil Soares, Information Asset LLC
In this panel discussion, moderated by Alex Gorelik of Waterline Data Science and presented at Strata New York 2014, Suresh Srinivas of Hortonworks, Mike Sutten of Kaiser Permanente, Clark Farrey of Capital One, John Mount of WinVector LLC, and Sunil Soares (formerly of IBM and now principal of Information Asset, LLC) discuss the following:
-- How do you get business value out of a data lake?
-- The importance of getting critical data quickly from the data lake to the user
-- The need for a catalog or inventory for a data lake
-- The role of automation and crowdsourcing metadata in managing data lakes
Waterline Data Inventory builds a complete inventory of Hadoop data, automatically and securely. It then provides business metadata-driven, multi-faceted search and logical folders that put the right data at your fingertips. Waterline Data Inventory learns from all the data in Hadoop and discovers lineage, business metadata, and data quality metrics to give you a complete view of the data at a glance. Waterline also discovers sensitive data and intermediate files, and enables data stewards to manage tags.