Data Fabric: A New Paradigm For Self-Service Data & Data Scientists

Presented by

Kelly Stirman, VP Strategy, Dremio

About this talk

Data scientists are rare and highly valued, and for good reason: making sense of data and using machine learning libraries require an unusual blend of advanced skills. Why is it, then, that data scientists spend the majority of their time getting data ready for models and only a fraction of it doing the high-value work? In this talk we introduce the concept of Data Fabric, a new way to provide a self-service model for data, in which data scientists can easily discover, curate, share, and accelerate data analysis using Python, R, and visualization tools, no matter where the data is managed, how it is structured, or how large it is. We will talk through the role of Apache Arrow, the in-memory columnar data standard that is accelerating analytics for GPU-based processing, as well as the role of pandas and Arrow in providing unprecedented speed in accessing datasets from Python.
