Data Fabric: A New Paradigm For Self-Service Data & Data Scientists

Presented by

Kelly Stirman, VP Strategy, Dremio

About this talk

Data scientists are rare and highly valued, and for good reason: making sense of data and using machine learning libraries requires an unusual blend of advanced skills. Why, then, do data scientists spend the majority of their time getting data ready for models and only a fraction of it doing the high-value work? In this talk we introduce the concept of Data Fabric, a new way to provide a self-service model for data, in which data scientists can easily discover, curate, share, and accelerate data analysis using Python, R, and visualization tools, no matter where the data is managed, how it is structured, or how large it is. We will talk through the role of Apache Arrow, the in-memory columnar data standard that is accelerating analytics for GPU-based processing, as well as the role of Pandas and Arrow in speeding up access to datasets from Python.
