Seeing the Forest and the Trees: Data Discovery in Billion Row datasets

Presented by

Jim Bednar - Director, Technical Services, Anaconda

About this talk

When exploring and visualizing large datasets, you usually face a tough choice between sparsely sampling your data and aggregating it. In this talk, we'll show how to use Anaconda technologies to display all of your data, revealing both trends and outliers, and letting you explore even the largest datasets interactively in a web browser. We'll discuss:
- How Datashader can render your entire dataset faithfully in milliseconds or seconds, with both trends and outliers easily visible.
- How pairing Datashader with the Python-based interactive tools Bokeh, Panel, and HoloViews lets you work interactively with your data, selecting individual data points and local groups for further inspection and exploration.
- How to make sense of incoming datasets and make them actionable without getting distracted by tuning plotting parameters or missing out on important insights.
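
To make the idea of showing every point rather than a sample more concrete, here is a minimal conceptual sketch in plain NumPy. It is not Datashader's implementation or API; the function names `rasterize_counts` and `shade_log`, the grid size, and the synthetic data are all assumptions made for illustration. It bins every point into a fixed pixel grid, then applies a log scale so that pixels holding a single outlier remain visible next to pixels holding hundreds of points.

```python
import numpy as np


def rasterize_counts(x, y, width=300, height=200):
    """Bin every point into a height x width grid of per-pixel counts."""
    counts, _, _ = np.histogram2d(
        y, x,  # rows follow y, columns follow x
        bins=(height, width),
        range=((y.min(), y.max()), (x.min(), x.max())),
    )
    return counts


def shade_log(counts):
    """Map counts to 0..1 intensities on a log scale (hypothetical helper)."""
    scaled = np.log1p(counts)
    peak = scaled.max()
    return scaled / peak if peak > 0 else scaled


# Synthetic data: one million points in a dense cluster plus two far outliers.
rng = np.random.default_rng(0)
n = 1_000_000
x = np.concatenate([rng.normal(0.0, 1.0, n), [8.0, -8.0]])
y = np.concatenate([rng.normal(0.0, 1.0, n), [8.0, -8.0]])

counts = rasterize_counts(x, y)   # no point is dropped; each lands in one pixel
image = shade_log(counts)         # 2-D array of intensities, ready to display

# The outlier at (8, 8) occupies the last row and column of the grid.
print("points aggregated:", int(counts.sum()))
print("outlier pixel intensity:", image[-1, -1])
```

Because the output is always a grid of fixed size, the cost of displaying it does not grow with the number of rows, which is the general reason aggregation-based rendering can stay interactive in a browser. The log scale is only one possible choice of intensity mapping.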
With more than 35 million users, Anaconda is the world’s most popular data science platform and the foundation of modern machine learning. We pioneered the use of Python for data science, champion its vibrant community, and continue to steward open-source projects that make tomorrow’s innovations possible. Our enterprise-grade solutions enable corporate, research, and academic institutions around the world to harness the power of open-source for competitive advantage, groundbreaking research, and a better world.