Joining Billions of Rows in Seconds: Replacing MongoDB and Hive with Scylla

Presented by

Alexys Jacob, CTO of Numberly

About this talk

Many organizations struggle to balance traditional big data infrastructure with NoSQL databases. Other organizations do the smart thing and consolidate the two. This presentation explores Numberly’s experience migrating an intensive, join-hungry production workload from MongoDB and Hive to Scylla. Join Alexys Jacob, CTO of Numberly, to learn how Numberly joined billions of rows in seconds and dramatically reduced operational and development complexity by using a single database for its hybrid analytical use case. As a bonus, Alexys will also cover benchmarks for Dask (a flexible parallel computing library for analytics) and Spark, highlighting their differences and the lessons learned along the way.

The Scylla NoSQL database embraces a shared-nothing approach that increases throughput and storage capacity by as much as 10X. It comes in open source, enterprise, and database-as-a-service options. Comcast, Discord, Grab, Medium, Starbucks, Ola Cabs, Samsung, IBM, Investing.com, Zillow, and many more leading companies have adopted Scylla to realize order-of-magnitude performance improvements and reduce hardware costs. For more information: ScyllaDB.com.