Baselines & Benchmarks – Making Open Source Big Data Analytics Easy

Presented by

Arjuna Chala, Sr. Director of Special Projects for HPCC Systems

About this talk

Bringing heterogeneous data into a homogeneous data warehouse environment is one of the most daunting aspects of any big data implementation. Although Apache Spark and HPCC Systems Thor can be thought of as complementary, there is interest in comparing their performance on data analytics benchmarks, specifically transformation, cleaning, normalization, and aggregation. Join us to hear how HPCC Systems Thor's performance compares to Apache Spark's under standard benchmarking methodologies. Learn how these benchmarks and HPCC Systems can help you establish new baselines that:
• Improve the speed and accuracy of the transformation, cleaning, normalization, and aggregation processes
• Enable efficient use of developer resources and development budgets
• Facilitate the use of standard hardware, operating systems, and protocols

HPCC Systems is an open source big data analytics solution for businesses of all sizes that helps them shorten critical time to results and decisions. Subscribe to our channel to keep informed of the latest HPCC Systems events.