Sai Devulapalli & Boni Bruno, Dell EMC and Jeff Schmitt, Hortonworks
Enterprises realize that data is the fuel data science teams crave, but data is growing at an explosive rate, with COLD data growing much faster than HOT. As data volumes grow beyond 75 TB, continuing to scale HDFS clusters using compute with local storage (DAS) becomes expensive and complex to manage. At some point you're forced to start archiving, or worse, purging, COLD data, making it difficult to access and potentially throwing away valuable insights.
Dell EMC understands this problem, and together with Hortonworks we've developed a new HDFS tiering capability that adds a "capacity-centric" data overlay on top of Hadoop DAS infrastructure. This integrated tier enables enterprises to continue using existing applications such as Hive, Spark, MapReduce, and others, while offering a more cost-effective, simple, and efficient way to scale data lakes from hundreds of terabytes into the petabyte range. The tiered solution gives the business access to ALL of its data, both HOT and COLD, for analytics ranging from business intelligence to machine learning. And most importantly, it allows specific business use cases to define the tiering rules for performance and scale.
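To illustrate the general mechanism, standard Apache HDFS already exposes storage policies that let administrators keep HOT data on fast local disks while directing COLD data to an archive tier; a capacity-centric tier is wired in at this layer so application paths never change. The commands below are a generic sketch against a stock Hadoop cluster (the `/data/clickstream/2016` path is hypothetical), not the Dell EMC Isilon-specific configuration:

```shell
# List the storage policies the cluster supports (HOT, WARM, COLD, etc.)
hdfs storagepolicies -listPolicies

# Tag an aging dataset as COLD so its replicas target ARCHIVE storage
hdfs storagepolicies -setStoragePolicy -path /data/clickstream/2016 -policy COLD

# Migrate existing blocks to match the newly assigned policy
hdfs mover -p /data/clickstream/2016

# Hive, Spark, and MapReduce jobs keep reading the same path;
# only the underlying storage tier changes
hdfs dfs -ls /data/clickstream/2016
```

Because tiering happens below the namespace, applications see one logical data lake regardless of which tier holds the blocks.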
In this webcast, we will discuss some of the challenges that enterprises encounter when the data volumes in their analytics platforms are no longer manageable, and the impact this has on the business. We'll detail:
1. How the Dell EMC Isilon tiered storage solution delivers the best cost-capacity-performance tradeoff for Hadoop deployments
2. How the solution works from a technical perspective
3. How customers can easily integrate the solution into their existing environment to derive more value from their data, drive deeper customer insights, improve operational efficiency, and accelerate time to market