Ivan Jibaja, Tech Lead, Pure Storage; Joshua Robinson, Founding Engineer, FlashBlade, Pure Storage
Learn the origin of big data applications, how new data pipelines require a new infrastructure toolset, and why both containers and shared storage are the fundamental infrastructure building blocks for future data pipelines.
We will first discuss the factors driving changes in the big-data ecosystem: ever-greater increases in the three Vs of data, namely volume, velocity, and variety. The data lake was originally conceived as a single location for all data, but in reality multiple pipelines and storage systems quickly lead to complex data silos. We then contrast legacy Hadoop applications, which are built only for volume, with the next generation of applications, like Spark and Kafka, which solve for all three Vs. Finally, we end with how to build infrastructure to support this new generation of applications, as well as applications not yet in existence.
About the Speakers:
Ivan Jibaja, Tech Lead, Pure Storage
Ivan Jibaja is currently a tech lead for the Big Data Analytics team inside Pure Engineering. Prior to this, he was part of the core development team that built the FlashBlade from the ground up. Ivan graduated with a PhD in Computer Science from the University of Texas at Austin, with a focus on systems and compilers.
Joshua Robinson, Founding Engineer, FlashBlade, Pure Storage
Joshua builds Pure's expertise in big data, advanced analytics, and AI. His focus is on organizing a cross-functional team, technical validation, performance benchmarking, solution architectures, collecting customer feedback, customer consultations, and company-wide training. Joshua specializes in several data analytics tools, including Hadoop, Spark, Elasticsearch, Kafka, and TensorFlow.