How Climate Grew Its Data Science Capabilities 10x in 2 Years

Presented by

Mir Yasir Ali - Sr. Staff Software Engineer, Climate Corporation / Kris Skrinak - Machine Learning Segment Lead, AWS

About this talk

Recorded at Rev 2 | May 23-24, 2019 | New York
Day 2 - Case Studies - Climate Corporation, AWS

The Climate Corporation provides a platform for farmers around the world to use best-in-class analytic capabilities to digitize their operations and optimize their profits. Come learn about Climate’s journey from a scrappy startup to a mature company and hear how we grew our data science capabilities 10x over 2 years, from supporting 20 data scientists to supporting 200.

When Climate was a new startup, data scientists worked on customized servers with a complex set of unstandardized libraries. Time and effort were lost to maintenance and overhead: researchers spent 50% of their time maintaining and customizing research environments. Sharing work between groups was also a significant challenge, and models and data were versioned manually, with a high risk of error.

We identified a need to standardize our environments, both to minimize the time spent configuring them and to simplify ongoing collaboration among data scientists. To meet this need, we developed a process and infrastructure in which hardware and software are tailored to a researcher’s needs based on their domain, and we built automation that applies this process by default. By automating the configuration of YARN, Spark, Docker, AWS and Domino, we standardized the infrastructure for research and discovery within Climate. This enabled the data science team to deliver models to production faster and at less than half the previous cost, helping farmers around the world increase crop yields and grow more food for all of us!

More from this channel

Today, the best-run companies run their business on models, and those that don’t face an existential threat. Welcome to "The Model Driven Business", a channel where we share use cases and best practices for organizations striving to make data science an organizational capability that drives business impact.