Best Practices for VerticaPy

Presented by

Badr Ouali, Vertica Head of Data Science

About this talk

Tune in to learn and adopt the most popular and helpful machine learning techniques in Vertica. Hear directly from the inventor of VerticaPy, Badr Ouali, whose Python library exposes scikit-learn-like functionality for running data science projects on data stored in Vertica. VerticaPy has many features that help you achieve high performance at scale without overloading your cluster. Topics covered include caching statistics to avoid recomputing them, sending queries one at a time to limit the load on your cluster, and sending multiple queries at the same time to gain performance.
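The three techniques named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not VerticaPy's own API: `run_query`, `cached_stat`, `run_sequentially`, `run_concurrently`, and the table name `t` are hypothetical names, and `run_query` is a stand-in that only records the SQL it receives instead of contacting a Vertica cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Hypothetical stand-in for a database call. A real project would send the
# SQL to Vertica here; this version only records it and returns a dummy value.
QUERY_LOG = []

def run_query(sql):
    QUERY_LOG.append(sql)
    return len(sql)  # dummy "result" so the example runs without a database

@lru_cache(maxsize=None)
def cached_stat(column, stat):
    """Compute a statistic once; repeated calls are served from the cache."""
    return run_query(f"SELECT {stat}({column}) FROM t")

def run_sequentially(queries):
    """Send queries one after another, so only one is in flight at a time."""
    return [run_query(q) for q in queries]

def run_concurrently(queries, max_workers=4):
    """Send several queries at once; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_query, queries))
```

Calling `cached_stat("price", "AVG")` twice issues a single query, because the second call is answered from the cache. The sequential and concurrent helpers return the same results in the same order; the trade-off is between keeping the load on the cluster low and finishing sooner.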

The Vertica Unified Analytics Platform is built to handle the most demanding analytic use cases and is trusted by thousands of leading data-driven enterprises around the world, including Etsy, Bank of America, Uber, and more. Based on a massively scalable architecture with a broad set of analytical functions spanning event and time series, pattern matching, geospatial, and built-in machine learning capabilities, Vertica enables data analytics teams to apply these functions to large and demanding analytical workloads. Vertica unites the major public clouds and on-premises data centers, and integrates data in cloud object storage and HDFS without forcing any data movement. Available as a SaaS option or as a customer-managed system, Vertica helps teams combine growing data silos for a more complete view of available data. Vertica features separation of compute and storage, so teams can spin up storage and compute resources as needed, then spin them down afterwards to reduce costs.