GraphFrames: DataFrame-based graphs for Apache® Spark™

Presented by

Joseph Bradley

About this talk

GraphFrames bring the power of Apache Spark DataFrames to interactive analytics on graphs. Expressive motif queries simplify pattern search in graphs, and DataFrame integration allows seamlessly mixing graph queries with Spark SQL and ML. By leveraging Catalyst and Tungsten, GraphFrames provide scalability and performance. Uniform language APIs expose the full functionality of GraphX to Java and Python users for the first time.

In this talk, the developers of the GraphFrames package will give an overview, a live demo, and a discussion of design decisions and future plans. This talk will be generally accessible, covering major improvements from GraphX and providing resources for getting started. A running example of analyzing flight delays will be used to explain the range of GraphFrame functionality: simple SQL and graph queries, motif finding, and powerful graph algorithms.

For experts, this talk will also include a few technical details on design decisions, the current implementation, and ongoing work on speed and performance optimizations.
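
To make the flight-delay example concrete, here is a minimal plain-Python sketch of what a simple edge filter and a round-trip motif compute. It does not use Spark or the GraphFrames API; the airports, delay values, and the helper names `delayed` and `round_trips` are all hypothetical and chosen only for illustration. In GraphFrames itself, a comparable pattern would be written as a motif string such as `(a)-[e]->(b); (b)-[e2]->(a)` and passed to the graph's `find` method.

```python
# Hypothetical toy data: flights are directed edges between airports,
# each with a delay in minutes. These values are illustrative only.
flights = [
    {"src": "SFO", "dst": "SEA", "delay": 12},
    {"src": "SEA", "dst": "SFO", "delay": -3},
    {"src": "SFO", "dst": "JFK", "delay": 45},
    {"src": "JFK", "dst": "SEA", "delay": 30},
]


def delayed(edges, min_delay):
    """Simple 'SQL-style' filter: edges whose delay exceeds min_delay."""
    return [e for e in edges if e["delay"] > min_delay]


def round_trips(edges):
    """Find pairs (e, e2) where e goes a -> b and e2 goes b -> a.

    This mimics the shape of a two-edge motif; each round trip is
    reported once per starting edge, so A<->B yields two matches.
    """
    # Index edges by source so candidate return legs are found quickly.
    by_src = {}
    for e in edges:
        by_src.setdefault(e["src"], []).append(e)

    matches = []
    for e in edges:
        for e2 in by_src.get(e["dst"], []):
            if e2["dst"] == e["src"]:
                matches.append((e, e2))
    return matches


if __name__ == "__main__":
    for e in delayed(flights, 20):
        print("delayed:", e["src"], "->", e["dst"], e["delay"])
    for e, e2 in round_trips(flights):
        print("round trip:", e["src"], "->", e["dst"], "->", e2["dst"])
```

The nested-loop search here is only meant to show the semantics of a pattern match over edges; it says nothing about how GraphFrames plans or executes motif queries.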