Interactive Visualization of Streaming Data Powered by Spark
Chief Technologist Ruhollah Farchtchi gives a presentation at Spark Summit East 2016 on the interactive visualization of streaming data powered by Spark.
Much of the discussion on real-time data today focuses on the machine processing of that data. But helping humans visualize real-time streams is just as important. Visualizing real-time data introduces new UX and usability challenges for any developer embedding analytics into applications, especially when the target end users are business users and not data scientists. Self-service, interactive, subsecond response time to ad hoc queries — these are the new UX requirements for any enterprise visualizing real-time data. Streaming data also lends itself to new paradigms of interaction with the stream itself, like being able to pause, rewind and replay a stream. This talk is a case study in how and why Zoomdata built a “Data DVR” capability using Spark and Spark Streaming. We will describe the required user experience, the overall architecture and the specific use of Spark and Spark Streaming. We will describe the design considerations that led us to choose Spark Streaming over alternatives like Storm. We will show how end users configure the real-time increment and a historical retention window without writing any code themselves. We will also show how pause, rewind, replay is implemented in Spark and how the solution supports both real-time and historical analysis in the same architecture. Attendees will walk away with knowledge of Spark Streaming and how users can interactively work with streaming data. They will develop familiarity with the challenges of a lambda architecture and providing a consistent analytic experience over streaming and historical data.
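The pause, rewind and replay interaction over a retention window can be illustrated with a small in-memory sketch. This is not Zoomdata's implementation and it does not use Spark. The class `StreamDvr`, its method names and the `retention_seconds` and `increment_seconds` parameters are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from bisect import bisect_left
from collections import deque


class StreamDvr:
    """Hypothetical in-memory 'DVR' over timestamped events (illustration only)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.events = deque()   # (timestamp, value) pairs, oldest first
        self.paused_at = None   # None means the viewer is watching live

    def append(self, timestamp, value):
        """Record an event and evict anything older than the retention window."""
        self.events.append((timestamp, value))
        cutoff = timestamp - self.retention_seconds
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()

    def pause(self, timestamp):
        """Freeze the viewer's position; ingestion can continue via append()."""
        self.paused_at = timestamp

    def rewind(self, seconds):
        """Move the frozen position back, but never past the oldest retained event."""
        if self.paused_at is None:
            raise RuntimeError("pause before rewinding")
        oldest = self.events[0][0] if self.events else self.paused_at
        self.paused_at = max(oldest, self.paused_at - seconds)

    def replay(self, increment_seconds):
        """Yield (window_start, values) batches from the paused position to the newest event."""
        if self.paused_at is None:
            raise RuntimeError("pause before replaying")
        snapshot = list(self.events)
        timestamps = [ts for ts, _ in snapshot]
        start = self.paused_at
        end = timestamps[-1] if timestamps else start
        while start <= end:
            lo = bisect_left(timestamps, start)
            hi = bisect_left(timestamps, start + increment_seconds)
            yield start, [value for _, value in snapshot[lo:hi]]
            start += increment_seconds
        self.paused_at = None   # caught up, so return to live mode
```

In this sketch the retention window bounds how far back a viewer can rewind, and the replay increment plays the role of the real-time increment described above.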
Recorded Mar 31, 2016 | 29 mins
Exploring big data, live data, and unstructured text fields is exciting to a business analyst, but can generate significant anxiety for the data protection officer. Learn how to open up data to those who need it and protect what needs protecting.
Learn how everything we offer is available for your organization to embed, customize, extend, and white label as your own. We know it's Zoomdata under the covers. But your customers don't need to know.
You're in for a treat when you first learn about Search Data. Think of it as a mashup of Google and your traditional BI tool. You could use this technology to learn more from customer surveys, performance reviews, or anything else that has an open text field.
Ruhollah Farchtchi, CTO, Zoomdata & Jake Flomenberg, Partner, Accel Partners
Silicon Valley venture capitalist Jake Flomenberg gets to see, track and make investments in the evolution of big-picture technology trends across areas such as big data analytics, machine learning and artificial intelligence, and emerging modern data platforms and data types that are enabling organizations to be data- and analytics-driven. Data-driven companies make more effective, information-backed decisions and tend to significantly outcompete their peers operating on guesses and gut feel. Some companies are now even selling ‘data products’. For example, aircraft engine manufacturers track and analyze huge volumes of engine performance data to enable predictive maintenance, fixing things before failure to avoid costs and negative impact on customers.
Jake will discuss trends he’s seeing related to the data analytics market, such as the growing need for self-service data discovery on very large volumes of data, analytics on streams of data, analytics on unstructured data, and contextual analytics embedded inside of software vendor and enterprise applications. He’ll discuss the characteristics of these markets, where he sees these markets going in the future, and why his firm chose to invest in a company like Zoomdata.
Zoomdata CTO Ruhollah Farchtchi will then discuss industry trends he has forecast for 2018, including the eclipsing of relational databases by modern data platforms for doing analytics, how the cloud has changed the game for application development, and how working with streaming data is becoming the new normal. He’ll also discuss trends that are more specifically relevant to Zoomdata, such as how to leverage the value of a company’s investments in modern, elastically scalable back-end data infrastructure and how to get corresponding value on the front end, such as the ability to analyze and get insights from huge volumes of data, streaming data, and unstructured data types.
Understanding live data that streams at you constantly is extremely difficult and can be enormously expensive if you don't know how to do it. We'll show you how we do it, and show off our Data DVR so you never miss a byte.
Today's most successful companies are data-driven, making informed decisions instead of guesses. Traditionally, data-driven insights have been delivered in the form of standalone business intelligence applications, but in the past few years we’ve seen a big shift towards delivering contextual analytics embedded directly into other software applications and business workflows. For software vendors this means meeting high end-user expectations by infusing compelling data visualizations and analytics into every application you create and sell. Modern data visualization and analytics applications, therefore, need to be designed from the ground up to be quick and easy to embed for a web and mobile-first world. They must also be capable of handling today's rapidly evolving modern data platforms so that your analytics technology does not lock you into using yesterday's
If you're a CTO, product manager, or software engineering leader, you may be curious about whether to build or buy an integrated analytics solution, how to evaluate the available offerings for embedded analytics, and how to most efficiently embed data visualization and analytics into your application. In this webcast, we will cover the following topics:
● Assessing the current state of your application's integrated analytics
● Approaching the build vs. buy decision
● Styles of embedding, from light to deep
● Considerations for deployment flexibility
● Security integration considerations
● Avoiding data platform lock-in
Howard Dresner, President, Dresner Advisory Services
Watch this video for information about the importance of big data distributions.
There are four dominant Hadoop distributions: Cloudera, Hortonworks, Amazon, and MapR. All four are gaining increased interest as the level of big data adoption grows. The market leader right now is Cloudera although high tech prefers Amazon as do smaller organizations -- up to 1,000 employees.
Content and Images Source: Dresner Advisory Services Big Data Analytics Market Study; Copyright 2017 -- Dresner Advisory Services
In this video, we explore the best way to turn data about your products and services into a valuable product on its own.
As Edd Wilder-James once said, “Data products are the reason data scientists are lately treated like rock stars.” Data products operationalize analytical insight. And that’s what monetizing data is all about. The automobile is a good example of the potential of monetizing data. It illustrates that data about a product can be just as valuable -- if not more valuable -- than the product itself for generating revenue and increasing customer loyalty.
This video explores how applications connect to data sources and what that means to an embedded application.
Database technologies like Structured Query Language (SQL) and Open Database Connectivity (ODBC) have been around a long time: SQL since the early 1970s and ODBC since the early 1990s. And for as long as people have been querying data, reducing the length of time it takes to get answers back -- query latency -- has been a problem. As databases have changed and new types have emerged, solving the problem has become even more complex. Custom data source connectors are a solution.
In this video, you’ll learn why businesses of all sizes are investing so heavily in big data.
Big data and big data analytics are big business. IDC projects the market to grow to $50 billion by 2019. So what do organizations that invest in big data expect to achieve? Is there still a role for intuition in decision-making? Essentially, businesses pursue three objectives with big data: understanding the past, improving the present, and predicting the future. We’re also increasingly surrounded by data-driven smart systems that are reshaping the way we work and the economy we work in.
Watch this video to find out where most organizations are right now in terms of data monetization maturity -- and how to move past that.
A five-stage model describes the path to data monetization maturity. But did you know that although many organizations have invested heavily in data and analytics since the 1990s, most have not moved beyond the first stage of the model -- distributing analytics internally? The next four stages chart the development of analytics from a cost center to a profit center. And each stage has its own requirements.
In this video, you’ll find out how iFrames extend the customization of embedded analytics beyond what’s possible with white labeling.
There are a lot of cases when white labeling isn't sufficient, especially when you want to have data analytics alongside other functionality. A good example would be a customer portal where you have several columns with different widgets. You might have a news feed, a weather map, and other features plus analytics tools.
One way to do this would be to embed an analytics web application into the portal via an iFrame. Of course, like white labeling, iFrames have limits.
This video goes beyond customization via white labeling and iFrames to the use of a software development kit (SDK).
Using iFrames and white labeling for customization, there's always a tradeoff between security, interconnectivity, and the integrity of the user experience. You have more control and more connectivity with an SDK. You can build a truly custom application without starting from scratch. A robust SDK should include the ability to embed out-of-the-box charts and new charts as well as their accompanying data and metadata. It should also allow you to embed just data that can feed pre-built visualizations and, very important, integrate using REST APIs.
Watch this video to learn how using REST APIs alongside an SDK can multiply the power of both.
An API is to software code what a UI is to users: it helps different pieces of software interact with each other, and it can help you turn a data application into a data platform. If you have access to them, REST APIs let you do a lot that wasn’t built into a particular SDK. For example, you could accept user inputs and pass them to the platform to be stored, or acquire data and metadata from the application or from users. This can be very powerful when building portals.
Watch this video to learn how administration and automation via REST APIs makes life easier when a SaaS company or large enterprise is embedding a BI platform for use by tens or hundreds of thousands of users.
In that scenario, you want to look for ways to automate the provisioning of new users and groups. So, it's important for a BI platform to offer administrative APIs that can be scripted from your application. These are usually REST APIs, and they really help ease the administrative load when users want to sign up for the embedded service from the parent application.
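As a rough sketch of what scripted provisioning might look like, the snippet below builds (but does not send) the HTTP requests a parent application could issue when a new group of users signs up. The base URL, the `/groups` and `/users` paths, the payload fields and the bearer-token header are all hypothetical and do not describe any particular BI platform's API.

```python
import json
from urllib.request import Request

# Hypothetical base URL; a real platform documents its own endpoints.
BASE_URL = "https://bi.example.com/api"


def build_provisioning_requests(group, usernames, token):
    """Return the POST requests that would create one group and its users."""
    headers = {
        "Authorization": f"Bearer {token}",   # assumed auth scheme
        "Content-Type": "application/json",
    }

    def post(path, payload):
        body = json.dumps(payload).encode("utf-8")
        return Request(BASE_URL + path, data=body, headers=headers, method="POST")

    # Create the group first, then one request per user that joins it.
    requests = [post("/groups", {"name": group})]
    requests += [post("/users", {"name": name, "groups": [group]}) for name in usernames]
    return requests
```

A parent application could pass each request to `urllib.request.urlopen` from its own sign-up workflow, so that no administrator has to create accounts by hand.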
This video recaps why embedded applications are like icebergs -- a lot happens under the surface.
Visualizations are what we see from an embedded application. That’s the part of the iceberg that’s above water. But under the hood, you have to make sure that the embedded BI platform will work with the parent application's platform and its development environment. You have to be able to deploy it on the same kind of infrastructure. And it has to work with the parent application’s security model.
Watch this video to learn the uses and limits of customizing embedded analytics through white labeling.
When a third-party software application is integral to the way a business delivers products or services, many organizations want that software to look like it's homegrown. White labeling is a way to do that relatively simply, and it’s sufficient in many situations. A lot of cosmetic fine-tuning can be done with logos, color palettes, fonts, icons, and background images. In combination, these changes can make embedded analytics blend well with the parent application.
In this video, we cover the second “A”: authorization, which refers to defining and enforcing privileges and permissions for a user.
There are two common methods for authorizing users: role-based access control (RBAC) and the access control list (ACL). In the first, a user is defined as a member of a group -- say finance administration -- and the group as a whole is assigned permissions. Another group in finance -- finance accounts payable -- could be assigned a different level of permissions. ACLs provide a finer-grained level of control. For embedding purposes, users, groups, and roles should be defined by the parent application.
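The difference between the two methods can be sketched in a few lines. The group names follow the example above. The users, the `q3-forecast` dashboard, the permission names and the rule that either mechanism can grant access are assumptions made for illustration only.

```python
# Role-based: permissions belong to groups; users inherit them through membership.
GROUP_PERMISSIONS = {
    "finance-administration": {"read", "write", "share"},
    "finance-accounts-payable": {"read"},
}
USER_GROUPS = {
    "ana": {"finance-administration"},
    "raj": {"finance-accounts-payable"},
}

# ACL: permissions are attached to one specific resource, per user.
DASHBOARD_ACLS = {
    "q3-forecast": {"raj": {"read", "write"}},
}


def rbac_allows(user, permission):
    """True if any of the user's groups carries the permission."""
    groups = USER_GROUPS.get(user, set())
    return any(permission in GROUP_PERMISSIONS.get(g, set()) for g in groups)


def acl_allows(user, resource, permission):
    """True if the resource's ACL grants this user the permission."""
    return permission in DASHBOARD_ACLS.get(resource, {}).get(user, set())


def is_allowed(user, resource, permission):
    """Assumed policy: access is granted if either RBAC or the ACL grants it."""
    return rbac_allows(user, permission) or acl_allows(user, resource, permission)
```

Here `raj` can only read by virtue of his group, but the ACL on one dashboard gives him write access to that resource alone, which is the finer-grained control ACLs provide.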
Changing the way people see and interact with data
Zoomdata develops the world’s fastest visual analytics solution for big data. Using patented Data Sharpening™ and micro-query technologies, Zoomdata empowers business users to visually consume data in seconds, even across billions of rows of data. Zoomdata Fusion enables interactive analysis across disparate data sources, bridging modern and legacy data architectures, blending real-time streams and historical data, and unifying enterprise data with data in the cloud. Delivered in a micro-services architecture for elastic scalability, Zoomdata runs on premises, in the cloud or embedded in an application.
Subscribe to this channel to learn best practices and emerging trends in data visualization and analytics.