Join us for this next segment of “Under the Hood” that focuses on the database designer feature of HPE Vertica.
Learn how the schema designs created by Database Designer provide optimal query performance for your most challenging analytic workloads. Database Designer uses smart strategies to create efficient schema designs that can be deployed, changed and re-deployed by almost anyone, even those without advanced database knowledge.
Earlier this year, the open source community delivered the Stinger Initiative to improve speed, scale, and SQL semantics in Apache Hive. Now Stinger.next is underway to build on those initial successes.
Join this 30-minute webinar with Hortonworks co-founder Alan Gates and Hortonworks Hive product manager Raj Baines to discuss SQL queries in HDP 2.2: ACID transactions and the cost based optimizer. You will also hear about the road ahead for the Stinger.next initiative.
Owen O’Malley and Carter Shanklin host the second of our seven "Discover HDP 2.1" webinars. They discuss the Stinger Initiative and the improvements to Apache Hive that are included in HDP 2.1: faster queries with Hive on Tez, new SQL semantics, and more.
IBM has taken query tuning to a new level with IBM Data Studio. More detail is available than ever before. However, the tool does take some getting used to, especially for folks that are used to a green screen based query tuning experience. This presentation introduces you to IBM Data Studio and gets you started tuning queries.
We will show you how Kudu makes it easier for you to perform both real-time monitoring and ad hoc analytic queries on the same set of data.
GraphFrames bring the power of Apache Spark DataFrames to interactive analytics on graphs.
Expressive motif queries simplify pattern search in graphs, and DataFrame integration allows seamlessly mixing graph queries with Spark SQL and ML. By leveraging Catalyst and Tungsten, GraphFrames provide scalability and performance. Uniform language APIs expose the full functionality of GraphX to Java and Python users for the first time.
In this talk, the developers of the GraphFrames package will give an overview, a live demo, and a discussion of design decisions and future plans. This talk will be generally accessible, covering major improvements from GraphX and providing resources for getting started. A running example of analyzing flight delays will be used to explain the range of GraphFrame functionality: simple SQL and graph queries, motif finding, and powerful graph algorithms.
For experts, this talk will also include a few technical details on design decisions, the current implementation, and ongoing work on speed and performance optimizations.
Zoomdata, developers of the world’s fastest big data exploration, visualization & analytics platform, lets business users see and interact with data in all new ways.
Designed mobile and touch first, its patented micro-query architecture delivers results on billions of records in seconds and gives users a single plane of access for bridging old data and new data.
Zoomdata is backed by Accel Partners, B7, Columbus Nova Technology Partners, NEA and Razors Edge Ventures.
Learn more about the Industry-first enterprise-class cloud data warehouse that can grow, shrink, and pause in seconds.
SQL Data Warehouse independently scales compute and storage, so you pay for query performance only when you need it. Unlike other cloud data warehouses that require hours or days to resize, SQL Data Warehouse lets you grow or shrink compute power in minutes.
Learn how businesses are taking full advantage of storage at cloud scale, and applying query compute based on changing performance needs. Companies are cutting costs by only paying for storage, leveraging our market-leading on-demand price per terabyte.
Join us for this segment of “Under the Hood” of HPE’s Big Data Platform to learn about preaggregating data to accelerate popular queries in the Vertica SQL database.
While HPE’s Vertica database can aggregate billions of rows per second, sometimes there's no substitute for having the answer to common queries precomputed and "ready to go". Learn about Live Aggregate Projections, how they are implemented, and what functionality is supported in the new "Excavator" release, so you can take full advantage of them for dashboards, reports, and other "serve" use cases that demand subsecond response times.
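The idea of keeping answers to common queries precomputed can be illustrated with a minimal Python sketch. It is a conceptual illustration only, not how Vertica implements Live Aggregate Projections; the `LiveAggregate` class and its method names are hypothetical.

```python
from collections import defaultdict


class LiveAggregate:
    """Hypothetical helper: keeps per-key SUM and COUNT current as rows arrive."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def insert(self, key, value):
        # The aggregation work is paid at load time, one row at a time.
        self._sum[key] += value
        self._count[key] += 1

    def total(self, key):
        # Query time is a lookup rather than a scan over the raw rows.
        return self._sum.get(key, 0.0)

    def average(self, key):
        n = self._count.get(key, 0)
        return self._sum[key] / n if n else None


agg = LiveAggregate()
for region, amount in [("east", 10.0), ("west", 4.0), ("east", 6.0)]:
    agg.insert(region, amount)

print(agg.total("east"), agg.average("east"))
```

The trade-off shown here is the general one behind preaggregation: slightly more work per insert in exchange for constant-time reads on the queries a dashboard repeats most often.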
What you will learn in this webinar:
-Learn how Big Data is not just about Hadoop, but the wide range of new and existing frameworks inside and outside your enterprise.
-Learn how Zoomdata can query across multiple data sources to bring a single view of data across disparate data sources.
-See how business users can combine multiple sources without waiting for a data architect to set it up.
-See how the power of Apache Spark enables Zoomdata Fusion at Big Data scale.
-Learn how to access Zoomdata Fusion and more cutting-edge features in the Zoomdata Early Access Program.
Join DDN experts to see how organizations are leveraging developments in storage infrastructure to extract the greatest possible value from their data. Material covered will include general architectural concepts on building storage infrastructure for big data analytics, as well as a detailed discussion of real world applications and benchmarking results with SAS and Vertica platforms. Specifics on the impact to data ingest speed, query performance, flexibility, ease of management and overall scalability will also be covered.
This Tech Talk continues the "deep dive" on all new IBM DB2 10 and IBM InfoSphere Warehouse 10 features. Matthias Nicola from IBM labs explains the new Time Travel Query feature, which is a collection of bitemporal data management capabilities. These capabilities include temporal tables, temporal queries and updates, temporal constraints, and other functionality to manage data as of past or future points in time. Time Travel Query helps improve data consistency and quality across the enterprise and provides a cost-effective means to address auditing and compliance issues. As a result, organizations can reduce their risk of noncompliance and achieve greater business accuracy.
The presentation will discuss:
· How to create and manage temporal tables in DB2 10
· How to insert, update, delete, and query data for different points in the past, present, or future
· How to use DB2 as a time machine
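The "as of" style of query described above can be sketched in Python with the standard `sqlite3` module. This is an illustration of the concept under stated assumptions, not DB2 syntax: the `policy` table, the `sys_start` and `sys_end` columns, the `OPEN_END` sentinel, and both helper functions are hypothetical, and the sketch maintains row versions by hand where a temporal table would manage them for you.

```python
import sqlite3

OPEN_END = "9999-12-31"  # hypothetical sentinel meaning "still current"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policy (id INTEGER, coverage INTEGER, "
    "sys_start TEXT, sys_end TEXT)"
)


def upsert(conn, policy_id, coverage, now):
    # Close the current version (if any), then insert the new version.
    conn.execute(
        "UPDATE policy SET sys_end = ? WHERE id = ? AND sys_end = ?",
        (now, policy_id, OPEN_END),
    )
    conn.execute(
        "INSERT INTO policy VALUES (?, ?, ?, ?)",
        (policy_id, coverage, now, OPEN_END),
    )


def coverage_as_of(conn, policy_id, when):
    # Return the version whose period [sys_start, sys_end) contains `when`.
    row = conn.execute(
        "SELECT coverage FROM policy "
        "WHERE id = ? AND sys_start <= ? AND ? < sys_end",
        (policy_id, when, when),
    ).fetchone()
    return row[0] if row else None


upsert(conn, 1, 10000, "2012-01-01")
upsert(conn, 1, 25000, "2012-06-01")

print(coverage_as_of(conn, 1, "2012-03-15"))
print(coverage_as_of(conn, 1, "2012-07-01"))
```

Dates are stored as ISO 8601 strings so that plain string comparison orders them correctly; old versions are never deleted, which is what makes a query against a past point in time possible.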
Please note that this webcast is conducted at 12:30 PM ET. You may see this time translated into your local time zone.
Chief Technologist Ruhollah Farchtchi gives a presentation at Spark Summit East 2016 on the Interactive Visualization of Streaming Data Powered by Spark.
Much of the discussion on real-time data today focuses on the machine processing of that data. But helping humans visualize real-time streams is just as important. Visualizing real-time data introduces new UX and usability challenges for any developer embedding analytics into applications, especially when the target end users are business users and not data scientists. Self-service, interactive, subsecond response time to ad hoc queries — these are the new UX requirements for any enterprise visualizing real-time data. Streaming data also lends itself to new paradigms of interaction with the stream itself, like being able to pause, rewind and replay a stream. This talk is a case study in how and why Zoomdata built a “Data DVR” capability using Spark and Spark Streaming. We will describe the required user experience, the overall architecture and the specific use of Spark and Spark Streaming. We will describe the design considerations that led us to choose Spark Streaming over alternatives like Storm. We will show how end users configure the real-time increment and a historical retention window without writing any code themselves. We will also show how pause, rewind, replay is implemented in Spark and how the solution supports both real-time and historical analysis in the same architecture. Attendees will walk away with knowledge of Spark Streaming and how users can interactively work with streaming data. They will develop familiarity with the challenges of a lambda architecture and providing a consistent analytic experience over streaming and historical data.
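The pause, rewind and replay interaction described above can be sketched in a few lines of plain Python. This is a toy illustration of the idea, not Zoomdata's implementation and not Spark code; the `StreamDVR` class, its methods, and the `retention` parameter are hypothetical.

```python
from collections import deque


class StreamDVR:
    """Hypothetical sketch: a bounded event buffer with a movable playhead."""

    def __init__(self, retention):
        # The retention window bounds how far back a viewer can rewind.
        self._buffer = deque(maxlen=retention)
        self._offset = 0  # how many events the viewer is behind live
        self.paused = False

    def ingest(self, event):
        self._buffer.append(event)
        if self.paused:
            # While paused, the viewer falls further behind the live edge.
            self._offset = min(self._offset + 1, len(self._buffer))

    def pause(self):
        self.paused = True

    def rewind(self, n):
        self.paused = True
        self._offset = min(self._offset + n, len(self._buffer))

    def replay(self):
        # Return everything from the playhead to the live edge, then go live.
        events = list(self._buffer)
        missed = events[len(events) - self._offset:]
        self._offset = 0
        self.paused = False
        return missed


dvr = StreamDVR(retention=5)
for i in range(3):
    dvr.ingest(i)
dvr.pause()
dvr.ingest(3)
dvr.ingest(4)
print(dvr.replay())
```

The same buffer serves both the live view and the historical view, which mirrors the talk's point about supporting real-time and historical analysis in one architecture.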
In this webcast, Patrick Wendell from Databricks will be speaking about Spark's new 1.6 release.
Spark 1.6 will include (but is not limited to) a type-safe API called Dataset on top of DataFrames that leverages all the work in Project Tungsten to have more robust and efficient execution (including memory management, code generation, and query optimization) [SPARK-9999], adaptive query execution [SPARK-9850], and unified memory management by consolidating cache and execution memory [SPARK-10000].
Learn How to Store and Query Time Series Data in NoSQL and Other Use Cases
CloudPhysics surveyed over 1,000 VMworld attendees, asking them a series of questions related to the challenges, issues and questions they faced in managing their vSphere environment. The results were fascinating. The good news, for anyone feeling the pressures of managing a virtual environment, is this: you’re not alone. In addition, we augmented the findings with unique global data set queries from CloudPhysics to provide extra content and insights.
Understanding your customer is critical to the success of your app and your business. Join this session to discover how Amazon Mobile Analytics can help you to make the most of your mobile analytics, keeping tabs on key trends such as active users, revenue, retention and behavioral insights. We will focus on several use cases and business intelligence tracks such as dashboards, custom analytics using queries, data visualisation and machine learning.
In this webcast, Reynold Xin from Databricks will be speaking about Apache Spark's new 2.0 major release.
The major themes for Spark 2.0 are:
- Unified APIs: Emphasis on building up higher level APIs including the merging of DataFrame and Dataset APIs
- Structured Streaming: Simplify streaming by building continuous applications on top of DataFrames, which allows us to unify streaming, interactive, and batch queries.
- Tungsten Phase 2: Speed up Apache Spark by 10X
Learn to Store and Query Time Series Data in NoSQL and Other Use Cases
In this session, we will introduce you to the new AWS WAF service. We will show you how to use the service to block Amazon CloudFront requests that originate from IP addresses that you specify and block requests based on request content, such as header values or SQL queries. We will walk you through working code samples that automate security operations and demonstrate the flexibility of AWS WAF web ACLs.
We are well aware that companies like Facebook, Twitter, and WhatsApp deal with datasets in the range of hundreds of petabytes and more. However, not all datasets are that big. Did you know that all English pages of Wikipedia amount to just 49 GB of uncompressed text data? Likewise, there are a large number of datasets, ranging from customer data to events and transactions, which do not exceed the low terabyte range.
In this webinar we will discuss how to process data in this range, both for interactive queries and for batch processing. We will look at what tradeoffs can be made by tuning the architecture with SSD and RAM, and which distributed computing paradigms work best for these datasets and their typical workloads. We will revisit the concepts of data locality, data replication and parallel computing for this specific class of datasets.
Discover what attorneys and litigation support managers are high-fiving about. They’re excited about Thomson Reuters’ new and, dare we say, revolutionary ediscovery platform: eDiscovery Point. It allows users to simultaneously upload and process data; access that data within minutes; achieve accurate search results within seconds of performing a complex search query; and use several other time- and cost-saving features such as advanced data analysis and predictive coding. Attend this webinar to see how eDiscovery Point will make ediscovery easier for you.
* Keith Schrodt, JD, MBA; Marketing Manager, Legal Managed Solutions; Thomson Reuters
* James Jarvis; Vice President, Product & Partner Management; Thomson Reuters
Moderator: George Socha