Join us for this next segment of “Under the Hood” that focuses on the Database Designer feature of HPE Vertica.
Learn how the schema designs created by Database Designer provide optimal query performance for your most challenging analytic workloads. Database Designer uses smart strategies to create efficient schema designs that can be deployed, changed and re-deployed by almost anyone, even those without advanced database knowledge.
Earlier this year, the open source community delivered the Stinger Initiative to improve speed, scale, and SQL semantics in Apache Hive. Now Stinger.next is underway to build on those initial successes.
Join this 30-minute webinar with Hortonworks co-founder Alan Gates and Hortonworks Hive product manager Raj Baines to discuss SQL queries in HDP 2.2: ACID transactions and the cost-based optimizer. You will also hear about the road ahead for the Stinger.next initiative.
Owen O’Malley and Carter Shanklin host the second of our seven "Discover HDP 2.1" webinars. They discuss the Stinger Initiative and the improvements to Apache Hive that are included in HDP 2.1: faster queries with Hive on Tez, new SQL semantics, and more.
IBM has taken query tuning to a new level with IBM Data Studio. More detail is available than ever before. However, the tool does take some getting used to, especially for folks that are used to a green screen based query tuning experience. This presentation introduces you to IBM Data Studio and gets you started tuning queries.
Analysing big data quickly and efficiently requires a data warehouse optimised to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyse big data for a fraction of the cost of traditional data warehouses. By following a few best practices, you can take advantage of Amazon Redshift’s columnar technology and parallel processing capabilities to minimize I/O and deliver high throughput and query performance. This webinar will cover techniques to load data efficiently, design optimal schemas and tune query and database performance.
• Get an inside look at Amazon Redshift's columnar technology and parallel processing capabilities
• Learn how to migrate from existing data warehouses, optimise schemas and load data efficiently
• Learn best practices for managing workload, tuning your queries and using Amazon Redshift's interleaved sorting features
Who Should Attend:
• Data Warehouse Developers, Big Data Architects, BI Managers and Data Engineers
Turbo-Charge BI on Hadoop: The Time is Now
Want to turn your Hadoop cluster into a super-powerful, analytics data warehouse? Need to run BI queries on Hadoop at top speed?
Watch this recording of a live best practice session. You'll see how leading companies are super-charging their BI on Hadoop by combining the power of Tableau with the scale of Impala, and accelerating it all with AtScale. In this session, leaders from Cloudera and Tableau share a real-world perspective on how to:
• Get super-fast performance from BI queries on Hadoop
• Deliver powerful self-service visualization directly on Hadoop
• Leverage existing BI and Hadoop investments to deliver more value to more users
We will show you how Kudu makes it easier for you to perform both real-time monitoring and ad hoc analytic queries on the same set of data.
GraphFrames bring the power of Apache Spark DataFrames to interactive analytics on graphs.
Expressive motif queries simplify pattern search in graphs, and DataFrame integration allows seamlessly mixing graph queries with Spark SQL and ML. By leveraging Catalyst and Tungsten, GraphFrames provide scalability and performance. Uniform language APIs expose the full functionality of GraphX to Java and Python users for the first time.
In this talk, the developers of the GraphFrames package will give an overview, a live demo, and a discussion of design decisions and future plans. This talk will be generally accessible, covering major improvements from GraphX and providing resources for getting started. A running example of analyzing flight delays will be used to explain the range of GraphFrame functionality: simple SQL and graph queries, motif finding, and powerful graph algorithms.
For experts, this talk will also include a few technical details on design decisions, the current implementation, and ongoing work on speed and performance optimizations.
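To make the idea of a motif query concrete, here is a minimal sketch in plain Python. It does not use the GraphFrames package or its API; the flight edges, airport codes, and the `find_round_trips` helper are all hypothetical. It searches a small directed flight graph for the pattern "a flies to b, and b flies back to a".

```python
from collections import defaultdict

# Hypothetical flight edges: (origin, destination, delay in minutes).
FLIGHTS = [
    ("SFO", "JFK", 12),
    ("JFK", "SFO", 45),
    ("SFO", "SEA", 0),
    ("SEA", "ORD", 30),
    ("ORD", "SEA", 5),
]


def find_round_trips(edges):
    """Return sorted (a, b) pairs, a < b, with an edge a->b and an edge b->a."""
    successors = defaultdict(set)
    for src, dst, _delay in edges:
        successors[src].add(dst)

    pairs = set()
    for src, dsts in successors.items():
        for dst in dsts:
            # The motif matches when the reverse edge also exists.
            if src in successors.get(dst, set()):
                pairs.add(tuple(sorted((src, dst))))
    return sorted(pairs)


round_trips = find_round_trips(FLIGHTS)
```

A graph library would express the same pattern declaratively and plan its execution; the sketch only shows what such a pattern is asked to match.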
Zoomdata, developers of the world’s fastest big data exploration, visualization & analytics platform, lets business users see and interact with data in all new ways.
Designed mobile and touch first, its patented micro-query architecture delivers results on billions of records in seconds and gives users a single plane of access for bridging old data and new data.
Zoomdata is backed by Accel Partners, B7, Columbus Nova Technology Partners, NEA and Razors Edge Ventures.
What you will learn in this webinar:
-Learn how Big Data is not just about Hadoop, but the wide range of new and existing frameworks inside and outside your enterprise.
-Learn how Zoomdata can query across multiple data sources to bring a single view of data across disparate data sources.
-See how business users can combine multiple sources without waiting for a data architect to set it up.
-See how the power of Apache Spark enables Zoomdata Fusion at Big Data scale.
-Learn how to access Zoomdata Fusion and more cutting-edge features in the Zoomdata Early Access Program.
Join DDN experts to see how organizations are leveraging developments in storage infrastructure to extract the greatest possible value from their data. Material covered will include general architectural concepts on building storage infrastructure for big data analytics, as well as a detailed discussion of real world applications and benchmarking results with SAS and Vertica platforms. Specifics on the impact to data ingest speed, query performance, flexibility, ease of management and overall scalability will also be covered.
This Tech Talk continues the "deep dive" on all the new IBM DB2 10 and IBM InfoSphere Warehouse 10 features. Matthias Nicola from IBM labs explains the new Time Travel Query feature, which is a collection of bitemporal data management capabilities. These capabilities include temporal tables, temporal queries and updates, temporal constraints, and other functionality to manage data as of past or future points in time. Time Travel Query helps improve data consistency and quality across the enterprise and provides a cost-effective means to address auditing and compliance issues. As a result, organizations can reduce their risk of noncompliance and achieve greater business accuracy.
The presentation will discuss:
· How to create and manage temporal tables in DB2 10
· How to insert, update, delete, and query data for different points in the past, present, or future
· How to use DB2 as a time machine
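As a rough illustration of what an "as of" query over versioned rows means, the sketch below emulates the idea with Python's built-in `sqlite3` module. It is not DB2 syntax and does not use DB2's temporal table support; the `policy_history` table, its `sys_start` and `sys_end` columns, and the `coverage_as_of` helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE policy_history (
        policy_id INTEGER NOT NULL,
        coverage  INTEGER NOT NULL,
        sys_start TEXT    NOT NULL,  -- row became current (inclusive)
        sys_end   TEXT    NOT NULL   -- row stopped being current (exclusive)
    )
    """
)
# Two versions of the same hypothetical policy, stored with validity periods.
conn.executemany(
    "INSERT INTO policy_history VALUES (?, ?, ?, ?)",
    [
        (1, 10000, "2012-01-01", "2012-06-01"),
        (1, 15000, "2012-06-01", "9999-12-31"),
    ],
)


def coverage_as_of(connection, policy_id, as_of):
    """Return the coverage that was current on the given ISO date, or None."""
    row = connection.execute(
        """
        SELECT coverage FROM policy_history
        WHERE policy_id = ? AND sys_start <= ? AND ? < sys_end
        """,
        (policy_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


past = coverage_as_of(conn, 1, "2012-03-15")
present = coverage_as_of(conn, 1, "2012-09-01")
before_any = coverage_as_of(conn, 1, "2011-12-31")
```

The half-open period (start inclusive, end exclusive) means exactly one version matches any given date, which is why the lookup can return a single row.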
Please note that this webcast is conducted at 12:30 PM ET. You may see this time translated into your local time zone.
Join us for this segment of “Under the Hood” of HPE’s Big Data Platform to learn about preaggregating data to accelerate popular queries in the Vertica SQL database.
While HPE’s Vertica database can aggregate billions of rows per second, sometimes there's no substitute for having the answer to common queries precomputed and "ready to go". Learn about Live Aggregate Projections, how they are implemented, and what functionality is supported in the new "Excavator" release, so you can take full advantage for dashboards, reports, and other "serve" use cases that demand subsecond response times.
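The general idea behind preaggregation can be sketched in a few lines of plain Python. This is not how Vertica implements Live Aggregate Projections; the `PreaggregatedSales` class and its data are hypothetical and only show the trade-off of updating an aggregate at load time so that queries become simple lookups.

```python
from collections import defaultdict


class PreaggregatedSales:
    """Keeps per-region totals up to date as rows are loaded."""

    def __init__(self):
        self.rows = []                            # raw detail rows
        self.total_by_region = defaultdict(int)   # maintained aggregate

    def load(self, region, amount):
        # The aggregate is updated at load time, not at query time.
        self.rows.append((region, amount))
        self.total_by_region[region] += amount

    def total_precomputed(self, region):
        # Lookup of the maintained aggregate, no scan required.
        return self.total_by_region.get(region, 0)

    def total_by_scan(self, region):
        # Equivalent answer computed by scanning every detail row.
        return sum(amount for r, amount in self.rows if r == region)


store = PreaggregatedSales()
store.load("EAST", 100)
store.load("WEST", 50)
store.load("EAST", 25)
east_total = store.total_precomputed("EAST")
```

Both query methods return the same answer; the precomputed path pays a small cost on every load in exchange for avoiding a scan on every query.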
Chief Technologist Ruhollah Farchtchi gives a presentation at Spark Summit East 2016 on the Interactive Visualization of Streaming Data Powered by Spark.
Much of the discussion on real-time data today focuses on the machine processing of that data. But helping humans visualize real-time streams is just as important. Visualizing real-time data introduces new UX and usability challenges for any developer embedding analytics into applications, especially when the target end users are business users and not data scientists. Self-service, interactive, subsecond response time to ad hoc queries — these are the new UX requirements for any enterprise visualizing real-time data. Streaming data also lends itself to new paradigms of interaction with the stream itself, like being able to pause, rewind and replay a stream. This talk is a case study in how and why Zoomdata built a “Data DVR” capability using Spark and Spark Streaming. We will describe the required user experience, the overall architecture and the specific use of Spark and Spark Streaming. We will describe the design considerations that led us to choose Spark Streaming over alternatives like Storm. We will show how end users configure the real-time increment and a historical retention window without writing any code themselves. We will also show how pause, rewind, replay is implemented in Spark and how the solution supports both real-time and historical analysis in the same architecture. Attendees will walk away with knowledge of Spark Streaming and how users can interactively work with streaming data. They will develop familiarity with the challenges of a lambda architecture and providing a consistent analytic experience over streaming and historical data.
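The pause, rewind, and replay interaction described above can be illustrated with a small in-memory buffer. This is a plain Python sketch under stated assumptions, not Zoomdata's implementation and not Spark Streaming code; the `StreamDvr` class, its method names, and the integer timestamps are hypothetical.

```python
from collections import deque


class StreamDvr:
    """Buffers recent (timestamp, value) events and replays them on demand."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.events = deque()   # ordered by timestamp, oldest first
        self.paused_at = None   # timestamp where the viewer paused, if any

    def ingest(self, timestamp, value):
        # Ingestion continues even while a viewer is paused.
        self.events.append((timestamp, value))
        cutoff = timestamp - self.retention_seconds
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()

    def pause(self, timestamp):
        self.paused_at = timestamp

    def rewind(self, seconds):
        if self.paused_at is None:
            raise RuntimeError("rewind requires a paused stream")
        self.paused_at -= seconds

    def replay(self):
        """Return buffered events from the paused position onward, then resume."""
        if self.paused_at is None:
            raise RuntimeError("replay requires a paused stream")
        start = self.paused_at
        self.paused_at = None
        return [(t, v) for t, v in self.events if t >= start]


dvr = StreamDvr(retention_seconds=60)
for t in range(0, 100, 10):
    dvr.ingest(t, f"event-{t}")
dvr.pause(80)
dvr.rewind(30)
replayed = dvr.replay()
```

The retention window bounds how far back a rewind can reach: events older than the window have already been dropped from the buffer.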
Learn How to Store and Query Time Series Data in NoSQL and Other Use Cases
In this webcast, Patrick Wendell from Databricks will be speaking about Apache Spark's new 1.6 release.
Spark 1.6 will include (but is not limited to) a type-safe API called Dataset on top of DataFrames that leverages all the work in Project Tungsten to have more robust and efficient execution (including memory management, code generation, and query optimization) [SPARK-9999], adaptive query execution [SPARK-9850], and unified memory management by consolidating cache and execution memory [SPARK-10000].
The GuidePoint Virtual Security Operations Center (vSOC) was designed to address many of the common complaints and issues customers experience with other managed service providers. We use the cloud to provide dynamic scalability and cost savings. vSOC analysts provide validated security incidents that allow you to focus on what’s really important: remediation.
vSOC Detect now integrates with CrowdStrike Falcon by leveraging the Falcon Connect API to ingest Falcon host data into the vSOC Detect monitoring platform. This integration enables vSOC Detect to leverage the CrowdStrike platform for endpoint monitoring and allows analysts to correlate endpoint data against SIEM security logs. This added correlation within our SIEM enables active hunting by vSOC Detect analysts to discover new and emerging threats in customer environments.
Join us to explore “Hunting with CrowdStrike” and how our integrations make CrowdStrike Falcon versatile and effective.
Topics will include:
- Using the CrowdStrike integration with vSOC Detect
- Learning how analysts can:
  - Perform ad-hoc searches and queries
  - Quickly conduct comprehensive investigations
  - Identify insider threat activity
  - Create dashboards and reports
Chef is built in Ruby - a conscious choice for its great flexibility and developer friendliness. For some people, learning the language can feel difficult because most examples lack your perspective as a Chef practitioner. In this interactive webinar, we invite you to follow along in your favorite editor as we dive through the source code to teach you core Ruby concepts.
Join us to learn:
- Fundamental Ruby concepts and how they create the Recipe Domain Specific Language and the tools that power Chef
- Pry’s ability to navigate and query source code
Come with the Chef Development Kit installed
Who should attend:
Chefs with a basic understanding of writing recipes and cookbooks who want to gain a better understanding of the cookbooks they author and the tools that they employ each day.
In this webcast, Reynold Xin from Databricks will be speaking about Apache Spark's new 2.0 major release.
The major themes for Spark 2.0 are:
- Unified APIs: Emphasis on building up higher level APIs including the merging of DataFrame and Dataset APIs
- Structured Streaming: Simplify streaming by building continuous applications on top of DataFrames, which allows us to unify streaming, interactive, and batch queries.
- Tungsten Phase 2: Speed up Apache Spark by 10X
We are well aware that companies like Facebook, Twitter, and WhatsApp deal with datasets in the range of hundreds of petabytes and more. However, not all datasets are that big. Did you know that all English pages of Wikipedia amount to just 49 GB of uncompressed text data? Likewise, there are a large number of datasets, ranging from customer data to events and transactions, that do not exceed the low terabyte range.
In this webinar we will discuss how to process data in this range, both for interactive queries and for batch processing. We will look at what tradeoffs can be made by tuning the architecture with SSD and RAM, and which distributed computing paradigms work best for these datasets and their typical workloads. We will revisit the concepts of data locality, data replication and parallel computing for this specific class of datasets.
Understanding your customer is critical to the success of your app and your business. Join this session to discover how Amazon Mobile Analytics can help you to make the most of your mobile analytics, keeping tabs on key trends such as active users, revenue, retention and behavioral insights. We will focus on several use cases and business intelligence tracks such as dashboards, custom analytics using queries, data visualisation and machine learning.
Join us for an exploration of how Hadoop Native SQL unleashes the power of Apache™ Hadoop® for business insights and predictive analytics. We will demonstrate how Pivotal HDB, powered by Apache HAWQ (incubating), allows near-real-time execution of ad-hoc queries at scale. Complete analytics tasks faster, in seconds or minutes rather than hours or days, using in-database analytics from Apache MADlib (incubating).
Indexing is king when it comes to achieving the highest levels of performance in SQL Server. When indexing is correctly combined with compression (an Enterprise-only feature), you can gain significant advantages. During this session, we will demo how to query dynamic management views (DMVs) to identify the right objects on which to implement compression, along with ways to measure performance and identify the impact on memory and other resources. We will cover advanced scripts that provide details on fragmentation, memory caching and compression levels. We will also look at how someone who does not have the Enterprise Edition of SQL Server or the latest edition of SQL Server can still leverage compression and gain the performance advantage using the right Flash Storage solution.
Leading organisations are now using data science to unleash profitability, efficiency and agility across all business functions. Join this webinar to understand how any SAP BW developer, business analyst or data geek working with the SAP Business Warehouse can instantly enable themselves with data science capabilities.
In particular, see how the total BEx integration of TIBCO Spotfire unlocks data science on SAP BW data with unprecedented ease, speed and value.
This webinar will cover:
- The practical considerations for analysing SAP BW data using predictive analytics
- How in-house SAP BW developers and business analysts can build invaluable data science skills at minimal expense and disruption
- How existing BW BEx queries are the natural starting point to deliver value through predictive analytics
- How a rich library of over 8000 open source R-packages, used by millions of developers and data scientists, can be used to address practically any business problem
- How the industry standard predictive modelling language, R, can run 10-100x faster on live SAP BW data using TIBCO Spotfire
- The power of total BEx integration using TIBCO Spotfire
- How TIBCO Spotfire provides a single ecosystem for the needs of a data scientist