Join us for this next segment of “Under the Hood,” which focuses on the Database Designer feature of HPE Vertica.
Learn how the schema designs created by Database Designer provide optimal query performance for your most challenging analytic workloads. Database Designer uses smart strategies to create efficient schema designs that can be deployed, changed and re-deployed by almost anyone, even those without advanced database knowledge.
Earlier this year, the open source community delivered the Stinger Initiative to improve speed, scale, and SQL semantics in Apache Hive. Now Stinger.next is underway to build on those initial successes.
Join this 30-minute webinar with Hortonworks co-founder Alan Gates and Hortonworks Hive product manager Raj Baines to discuss SQL queries in HDP 2.2, including ACID transactions and the cost-based optimizer. You will also hear about the road ahead for the Stinger.next initiative.
Owen O’Malley and Carter Shanklin host the second of our seven "Discover HDP 2.1" webinars. They discuss the Stinger Initiative and the improvements to Apache Hive that are included in HDP 2.1: faster queries with Hive on Tez, new SQL semantics, and more.
IBM has taken query tuning to a new level with IBM Data Studio. More detail is available than ever before. However, the tool does take some getting used to, especially for folks who are used to a green-screen-based query tuning experience. This presentation introduces you to IBM Data Studio and gets you started tuning queries.
Join DDN experts to see how organizations are leveraging developments in storage infrastructure to extract the greatest possible value from their data. Material covered will include general architectural concepts on building storage infrastructure for big data analytics, as well as a detailed discussion of real world applications and benchmarking results with SAS and Vertica platforms. Specifics on the impact to data ingest speed, query performance, flexibility, ease of management and overall scalability will also be covered.
This Tech Talk continues the "deep dive" into all the new IBM DB2 10 and IBM InfoSphere Warehouse 10 features. Matthias Nicola from IBM labs explains the new Time Travel Query feature, which is a collection of bitemporal data management capabilities. These capabilities include temporal tables, temporal queries and updates, temporal constraints, and other functionality to manage data as of past or future points in time. Time Travel Query helps improve data consistency and quality across the enterprise and provides a cost-effective means to address auditing and compliance issues. As a result, organizations can reduce their risk of noncompliance and achieve greater business accuracy.
The presentation will discuss:
· How to create and manage temporal tables in DB2 10
· How to insert, update, delete, and query data for different points in the past, present, or future
· How to use DB2 as a time machine (see the sketch after this list)
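For a concrete feel for these capabilities, here is a minimal sketch of the system-time temporal DDL and a time-travel query, issued through the ibm_db Python driver. The connection string, table, and column names are hypothetical placeholders, not examples from the talk.

```python
import ibm_db

# Connection values are placeholders; substitute your own.
conn = ibm_db.connect(
    "DATABASE=SAMPLE;HOSTNAME=localhost;PORT=50000;"
    "PROTOCOL=TCPIP;UID=db2inst1;PWD=secret", "", "")

# A system-period temporal table: DB2 maintains the row-begin/row-end
# timestamps and moves superseded row versions into the history table.
for stmt in [
    """CREATE TABLE policy (
         id        INT NOT NULL PRIMARY KEY,
         coverage  INT,
         sys_start TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
         sys_end   TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
         trans_id  TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID,
         PERIOD SYSTEM_TIME (sys_start, sys_end))""",
    "CREATE TABLE policy_history LIKE policy",
    "ALTER TABLE policy ADD VERSIONING USE HISTORY TABLE policy_history",
]:
    ibm_db.exec_immediate(conn, stmt)

# Time travel: query the table as of a past point in time.
result = ibm_db.exec_immediate(
    conn,
    "SELECT id, coverage FROM policy "
    "FOR SYSTEM_TIME AS OF '2012-01-01-00.00.00'")
row = ibm_db.fetch_assoc(result)
while row:
    print(row)
    row = ibm_db.fetch_assoc(result)
```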
Please note that this webcast is conducted at 12:30 PM ET; you may see this time translated into your local time zone in the listing.
In this webcast, Patrick Wendell from Databricks will be speaking about Spark's new 1.6 release.
Spark 1.6 will include (but is not limited to) a type-safe API called Dataset, built on top of DataFrames, that leverages all the work in Project Tungsten for more robust and efficient execution (including memory management, code generation, and query optimization) [SPARK-9999]; adaptive query execution [SPARK-9850]; and unified memory management that consolidates cache and execution memory [SPARK-10000].
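The Dataset API in 1.6 is Scala/Java-only, but the DataFrame layer it builds on is available from Python as well. Here is a minimal, hypothetical sketch of the execution pipeline that Catalyst and Tungsten accelerate; the input path and fields are made up:

```python
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext("local[*]", "spark16-demo")
sqlContext = SQLContext(sc)

# DataFrame operations run through the Catalyst optimizer and
# Tungsten's binary memory format and generated code.
df = sqlContext.read.json("events.json")   # placeholder input
top = (df.groupBy("country")
         .count()
         .orderBy("count", ascending=False))
top.explain()   # inspect the optimized physical plan
top.show(10)
```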
In this session, we will introduce you to the new AWS WAF service. We will show you how to use the service to block Amazon CloudFront requests that originate from IP addresses that you specify and block requests based on request content, such as header values or SQL queries. We will walk you through working code samples that automate security operations and demonstrate the flexibility of AWS WAF web ACLs.
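As a flavor of the kind of automation the session covers, here is a minimal sketch using the boto3 client for AWS WAF; the IP set name and address are placeholders, and a complete setup would go on to attach the IP set to a rule inside a web ACL.

```python
import boto3

waf = boto3.client("waf")  # AWS WAF, as used with Amazon CloudFront

# Every mutating WAF call requires a fresh change token.
token = waf.get_change_token()["ChangeToken"]
ip_set = waf.create_ip_set(Name="blocked-ips", ChangeToken=token)["IPSet"]

# Block a specific CIDR range (the address below is a placeholder).
token = waf.get_change_token()["ChangeToken"]
waf.update_ip_set(
    IPSetId=ip_set["IPSetId"],
    ChangeToken=token,
    Updates=[{
        "Action": "INSERT",
        "IPSetDescriptor": {"Type": "IPV4", "Value": "192.0.2.44/32"},
    }],
)
```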
Discover what attorneys and litigation support managers are high-fiving about. They’re excited about Thomson Reuters’ new and, dare we say, revolutionary ediscovery platform: eDiscovery Point. It lets users simultaneously upload and process data; access that data within minutes; get accurate results within seconds of running a complex search query; and use several other time- and cost-saving features, such as advanced data analysis and predictive coding. Attend this webinar to see how eDiscovery Point will make ediscovery easier for you.
* Keith Schrodt, JD, MBA; Marketing Manager, Legal Managed Solutions; Thomson Reuters
* James Jarvis; Vice President, Product & Partner Management; Thomson Reuters
Moderator: George Socha
Every second counts in the data center. When storage latency prevents you from meeting SLAs or improving data center efficiency, solid-state memory can be used to meet a variety of needs. Join Rob Callaghan as he shares real customer stories about how they were able to virtualize SQL servers, accelerate search queries, and improve QoS by leveraging SanDisk flash technology. You’ll learn the unique architectural advantages of flash storage and the broad range of SanDisk solutions that have helped customers dramatically improve application performance while reducing capacity challenges and cost.
Learn about the new storage optimization features in the recently announced DB2 10 product. We will cover three areas (a brief SQL sketch follows this list):
•Adaptive Compression, which allows you to reach higher compression ratios with DB2 10 than ever before. Learn how this new feature helps generate storage space savings, reduce physical I/O, and improve the buffer pool hit ratio so that higher throughput and faster query execution times are achieved. We'll cover the adaptive nature of the compression algorithm which helps ensure that compression ratios remain optimal over time, thus reducing the need for DBA intervention and data reorganization.
•Multi-temperature Data Management, which configures the database so that only frequently accessed data (hot data) is stored on expensive, fast storage, such as solid-state drives (SSDs), while infrequently accessed data (cold data) is stored on slower, less expensive storage, such as low-rpm hard disk drives. As data cools down and is accessed less frequently, you can dynamically move it to the slower storage, helping to maximize storage assets.
•Workload management, which provides the ability to treat work differently, both predictively and reactively, based on the data touched.
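Here is the promised sketch: a few illustrative statements showing what adaptive compression and multi-temperature storage look like in DB2 10 SQL, issued through the ibm_db Python driver. The table name, paths, and connection string are hypothetical.

```python
import ibm_db

conn = ibm_db.connect(
    "DATABASE=SAMPLE;HOSTNAME=localhost;PORT=50000;"
    "PROTOCOL=TCPIP;UID=db2inst1;PWD=secret", "", "")

for stmt in [
    # Adaptive compression: a table-level dictionary plus page-level
    # dictionaries that adapt as the data changes over time.
    "CREATE TABLE sales (id INT, amount DECIMAL(10,2)) COMPRESS YES ADAPTIVE",

    # Multi-temperature storage: storage groups map table spaces to
    # devices of different speeds (paths are placeholders).
    "CREATE STOGROUP sg_hot ON '/db2/ssd01'",
    "CREATE STOGROUP sg_cold ON '/db2/sata01'",
    "CREATE TABLESPACE tbsp_2012 USING STOGROUP sg_hot",

    # As the data cools, move the whole table space to slower storage.
    "ALTER TABLESPACE tbsp_2012 USING STOGROUP sg_cold",
]:
    ibm_db.exec_immediate(conn, stmt)
```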
In 2012, we are delving into the technology behind the exciting April 3rd announcement of the DB2 10 for Linux, UNIX and Windows product. This DB2 Tech Talk, formerly known as DB2 Chat with the Labs, follows the April 26th Technical Tour of DB2 10 software.
One reason some vendors promote the idea that OLAP is dead is that they are pushing smaller, user-created models (the desktop BI approach) rather than a single centralized model. While OLAP may be out of fashion, the capabilities and benefits of the technology make sense for many data warehouse scenarios.
Join Chris Webb, an independent consultant specializing in Analysis Services, MDX, Power Pivot, DAX, Power Query, and Power BI, and Peter Sprague, VP of Solutions Engineering at Pyramid Analytics, for a discussion about OLAP technology and the things you need to consider, including:
•Centralized data models and security
•OLAP vs Big Data Tools
•OLAP, Data Volumes and Scale
Connecting assets to audiences with speed and scale requires a data integration solution that lets you integrate in real time via APIs, handle batch tasks such as data ingestion or synchronization, and work with a variety of data formats, from JSON to XML to EDI. With Mule 3.7, we introduced DataWeave, a simple, powerful way to query and transform all types of data.
We’ll be showcasing the power of DataWeave in the following scenarios (a rough Python illustration follows the list):
- Highly performant mappings and transformations
- Simple one-to-one mappings
- Complex mappings (e.g., joins, filtering, partitioning)
- Easy-to-write, easy-to-maintain, metadata-aware mappings
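DataWeave has its own expression language, so the following plain-Python sketch is only an analogy for the shape of such a mapping (a join plus a filter over hypothetical order and customer records), not DataWeave syntax:

```python
# Input documents, e.g. parsed from JSON payloads (names invented).
orders = [
    {"id": 1, "customer_id": 10, "total": 120.0},
    {"id": 2, "customer_id": 11, "total": 35.0},
]
customers = {10: {"name": "Acme"}, 11: {"name": "Globex"}}

# A join-plus-filter mapping: enrich each order with its customer
# name and keep only orders above a threshold.
enriched = [
    {"order": o["id"],
     "customer": customers[o["customer_id"]]["name"],
     "total": o["total"]}
    for o in orders
    if o["total"] >= 50.0
]
print(enriched)  # [{'order': 1, 'customer': 'Acme', 'total': 120.0}]
```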
Data environments are growing exponentially. Not only is there more data, but there are more data sources. At the same time, the value of unlocking that data and using it to make business decisions is also increasing.
For the business user, understanding this complex data and unlocking its potential is the key to staying ahead of the competition.
For IT organizations, complex data can be the bane of many business analytics programs, causing all kinds of trouble in data management and hindering system performance.
Size of data and number of disparate data sources are two key drivers of complexity. The bigger the data, the more effort is needed to query and store it. The more data sources (data tables), the more effort is needed to prepare the data for analysis.
The data complexity matrix describes data from both of these standpoints. Your data may be Simple, Diversified, Big, or Complex. When considering a Business Analytics program, different approaches are better suited for each data state.
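To make the two axes concrete, here is a small, hypothetical classifier for the four data states; the thresholds are illustrative assumptions, not figures from the webinar.

```python
def data_state(volume_gb, n_sources, big_gb=1000, many_sources=10):
    # The cut-offs below are assumptions for illustration only;
    # pick thresholds that match your own environment.
    big = volume_gb >= big_gb
    diversified = n_sources >= many_sources
    if big and diversified:
        return "Complex"
    if big:
        return "Big"
    if diversified:
        return "Diversified"
    return "Simple"

print(data_state(50, 3))       # Simple
print(data_state(50, 40))      # Diversified
print(data_state(20000, 3))    # Big
print(data_state(20000, 40))   # Complex
```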
Most enterprises are still deciding what the core components of a cloud data warehousing and analytics solution should be. Come see how Red Hat deployed a secure cloud data warehousing architecture inside Amazon VPC using Amazon Redshift and S3. In this in-depth session, get practical advice on how Red Hat shortened the timeline to ingest new data sources and optimized query performance. Also learn how creating virtual data marts can lead to greater agility and faster insights.
SQL has long been the most widely used language for big data analysis. The SQL-on-Hadoop ecosystem is loaded with both commercial and open source alternatives, each offering tools optimized for various use cases. Fledgling analytical engines are in incubation, but are they ready to become full-fledged members of your enterprise infrastructure? Are they ready to fly?
In the real world, enterprises must understand their needs and select a SQL-on-Hadoop solution that addresses them. Points to consider: What are your analytics use cases? Will a single user be working on data discovery, or will multiple users perform daily analytics? Will you need to modify SQL to adjust to different deployment scenarios, or does a single solution exist for on-premises, cloud, and Hadoop? Can a single solution support a variety of workloads, from quick-hit dashboards to complex, resource-intensive, join-filled queries?
In this webcast, you will learn:
* Some of the challenges associated with the democratization of analytics while using SQL on Hadoop
* Criteria other than performance that should be considered for enterprise-grade analytics
* How Ambari and Kerberos fit in for management and security of your data
* How HPE Vertica for SQL on Hadoop can be used as part of a modern IT infrastructure to deliver high-performance SQL on Hadoop
Denny Lee, Technology Evangelist with Databricks, will provide a jump start into Apache Spark and Databricks. Spark is a fast, easy-to-use, unified engine that allows you to solve many data science and big data (and many not-so-big data) scenarios easily. Spark comes packaged with higher-level libraries, including support for SQL queries, streaming data, machine learning, and graph processing. We will leverage Databricks to quickly and easily demonstrate, visualize, and debug our code samples; the notebooks will be available for you to download.
This introductory-level jump start will focus on the following scenarios (a condensed code sketch follows the list):
- Quick Start on Spark: Provides an introductory quick start to Spark using Python and Resilient Distributed Datasets (RDDs). We will review RDD actions and transformations and their impact on your Spark workflow.
- A Primer on RDDs to DataFrames to Datasets: This will provide a high-level overview of our journey from RDDs (2011) to DataFrames (2013) to the newly introduced (as of Spark 1.6) Datasets (2015).
- Just in Time Data Warehousing with Spark SQL: We will demonstrate a Just-in-Time Data Warehousing (JIT-DW) example using Spark SQL on an AdTech scenario. We will start with weblogs, create an external table with RegEx, make an external web service call via a Mapper, join DataFrames and register a temp table, add columns to DataFrames with UDFs, use Python UDFs with Spark SQL, and visualize the output - all in the same notebook.
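Here is the condensed sketch mentioned above, using Spark 1.6-era PySpark APIs: RDD transformations and actions first, then a toy version of the weblog-to-Spark-SQL flow. The log path, regex, and UDF are invented for illustration; the webinar's actual notebooks will differ.

```python
import re
from pyspark import SparkContext
from pyspark.sql import SQLContext, Row

sc = SparkContext("local[*]", "jumpstart")
sqlContext = SQLContext(sc)

# RDD basics: transformations are lazy; actions trigger execution.
lines = sc.textFile("access.log")              # placeholder path
errors = lines.filter(lambda l: " 500 " in l)  # transformation (lazy)
print(errors.count())                          # action (runs the job)

# Toy weblog step: parse common-log-format lines into Rows, register
# a temp table, and query it with Spark SQL plus a Python UDF.
pattern = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')

def parse(line):
    m = pattern.match(line)
    return Row(ip=m.group(1), ts=m.group(2),
               method=m.group(3), path=m.group(4)) if m else None

rows = lines.map(parse).filter(lambda r: r is not None)
df = sqlContext.createDataFrame(rows)
df.registerTempTable("weblogs")

# A Python UDF callable from SQL: first path segment of the URL.
sqlContext.registerFunction(
    "first_segment", lambda p: p.split("/")[1] if "/" in p else p)

sqlContext.sql(
    "SELECT ip, first_segment(path) AS section, COUNT(*) AS hits "
    "FROM weblogs GROUP BY ip, first_segment(path) "
    "ORDER BY hits DESC").show(10)
```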