Analyzing Unstructured Data in Hadoop

Matthew Schumpert, Director of Solutions Engineering, Datameer
Unstructured data is growing 62% per year faster than structured data. According to Gartner, data volumes are set to grow 800% in aggregate over the next 5 years, and 80% of it will be unstructured data.

Analysis of unstructured data can reveal important insights and interrelationships that are difficult or impossible to determine with traditional business intelligence tools and data warehousing infrastructure. Because unstructured data is typically large, dirty and “noisy”, it requires significantly more computing power and pre-processing to be able to extract the signal from the noise, and to find the insights that will ultimately enable businesses to make the most informed decisions possible.

Together, Hadoop and Datameer address the issues presented by unstructured data processing, and help businesses harness the potential of this data, along with traditionally managed structured sources, ensuring the fastest time-to-insight.


This webinar will showcase and discuss:

• How applying big data analytics to unstructured data can help you gain richer, deeper and more accurate insights and a competitive advantage
• The sources of unstructured data, which include email, social media platforms, CRM systems, call center platforms (including notes and speech-to-text transcripts), and web scrapes
• How monitoring the communications of your customers and prospects enables you to make time-sensitive decisions and jump on new business opportunities
Jun 18 2014
48 mins
More from this community:

Business Intelligence and Analytics

  • Hadoop is transforming today's Healthcare industry. In this webinar, Charles Boicey covers the various new use cases made possible as the Hadoop ecosystem matures.

    In 2010 the Clinical Informatics Team at the University of California in Irvine, led by Charles Boicey, looked outside of the conventional Healthcare data ecosystem for new data management solutions - their existing Electronic Health Record and Enterprise Data Warehouse environments no longer met the organization’s needs. They researched "Big Data" technologies in organizations such as Yahoo, LinkedIn, Twitter, and Facebook and concluded that Hadoop could supplement their current ecosystem to solve existing use cases and act as a platform to develop new applications for future solutions and insights.
  • YARN has fundamentally transformed the Hadoop landscape. It has opened Hadoop from a single workload system to one that can now support a multitude of “fit for purpose” processing. In this workshop we will provide an overview of Apache Slider that enables custom applications to run natively in the cluster as a YARN Ready Application. The workshop will include working examples and provide an overview of work being pursued in the community around YARN Docker integration.
  • Effective data governance requires the effective application of people, process, policy and technology to ensure consistent delivery of trusted, connected, and secure data across an enterprise.

    Organizations across all industries are investing in data governance to gain business value from their data to meet industry regulations, reduce the cost of doing business, and grow revenue and profits.

    In this webinar dedicated to data governance, Michael Wodzinski, Director of the Information Architecture team, Lisa Bemis, Director of Master Data, and Fabian Torres, Director, Project Management at Houghton Mifflin Harcourt (HMH), a global leader in publishing, will share their experiences in implementing a data governance program within HMH. Our guest speakers from HMH will discuss some of the unique data management challenges within HMH, and how the data governance program has helped address those issues and open up new opportunities for the company. While walking you through their data governance journey, our guest speakers will offer their insights on how to establish a viable data governance practice in a complex enterprise environment, and share their best practices and lessons learned. David Lyle, VP of Product Strategy at Informatica, will share his observations in the data governance space, and discuss Informatica’s data governance solutions and the thought leadership behind those offerings.
  • Attack Intelligence to Power Tomorrow’s Cyber Response.

    Preparing to combat every threat and vulnerability is a war that no cybersecurity professional can win today. Speed, accuracy and visibility of threats and active attacks are critical to defending against APTs and other sophisticated attacks responsible for today’s headline-grabbing data breaches. The next generation of advanced threat prevention solutions will require a significant shift in how we incorporate threat and attack visibility into everyday security operations, enabling incident responders to identify and stop campaigns as they happen.

    Join us as IDC’s Research Vice President for Security Products Services Charles Kolodgy shares his view of the threat landscape, including how threats are evolving, how cybercriminals are becoming more sophisticated and what new solutions are necessary to combat APTs.
  • Growing installs is the number one mission for any mobile app developer, and any user acquisition strategy generally consists of a mix of organic installs and paid campaigns. But many see these as separate and distinct. Not true. To get the most out of each — especially for smaller developers — it’s critical to understand how paid installs impact organic installs, and vice versa.

    In this webinar, Ian Sefferman of TUNE will share the eye-opening results of a study investigating the correlation between paid campaigns and organic installs (yes, it’s positive), and how this varies depending on the app category and operating system. Christian Calderon of DOTS will dive into the strategies and tactics that increase both paid and organic installs, and how they work together.

    What you’ll learn:
    For every paid install, how many organic installs an app can expect to see
    How the multiplier effect impacts app categories differently
    How organic installs and engaged users affect your paid strategy and spend
    Best-practice examples on what really works to maximize both organic installs and paid campaigns for highest yield

    #paidorganic
    #appmktg

    Speakers:
    Ian Sefferman, GM, App Store Analytics
    Christian Calderon, Head of Marketing, DOTS
  • Based on recent research by analyst Bob Larrivee of AIIM, this webinar will address how organizations can leverage technology to identify, evaluate and optimize business processes to increase operational efficiency.

    Join us as we explore:
    - Drivers for problem-solving, tracking KPIs, process failures and workflow management
    - How technology can reduce errors and exceptions that lead to lost business and non-compliance
    - Increasing visibility to optimize processes, reduce costs and deliver a superior customer experience
  • Scaling multiple databases with a single legacy storage system works well from a cost perspective, but workload conflicts and hardware contention make these solutions an unattractive choice for anything but low-performance applications.

    Attend the webinar to learn about:
    - How SolidFire’s all-flash storage system provides high performance at massive scale for mixed workload processing while simultaneously controlling costs and guaranteeing performance
    - How to deploy four or more database copies using SolidFire’s Oracle Validated Configuration, at a price point at or below the cost of traditional storage systems
    - SolidFire’s Quality of Service (QoS) guarantee; every copy receives dedicated all-flash performance, so IT admins can deliver solutions with confidence and maximize business efficiency
  • In data visualization, we map data values and relationships onto visual dimensions to create a graphical representation for exploration and analysis. How can we best use the power of the human visual system to make these values and relationships clear? Using examples from information design, cartography and data graphics, we will demonstrate how insights from research in color perception, perceptual organization and visual attention have helped define best practices for visual analysis.

    You will learn how to utilize a few perceptual and cognitive building blocks that can inform a wide variety of visualization choices, and to demonstrate how these influenced the design of the Tableau product.
  • Identifying New Revenue Streams with Big Data: Driving New Product Innovation Recorded: Jan 14 2015 47 mins
    Big data enables you to quickly generate totally new insights based on an analysis of all of your structured and unstructured data.

    Analyzing all your data as a single data set regardless of data type can uncover patterns and behaviors that would be impossible to get from traditional business intelligence. Today, companies are using these insights to create innovative products and services that generate new revenue streams.

    In this webcast, you will learn:

    - How to use big data analytics to create and operationalize products and services
    - How to use big data analytics to monetize these new products and services
    - Use cases of companies driving new product innovation and revenue using big data

    It’s time to take big data seriously. Register now to learn more about how you can generate innovative products and services that drive new and immediate revenue streams for your company.
  • How to Avoid Pitfalls in Big Data Recorded: Dec 4 2014 59 mins
    Big data analytics is revolutionizing the way businesses collect, store and, more importantly, analyze data. However, the adoption of a big data analytics solution has its share of failures and false starts.

    Watch this webinar to learn how to navigate the most common obstacles of big data analytics.

    Datameer and MapR have worked with customers to identify and solve the common pitfalls organizations face when deploying Hadoop-based analytics.

    In this webinar, we will show you how to:
    - Find the balance between infrastructure and business use cases
    - Overcome challenges of using multiple tools that address big data analytics
    - Leverage all your resources (data scientists, IT and analysts) most effectively
  • Top 3 Big Data Use Cases In Financial Services Recorded: Nov 18 2014 48 mins
    In financial services, the top big data analytics use cases are customer analytics (understanding the customer journey using data from all customer interaction channels), predicting and avoiding customer churn, and fraud and compliance. The financial and corporate benefits of these use cases range from improved customer retention to hundreds of millions of dollars in incremental revenue and protection of shareholder value.

    In this webinar, learn from big data analytics experts:
    - Top 3 use cases in financial services
    - The importance of applying the appropriate technologies
    - The data driven insights that will give companies a competitive edge
  • The Economics of SQL on Hadoop Recorded: Nov 5 2014 61 mins
    As organizations clamor to utilize their new investments in Hadoop ecosystems AND leverage their existing analytical infrastructures, many rush to integrate SQL as a data access layer to leverage existing skill sets and get started faster.

    However, this approach relegates Hadoop to a data management and processing platform rather than the storage and compute engine optimized for analytical workloads it was purpose-built to be.

    This webinar, featuring EMA and Datameer, will discuss the technical limitations of SQL on Hadoop and propose alternative ways to fully maximize Hadoop investments.

    Leave this webinar understanding:
    - How SQL negates the inherent benefits of Hadoop
    - Why technological paradigm changes can sometimes be good
    - Use cases when SQL on Hadoop makes sense
  • Big Data & Brews: A VC's View on the Big Data Market Recorded: Oct 20 2014 10 mins
    Join two leaders in Hadoop as they "chalkboard" the Hadoop ecosystem and each of its components.
  • What Can Big Data Analytics Find Out About Your Customers? Recorded: Oct 20 2014 5 mins
    Join Matt as he goes over some of the critical functions that big data analytics can optimize at your company.
  • How to Optimize Your Customer Funnel with Big Data Analytics Recorded: Oct 20 2014 3 mins
    Join Karen as she goes over what customer funnel optimization is and how your organization can work to improve it thoroughly.
  • Instant Insights with Visual Data Discovery Recorded: Sep 17 2014 47 mins
    With Datameer’s new visual data discovery capabilities, users can now use visual tools at every step of the analytics process, all within one easy to use spreadsheet environment. Because visualizations that help with data profiling and anomaly detection are automatically generated and available throughout the data discovery process, you save valuable time in preparing and cleansing your data. It also dramatically simplifies data preparation tasks like cleansing, filtering, profiling, enrichment and general data wrangling, reducing the workload on your IT staff and the Hadoop environment. Furthermore, visual data mining tools are now available in both full-screen preview and final results modes, without leaving the spreadsheet interface.

    Attend this webinar with Matt Schumpert, Director of Product Management, to learn how Datameer’s visual data discovery makes it easy to find the right exploration tool for the data and analytics use case at hand.

    Our customers are seeing big data insights in days not months and are able to:
    - Deliver faster time-to-insight, leading to better decisions, without burdening IT
    - Discover correlations, new attributes and behavioral patterns in your raw data sets, and assess their impact to the organization
    - Choose the right tool for your particular analytics user preference: use “groupby & count” like in SQL, or look at charts showing data distribution
    - Explain outcomes and perform segmentation using the built-in clustering algorithms and decision trees
  • Using Big Data to Understand Your Customer's Buying Journey Recorded: Aug 5 2014 48 mins
    Today more than ever, the role of CMOs and digital marketing executives is data-driven. Marketing executives who leverage data to understand prospect and customer behavior have gained an edge over their peers. Big Data enables you to combine the vast amount of customer behavior data generated from mobile, web, social media, transaction systems and ads, and turn it into new insights that drive customer acquisition and retention.

    Join Azita Martin, Chief Marketing Officer at Datameer and Matt Schumpert, Sr. Director of Solutions Engineering, as they discuss and showcase how leading edge companies are leveraging big data to:

    ▪ Combine customer interaction data to understand the customer buying journey
    ▪ Understand high-value customer behavior beyond profile segmentation
    ▪ Identify the most common path to customer churn
    ▪ Perform market basket analysis to help with cross-sell and up-sell
  • Analyzing Unstructured Data in Hadoop Recorded: Jun 18 2014 48 mins
  • Big Data: Power to the User Recorded: Apr 9 2014 44 mins
    What is the value of big data? How does a user get that value?

    Before, analysts had to wait months for IT to deliver a new report or change an existing one. Now, analysts can shrink that time down to days or even minutes. On top of that, analysts can ask questions that were not possible before. In this webinar, we’ll show you how this analysis is possible and the value that customers have achieved.

    In this session, you will learn:
    How analysts get value out of big data
    How to visualize data at every step of analysis
    How analysts can do big data analytics without IT, in one product
  • Customer Case Studies of Self-Service Big Data Analytics Recorded: Feb 19 2014 44 mins
    In the new world of big data, analysts are challenged to ask questions that were never possible before. Self-service tools empower business users to rapidly gather, analyze and visualize data from broad, diverse data sources. Analyzing these sources provides new answers and new business opportunities for those smart enough to answer the new questions. Free up your IT staff so they no longer need to respond to routine report requests. Business users can now rely on the rapid delivery of advanced self-service BI and data visualization capabilities to solve complex problems and capitalize on new opportunities.

    In this session you will learn:
    - Customer examples and return on investment from self-service big data analytics
    - How business analysts can take advantage of Machine Learning
    - Best practices in self-service big data analytics
  • Webcast: Top 3 Things to Consider with Machine Learning on Big Data Recorded: Jan 24 2014 54 mins
    Machine learning is powerful but requires coding and access to all the relevant datasets to get full insights. With new Big Data analytic tools, business users can now use machine learning to gain a competitive edge.

    Based on best practices and customer experiences, join Datameer as we discuss what to look for and what value organizations get out of Machine Learning on Big Data.

    This webinar will provide:

    * an overview of challenges and tools available today
    * use cases for machine learning on Hadoop
    * capabilities to look for
    * comparison of available solutions
The only End-to-End Big Data Analytics Application for Hadoop
Datameer is the only end-to-end big data analytics application purpose-built for the Hadoop ecosystem, designed to make big data easy for everyone. Companies of all sizes like British Telecom, Citibank, Trustev and Workday use Datameer to integrate, analyze and visualize all of their data to get new insights faster than ever. Founded in 2009 by Hadoop veterans, Datameer is headquartered in San Francisco, CA and counts Kleiner Perkins Caufield & Byers, Workday, Citi Ventures, Next World Capital and Software AG among its investors.

  • Title: Analyzing Unstructured Data in Hadoop
  • Live at: Jun 18 2014 6:00 pm
  • Presented by: Matthew Schumpert, Director of Solutions Engineering, Datameer