Democratizing Data and Predictive Analytics While Ensuring Governance & Transparency

As organizations empower more users to leverage advanced and predictive analytics, "democratizing" their data and bringing insights to the masses through interactive visual dashboards, they also need to provide greater data transparency, collaboration, and governance to earn broader acceptance and trust of the data.

In our latest DSC webinar, we will discuss how the combination of governed self-service data preparation, advanced analytics, and data visualization can help organizations:

• Make informed decisions faster by empowering data architects and non-technical business analysts to visually communicate data flows and business logic.
• Improve transparency by removing the "black box" and providing actionable, accurate insights that all parties involved can trust.
• Ensure governance and data quality through self-service data provisioning and embedded business rules and data sourcing.
• Enable anyone to quickly visualize and securely share analysis through cloud-based dashboards.

Speakers:
Dan Donovan, Lead Technology Evangelist, Lavastorm Analytics
Paul Lilford, Director - Technology Partners, Tableau Software

Hosted by:
Bill Vorhies, Editorial Director -- Data Science Central
Recorded Oct 22 2015 63 mins

  • Semantic AI: Bringing Machine Learning and Knowledge Graphs Together Recorded: May 23 2018 64 mins
    Kirk Borne, Principal Data Scientist, Booz Allen Hamilton & Andreas Blumauer, CEO, Managing Partner Semantic Web Company
    Implementing AI applications based on machine learning is a significant topic for organizations embracing digital transformation. By 2020, 30% of CIOs will include AI in their top five investment priorities according to Gartner’s Top 10 Strategic Technology Trends for 2018: Intelligent Apps and Analytics. But to deliver on the AI promise, organizations need to generate good quality data to train the algorithms. Failure to do so will result in the following scenario: "When you automate a mess, you get an automated mess."

    This webinar covers:

    - An introduction to machine learning use cases and challenges provided by Kirk Borne, Principal Data Scientist at Booz Allen Hamilton and top data science and big data influencer.
    - How to achieve good data quality based on harmonized semantic metadata presented by Andreas Blumauer, CEO and co-founder of Semantic Web Company and a pioneer in the application of semantic web standards for enterprise data integration.
    - How to apply a combined approach when semantic knowledge models and machine learning build the basis of your cognitive computing. (See Attachment: The Knowledge Graph as the Default Data Model for Machine Learning)
    - Why a combination of machine and human computation approaches is required, not only from an ethical but also from a technical perspective.
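
    The "knowledge graph as data model" idea can be sketched in miniature: store facts as subject-predicate-object triples and use them to harmonize the labels an ML pipeline trains on. A hedged illustration (all names and triples below are invented, not from the talk):

```python
# Hypothetical miniature of a knowledge graph: facts stored as
# subject-predicate-object triples (entities and edges invented here).
TRIPLES = [
    ("NYC", "sameAs", "New York City"),
    ("Big Apple", "sameAs", "New York City"),
    ("New York City", "locatedIn", "USA"),
]

def canonical(label):
    """Resolve a raw label to its canonical form via 'sameAs' edges."""
    for s, p, o in TRIPLES:
        if p == "sameAs" and s == label:
            return o
    return label

# Three inconsistent raw labels collapse into one consistent training
# class -- the "good quality data" the algorithms need.
raw_labels = ["NYC", "Big Apple", "New York City"]
print({canonical(l) for l in raw_labels})
```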
  • Comparing Apache Ignite & Cassandra for Hybrid Transactional Analytical Apps Recorded: Mar 28 2018 61 mins
    Denis Magda, Director of Product Management, GridGain Systems
    The 10x growth of transaction volumes, 50x growth of data volumes, and the drive for real-time visibility and responsiveness over the last decade have pushed traditional technologies, including databases, beyond their limits. Your choices are either to buy expensive hardware to accelerate the wrong architecture, or to do what other companies have started to do: invest in the technologies used for modern hybrid transactional/analytical processing (HTAP) applications.

    Learn some of the current best practices in building HTAP applications, and the differences between two of the more common technologies companies use: Apache® Cassandra™ and Apache® Ignite™. This session will cover:

    - The requirements for real-time, high volume HTAP applications
    - Architectural best practices, including how in-memory computing fits in and has eliminated tradeoffs between consistency, speed and scale
    - A detailed comparison of Apache Ignite and GridGain® for HTAP applications

    About the speaker: Denis Magda is the Director of Product Management at GridGain Systems, and Vice President of the Apache Ignite PMC. He is an expert in distributed systems and platforms who actively contributes to Apache Ignite and helps companies and individuals deploy it for mission-critical applications. You can be sure to come across Denis at conferences, workshops and other events, sharing his knowledge about use cases, best practices, and implementation tips and tricks for building efficient applications with in-memory data grids, distributed databases, and in-memory computing platforms, including Apache Ignite and GridGain.

    Before joining GridGain and becoming a part of Apache Ignite community, Denis worked for Oracle where he led the Java ME Embedded Porting Team -- helping bring Java to IoT.
  • How to Share State Across Multiple Apache Spark Jobs using Apache Ignite Recorded: Mar 28 2018 42 mins
    Akmal Chaudhri, Technology Evangelist, GridGain Systems
    Attend this session to learn how to easily share state in-memory across multiple Spark jobs, either within the same application or between different Spark applications using an implementation of the Spark RDD abstraction provided in Apache Ignite. During the talk, attendees will learn in detail how IgniteRDD – an implementation of native Spark RDD and DataFrame APIs – shares the state of the RDD across other Spark jobs, applications and workers. Examples will show how IgniteRDD, with its advanced in-memory indexing capabilities, allows execution of SQL queries many times faster than native Spark RDDs or Data Frames.

    Akmal Chaudhri has over 25 years experience in IT and has previously held roles as a developer, consultant, product strategist and technical trainer. He has worked for several blue-chip companies such as Reuters and IBM, and also the Big Data startups Hortonworks (Hadoop) and DataStax (Cassandra NoSQL Database). He holds a BSc (1st Class Hons.) in Computing and Information Systems, MSc in Business Systems Analysis and Design and a PhD in Computer Science. He is a Member of the British Computer Society (MBCS) and a Chartered IT Professional (CITP).
  • Scalable Monitoring for the Growing CERN Infrastructure Recorded: Mar 28 2018 45 mins
    Daniel Lanza Garcia, Big Data Engineer, CERN
    When monitoring an increasing number of machines, the infrastructure and tools need to be rethought. A new tool, ExDeMon, has been developed to detect anomalies and trigger actions, and to perform well on this growing infrastructure. Considerations from its development and implementation will be shared.

    Daniel has been working at CERN for more than 3 years as a Big Data developer, implementing different tools for monitoring the organisation's computing infrastructure.
  • The Data Lake for Agile Ingest, Discovery, & Analytics in Big Data Environments Recorded: Mar 27 2018 58 mins
    Kirk Borne, Principal Data Scientist, Booz Allen Hamilton
    As data analytics becomes more embedded within organizations as an enterprise business practice, the methods and principles of agile processes must also be employed.

    Agile includes DataOps, which refers to the tight coupling of data science model-building and model deployment. Agile can also refer to the rapid integration of new data sets into your big data environment for "zero-day" discovery, insights, and actionable intelligence.

    The Data Lake is an advantageous approach to implementing an agile data environment, primarily because of its focus on "schema-on-read", thereby skipping the laborious, time-consuming, and fragile process of database modeling, refactoring, and re-indexing every time a new data set is ingested.

    Another huge advantage of the data lake approach is the ability to annotate data sets and data granules with intelligent, searchable, reusable, flexible, user-generated, semantic, and contextual metatags. This tag layer makes your data "smart" -- and that makes your agile big data environment smart also!
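
    The tag layer described above can be sketched in a few lines. A minimal, hypothetical illustration (the catalog paths and tags are invented for the example):

```python
# A toy "smart tag layer": each data set in the lake carries
# user-generated, searchable metatags, so discovery becomes a tag
# query instead of a schema migration.
catalog = {}  # data lake path -> set of semantic/contextual tags

def register(path, tags):
    """Annotate a data set (or granule) with metatags on ingest."""
    catalog.setdefault(path, set()).update(tags)

def discover(*tags):
    """Return every data set annotated with all of the given tags."""
    wanted = set(tags)
    return sorted(p for p, t in catalog.items() if wanted <= t)

register("s3://lake/clicks/2018-03-27/", {"clickstream", "raw", "daily"})
register("s3://lake/orders/2018-03-27/", {"orders", "raw", "daily"})

# Both new data sets are findable on day zero, with no schema change.
print(discover("raw", "daily"))
```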
  • Is the Traditional Data Warehouse Dead? Recorded: Mar 27 2018 61 mins
    James Serra, Data Platform Solution Architect, Microsoft
    With new technologies such as Hive LLAP or Spark SQL, do you still need a data warehouse or can you just put everything in a data lake and report off of that? No! In the presentation, James will discuss why you still need a relational data warehouse and how to use a data lake and an RDBMS data warehouse to get the best of both worlds.

    James will go into detail on the characteristics of a data lake and its benefits and why you still need data governance tasks in a data lake. He'll also discuss using Hadoop as the data lake, data virtualization, and the need for OLAP in a big data solution, and he will put it all together by showing common big data architectures.
  • HDFS on Kubernetes: Lessons Learned Recorded: Sep 19 2017 46 mins
    Kimoon Kim, Pepperdata Software Engineer
    HDFS on Kubernetes: Lessons Learned is a webinar presentation intended for software engineers, developers, and technical leads who develop Spark applications and are interested in running Spark on Kubernetes. Pepperdata has been exploring Kubernetes as a potential Big Data platform with several other companies as part of a joint open source project.

    In this webinar, Kimoon Kim will show you how to: 

    –Run Spark applications natively on Kubernetes
    –Enable Spark on Kubernetes to read and write data securely on HDFS protected by Kerberos
  • Data scientists: Can't live with them, can't live without them. Recorded: Aug 24 2017 45 mins
    Wyatt Benno, CEO, DataHero
    There has been a flood of publicity around big data, data processing, and the role of predictive analytics in businesses of the future.
    As business operators, how do we get access to these valuable business insights, even when there is no data analyst around to walk us through the results?

    - Should your software emulate a data scientist?
    - Learn about the power of data visualizations.
    - Learn about creating value from dispersed data sets.
  • Hunting Criminals with Hybrid Analytics, Semi-supervised Learning, & Feedback Recorded: Aug 23 2017 62 mins
    David Talby, CTO, Pacific AI
    Fraud detection is a classic adversarial analytics challenge: As soon as an automated system successfully learns to stop one scheme, fraudsters move on to attack another way. Each scheme requires looking for different signals (i.e. features) to catch; is relatively rare (one in millions for finance or e-commerce); and may take months to investigate a single case (in healthcare or tax, for example) – making quality training data scarce.

    This talk will cover, through a code walk-through, the key lessons learned while building such real-world software systems over the past few years. We'll look for fraud signals in public email datasets, using IPython and popular open-source libraries (scikit-learn, statsmodels, nltk, etc.) for data science, and Apache Spark as the compute engine for scalable parallel processing.

    David will iteratively build a machine-learned hybrid model – combining features from different data sources and algorithmic approaches, to catch diverse aspects of suspect behavior:

    - Natural language processing: finding keywords in relevant context within unstructured text
    - Statistical NLP: sentiment analysis via supervised machine learning
    - Time series analysis: understanding daily/weekly cycles and changes in habitual behavior
    - Graph analysis: finding actions outside the usual or expected network of people
    - Heuristic rules: finding suspect actions based on past schemes or external datasets
    - Topic modeling: highlighting use of keywords outside an expected context
    - Anomaly detection: Fully unsupervised ranking of unusual behavior

    Apache Spark is used to run these models at scale – in batch mode for model training and with Spark Streaming for production use. We’ll discuss the data model, computation, and feedback workflows, as well as some tools and libraries built on top of the open-source components to enable faster experimentation, optimization, and productization of the models.
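
    The hybrid idea, blending a heuristic signal with an anomaly score into a single ranking, can be sketched without any of the heavy machinery. Everything below (keywords, weights, thresholds) is invented for illustration and is not the talk's actual model:

```python
import statistics

# Invented keyword list standing in for the "heuristic rules" signal.
SUSPECT_KEYWORDS = {"offshore", "invoice", "urgent", "wire"}

def keyword_score(text):
    """Fraction of suspect keywords present in the text (heuristic signal)."""
    words = set(text.lower().split())
    return len(words & SUSPECT_KEYWORDS) / len(SUSPECT_KEYWORDS)

def anomaly_score(value, history):
    """Z-score of today's activity against a user's own history (time-series signal)."""
    mu, sigma = statistics.mean(history), statistics.pstdev(history)
    return 0.0 if sigma == 0 else abs(value - mu) / sigma

def hybrid_score(text, todays_emails, history, w_kw=0.5, w_anom=0.5):
    """Weighted blend of the two signals; z-score capped at 3 then scaled to [0, 1]."""
    anom = min(anomaly_score(todays_emails, history) / 3, 1.0)
    return w_kw * keyword_score(text) + w_anom * anom

# A suspicious message sent during an unusual burst of activity ranks high.
score = hybrid_score("urgent wire to offshore account",
                     todays_emails=40, history=[5, 6, 4, 7, 5])
print(round(score, 2))
```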
  • Toward Internet of Everything: Architectures, Standards, & Interoperability Recorded: Jun 21 2017 63 mins
    Ram D. Sriram, Chief of the Software and Systems Division, IT Lab at National Institute of Standards and Technology
    In this talk, Ram will provide a unified framework for Internet of Things, Cyber-Physical Systems, and Smart Networked Systems and Societies, and then discuss the role of ontologies for interoperability.

    The Internet, which has spanned several networks in a wide variety of domains, is having a significant impact on every aspect of our lives. These networks are currently being extended to have significant sensing capabilities, with the evolution of the Internet of Things (IoT). With additional control, we are entering the era of Cyber-physical Systems (CPS). In the near future, the networks will go beyond physically linked computers to include multimodal-information from biological, cognitive, semantic, and social networks.

    This paradigm shift will involve symbiotic networks of people (social networks), smart devices, and smartphones or mobile personal computing and communication devices that will form smart net-centric systems and societies (SNSS) or Internet of Everything. These devices – and the network -- will be constantly sensing, monitoring, interpreting, and controlling the environment.

    A key technical challenge for realizing SNSS/IoE is that the network consists of things (both devices & humans) which are heterogeneous, yet need to be interoperable. In other words, devices and people need to interoperate in a seamless manner. This requires the development of standard terminologies (or ontologies) which capture the meaning and relations of objects and events. Creating and testing such terminologies will aid in effective recognition and reaction in a network-centric situation awareness environment.

    Before joining the Software and Systems Division (his current position), Ram was the leader of the Design and Process group in the Manufacturing Systems Integration Division, Manufacturing Engineering Lab, where he conducted research on standards for interoperability of computer-aided design systems.
  • The Ways Machine Learning and AI Can Fail Recorded: Apr 13 2017 48 mins
    Brian Lange, Partner and Data Scientist, Datascope
    Good applications of machine learning and AI can be difficult to pull off. Join Brian Lange, Partner and Data Scientist at data science firm Datascope, as he discusses a variety of ways machine learning and AI can fail (from technical to human factors) so that you can avoid repeating them yourself.
  • Logistics Analytics: Predicting Supply-Chain Disruptions Recorded: Feb 16 2017 47 mins
    Dmitri Adler, Chief Data Scientist, Data Society
    If a volcano erupts in Iceland, why is Hong Kong your first supply chain casualty? And how do you figure out the most efficient route for bike share replacements?

    In this presentation, Chief Data Scientist Dmitri Adler will walk you through some of the most successful use cases of supply-chain management, the best practices for evaluating your supply chain, and how you can implement these strategies in your business.
  • Unlock real-time predictive insights from the Internet of Things Recorded: Feb 16 2017 60 mins
    Sam Chandrashekar, Program Manager, Microsoft
    Continuous streams of data are generated in every industry from sensors, IoT devices, business transactions, social media, network devices, clickstream logs etc. Within these streams of data lie insights that are waiting to be unlocked.

    This session with several live demonstrations will detail the build out of an end-to-end solution for the Internet of Things to transform data into insight, prediction, and action using cloud services. These cloud services enable you to quickly and easily build solutions to unlock insights, predict future trends, and take actions in near real-time.

    Samartha (Sam) Chandrashekar is a Program Manager at Microsoft. He works on cloud services to enable machine learning and advanced analytics on streaming data.
  • Bridging the Data Silos Recorded: Feb 15 2017 48 mins
    Merav Yuravlivker, Chief Executive Officer, Data Society
    If a database is filled automatically, but it's not analyzed, can it make an impact? And how do you combine disparate data sources to give you a real-time look at your environment?

    Chief Executive Officer Merav Yuravlivker discusses how companies are missing out on some of their biggest profits (and how some companies are making billions) by aggregating disparate data sources. You'll learn about data sources available to you, how you can start automating this data collection, and the many insights that are at your fingertips.
  • Strategies for Successful Data Preparation Recorded: Feb 14 2017 33 mins
    Raymond Rashid, Senior Consultant Business Intelligence, Unilytics Corporation
    Data scientists know that data visualizations don't, unfortunately, materialize out of thin air. Some of the most vital preparation work, and some of the most dangerous moments, happen in the ETL process.

    Join Ray to learn the best strategies that lead to successful ETL and data visualization. He'll cover the following and what it means for visualization:

    1. Data at Different Levels of Detail
    2. Dirty Data
    3. Restartability
    4. Processing Considerations
    5. Incremental Loading
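
    Restartability and incremental loading reinforce each other. A hedged, toy sketch (the checkpoint store and the load step are stand-ins, not a real ETL tool):

```python
# Persist a high-water mark after each committed batch, so a restarted
# ETL run picks up exactly where the last successful one stopped.
state = {"last_loaded_id": 0}  # stand-in for a checkpoint table/file

def incremental_load(rows, batch_size=2):
    """Load only rows newer than the watermark, committing per batch."""
    pending = [r for r in rows if r["id"] > state["last_loaded_id"]]
    loaded = []
    for i in range(0, len(pending), batch_size):
        batch = pending[i:i + batch_size]
        loaded.extend(batch)                       # stand-in for the real load
        state["last_loaded_id"] = batch[-1]["id"]  # commit the watermark
    return loaded

rows = [{"id": n} for n in range(1, 6)]
print(len(incremental_load(rows)))  # first run loads all 5 rows
print(len(incremental_load(rows)))  # rerun loads 0: nothing new, no duplicates
```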

    Ray Rashid is a Senior Business Intelligence Consultant at Unilytics, specializing in ETL, data warehousing, data optimization, and data visualization. He has expertise in the financial, manufacturing and pharmaceutical industries.
  • HPE ALM Standardization as a Precursor for Data Warehousing Recorded: Feb 9 2017 59 mins
    Tuomas Leppilampi, Assure
    Agenda:
    Data warehousing at a glance
    Wild West vs Enterprise HPE ALM Template
    Planning and configuring the template
    Customer use case: Standardization project walkthrough
    How to maintain a standardized environment
    Next steps with HPE ALM
  • Building Enterprise Scale Solutions for Healthcare with Modern Data Architecture Recorded: Nov 10 2016 47 mins
    Ramu Kalvakuntla, Sr. Principal, Big Data Practice, Clarity Solution Group
    We are all aware of the challenges enterprises are having with growing data and siloed data stores. Businesses are not able to make reliable decisions with untrusted data, and on top of that, they don't have access to all the data within and outside their enterprise to stay ahead of the competition and make key decisions for their business.

    This session will take a deep dive into the challenges Healthcare businesses are facing today, as well as how to build a Modern Data Architecture using emerging technologies such as Hadoop, Spark, NoSQL datastores, MPP data stores, and scalable, cost-effective cloud solutions such as AWS, Azure and BigStep.
  • Data at the corner of SAP and AWS Recorded: Nov 9 2016 48 mins
    Frank Stienhans, CTO, Ocean9
    Past infrastructures provided compute, storage and network enabling static enterprise deployments which changed every few years. This talk will analyze the consequences of a world where production SAP and Spark clusters including data can be provisioned in minutes with the push of a button.

    What does it mean for the IT architecture of an enterprise? How to stay in control in a super agile world?
  • 3 Critical Data Preparation Mistakes and How to Avoid Them Recorded: Oct 20 2016 32 mins
    Mark Vivien, Business Development, Big Data
    Whether you're just starting out or a seasoned solution architect, developer, or data scientist, there are key mistakes that you've probably made in the past, may be making now, or could make in the future. These same mistakes are likely impacting your company's overall success with its analytics program.

    Join us for our upcoming webinar, 3 Critical Data Preparation Mistakes and How to Avoid Them, as we discuss 3 of the most critical, fundamental pitfalls and more!

    • Importance of early and effective business partner engagement
    • Importance of business context to governance
    • Importance of change and learning to your development methodology
  • Practical Data Cleaning Recorded: Oct 13 2016 38 mins
    Lee Baker, CEO, Chi-Squared Innovations
    The basics of data cleaning are remarkably simple, yet few take the time to get organized from the start.

    If you want to get the most out of your data, you're going to need to treat it with respect, and by getting prepared and following a few simple rules your data cleaning processes can be simple, fast and effective.

    The Practical Data Cleaning webinar is a thorough introduction to the basics of data cleaning and takes you through:

    • Data Collection
    • Data Cleaning
    • Data Classification
    • Data Integrity
    • Working Smarter, Not Harder
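
    A toy illustration of those steps (records and rules invented for the example), showing collection, cleaning, and an integrity check in one pass:

```python
# Raw "collected" records with the usual problems: stray whitespace,
# inconsistent casing, a duplicate entry, and a missing value.
raw = [
    {"name": "  Alice ", "age": "34"},
    {"name": "BOB",      "age": "41"},
    {"name": "  Alice ", "age": "34"},   # duplicate entry
    {"name": "Carol",    "age": ""},     # missing value
]

def clean(records):
    seen, out = set(), []
    for r in records:
        name = r["name"].strip().title()           # normalise text fields
        age = int(r["age"]) if r["age"] else None  # fix types, flag missing data
        key = (name, age)
        if key in seen:                            # drop exact duplicates
            continue
        seen.add(key)
        out.append({"name": name, "age": age})
    return out

cleaned = clean(raw)
print(len(cleaned))  # 3 unique, normalised records remain
assert all("name" in r and "age" in r for r in cleaned)  # integrity check
```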