
Data Management

  • Long-term Data Retention: Challenges, Standards and Best Practices Simona Rabinovici-Cohen, IBM, Phillip Viana, IBM, Sam Fineberg Recorded: Feb 16 2017 61 mins
    The demand for digital data preservation has increased drastically in recent years. Maintaining a large amount of data for long periods of time (months, years, decades, or even forever) becomes even more important given government regulations such as HIPAA, Sarbanes-Oxley, OSHA, and many others that define specific preservation periods for critical records.

    While the move from paper to digital information over the past decades has greatly improved information access, it has complicated information preservation. This is due to many factors, including digital format changes, media obsolescence, media failure, and loss of contextual metadata. The Self-contained Information Retention Format (SIRF) was created by SNIA to facilitate long-term data storage and preservation. SIRF can be used with disk, tape, and cloud-based storage containers, and is extensible to new storage technologies. It provides an effective and efficient way to preserve and secure digital information for many decades, even with the ever-changing technology landscape.
    Join this webcast to learn:
    • Key challenges of long-term data retention
    • How the SIRF format works and its key elements
    • How SIRF supports different storage containers - disks, tapes, CDMI and the cloud
    • Availability of Open SIRF

    SNIA experts who developed the SIRF standard will be on hand to answer your questions.
  • Logistics Analytics: Predicting Supply-Chain Disruptions Dmitri Adler, Chief Data Scientist, Data Society Recorded: Feb 16 2017 47 mins
    If a volcano erupts in Iceland, why is Hong Kong your first supply chain casualty? And how do you figure out the most efficient route for bike share replacements?

    In this presentation, Chief Data Scientist Dmitri Adler will walk you through some of the most successful use cases of supply-chain management, the best practices for evaluating your supply chain, and how you can implement these strategies in your business.
  • Unlock real-time predictive insights from the Internet of Things Sam Chandrashekar, Program Manager, Microsoft Recorded: Feb 16 2017 60 mins
    Continuous streams of data are generated in every industry from sensors, IoT devices, business transactions, social media, network devices, clickstream logs, and more. Within these streams of data lie insights that are waiting to be unlocked.

    This session, which includes several live demonstrations, details the build-out of an end-to-end solution for the Internet of Things that transforms data into insight, prediction, and action using cloud services. These cloud services enable you to quickly and easily build solutions to unlock insights, predict future trends, and take actions in near real-time.

    Samartha (Sam) Chandrashekar is a Program Manager at Microsoft. He works on cloud services to enable machine learning and advanced analytics on streaming data.
  • Machine Learning towards Precision Medicine Paul Hellwig, Director, Research & Development, Elsevier Health Analytics Recorded: Feb 16 2017 47 mins
    Medicine is complex. Correlations between diseases, medications, symptoms, lab data and genomics are of a complexity that humans can no longer fully comprehend. Machine learning methods are required to help mine these correlations. But a purely technological or algorithm-driven approach will not suffice. We need to get physicians and other domain experts on board, and we need to gain their trust in the predictive models we develop.

    Elsevier Health Analytics has developed a first version of the Medical Knowledge Graph, which identifies correlations (ideally: causations) between diseases, and between diseases and treatments. On a dataset comprising 6 million patient lives we have calculated 2000+ models predicting the development of diseases. Every model adjusts for ~3000 covariates. Models are based on linear algorithms. This allows a graphical visualization of correlations that medical personnel can work with.
  • Bridging the Data Silos Merav Yuravlivker, Chief Executive Officer, Data Society Recorded: Feb 15 2017 48 mins
    If a database is filled automatically, but it's not analyzed, can it make an impact? And how do you combine disparate data sources to give you a real-time look at your environment?

    Chief Executive Officer Merav Yuravlivker discusses how companies are missing out on some of their biggest profits (and how some companies are making billions) by aggregating disparate data sources. You'll learn about data sources available to you, how you can start automating this data collection, and the many insights that are at your fingertips.
  • Comparison of ETL v Streaming Ingestion, Data Wrangling in Machine/Deep Learning Kai Waehner, Technology Evangelist, TIBCO Recorded: Feb 15 2017 45 mins
    A key task in creating appropriate analytic models in machine learning or deep learning is the integration and preparation of data sets from various sources such as files, databases, big data stores, sensors or social networks. This step can take up to 50% of the whole project.

    This session compares alternative techniques for preparing data, including extract-transform-load (ETL) batch processing, streaming analytics ingestion, and data wrangling within visual analytics. Various options and their trade-offs are shown in live demos using different advanced analytics technologies and open source frameworks such as R, Python, Apache Spark, Talend or KNIME. The session also discusses how this relates to visual analytics, and best practices for how the data scientist and business user should work together to build good analytic models.

    Key takeaways for the audience:
    - Learn various options for preparing data sets to build analytic models
    - Understand the pros and cons and the targeted persona for each option
    - See different technologies and open source frameworks for data preparation
    - Understand the relation to visual analytics and streaming analytics, and how these concepts are actually leveraged to build the analytic model after data preparation
  • Strategies for Successful Data Preparation Raymond Rashid, Senior Consultant Business Intelligence, Unilytics Corporation Recorded: Feb 14 2017 33 mins
    As data scientists know, data visualizations don't materialize out of thin air. One of the most vital preparation steps, and one of the most dangerous moments, happens in the ETL process.

    Join Ray to learn the best strategies that lead to successful ETL and data visualization. He'll cover the following and what it means for visualization:

    1. Data at Different Levels of Detail
    2. Dirty Data
    3. Restartability
    4. Processing Considerations
    5. Incremental Loading

    Ray Rashid is a Senior Business Intelligence Consultant at Unilytics, specializing in ETL, data warehousing, data optimization, and data visualization. He has expertise in the financial, manufacturing and pharmaceutical industries.
  • Data Science Apps: Beyond Notebooks with Apache Toree, Spark and Jupyter Gateway Natalino Busa, Head of Applied Data Science, Teradata Recorded: Feb 14 2017 48 mins
    Jupyter notebooks are transforming the way we look at computing, coding and problem solving. But is this the only “data scientist experience” that this technology can provide?

    In this webinar, Natalino will sketch how you could use Jupyter to create interactive and compelling data science web applications and provide new ways of data exploration and analysis. In the background, these apps are still powered by well understood and documented Jupyter notebooks.

    Natalino will present an architecture composed of four parts: a Jupyter server-only gateway, a Scala/Spark Jupyter kernel, a Spark cluster and an Angular/Bootstrap web application.
  • Visualization: A tool for knowledge Luis Melgar, Visual Reporter at Univision News Recorded: Feb 14 2017 49 mins
    Over the last few decades, concepts such as Big Data and Data Visualization have become more popular and more present in our daily lives. But what is visualization?

    Visualization is an intellectual discipline that makes it possible to generate knowledge through visual forms. As in every other field, there are good and bad practices that can help consumers or mislead them.

    In this webinar, we will address:

    - What data visualization is and why it’s important
    - How to choose the right graphic forms to represent complex information
    - Interactivity and new narratives
    - What tools can be used
  • How to Setup and Manage a Corporate Self Service Analytics Environment Ronald van Loon, Top Big Data and IoT influencer and Ian Macdonald, Principal Technologist (Pyramid Analytics) Recorded: Feb 14 2017 48 mins
    As companies face the challenges arising from a surge in the number of customer interactions and data, it can be difficult to successfully manage the vast quantities of information and still provide a positive customer experience. It is incumbent upon businesses to create a consumer-centric experience that is powered by (predictive) analytics.

    Adopting a data-driven approach through a corporate self-service analytics (SSA) environment is integral to strengthening your data and analytics strategy.


    During the webinar, speakers Ronald van Loon & Ian Macdonald will:

    • Expand on the benefits of a corporate SSA environment
    • Define how your business can successfully manage a corporate SSA environment
    • Present supportive case studies
    • Demonstrate practical examples of analytic governance in an SSA environment using BI Office from Pyramid Analytics
    • Discuss practical tips on how to get started
    • Cover how to avoid common pitfalls associated with an SSA environment

    Stay tuned for a Q&A with speaker Ronald van Loon and domain expert Ian Macdonald, Principal Technologist, Pyramid Analytics.
