
Database Management

  • Semantic AI: Bringing Machine Learning and Knowledge Graphs Together
    Kirk Borne, Principal Data Scientist, Booz Allen Hamilton & Andreas Blumauer, CEO and Managing Partner, Semantic Web Company. Recorded: May 23 2018, 64 mins
    Implementing AI applications based on machine learning is a significant topic for organizations embracing digital transformation. By 2020, 30% of CIOs will include AI in their top five investment priorities according to Gartner’s Top 10 Strategic Technology Trends for 2018: Intelligent Apps and Analytics. But to deliver on the AI promise, organizations need to generate good quality data to train the algorithms. Failure to do so will result in the following scenario: "When you automate a mess, you get an automated mess."

    This webinar covers:

    - An introduction to machine learning use cases and challenges, provided by Kirk Borne, Principal Data Scientist at Booz Allen Hamilton and a top data science and big data influencer.
    - How to achieve good data quality based on harmonized semantic metadata presented by Andreas Blumauer, CEO and co-founder of Semantic Web Company and a pioneer in the application of semantic web standards for enterprise data integration.
    - How to apply a combined approach in which semantic knowledge models and machine learning form the basis of your cognitive computing. (See Attachment: The Knowledge Graph as the Default Data Model for Machine Learning)
    - Why a combination of machine and human computation approaches is required, not only from an ethical but also from a technical perspective.
  • Comparing Apache Ignite & Cassandra for Hybrid Transactional Analytical Apps
    Denis Magda, Director of Product Management, GridGain Systems. Recorded: Mar 28 2018, 61 mins
    The 10x growth of transaction volumes, 50x growth of data volumes, and the drive for real-time visibility and responsiveness over the last decade have pushed traditional technologies, including databases, beyond their limits. Your choices are either to buy expensive hardware to accelerate the wrong architecture, or to do what other companies have started to do and invest in technologies built for modern hybrid transactional/analytical processing (HTAP) applications.

    Learn some of the current best practices in building HTAP applications, and the differences between two of the more common technologies companies use: Apache® Cassandra™ and Apache® Ignite™. This session will cover:

    - The requirements for real-time, high volume HTAP applications
    - Architectural best practices, including how in-memory computing fits in and has eliminated tradeoffs between consistency, speed, and scale (see the sketch after this list)
    - A detailed comparison of Apache Ignite and GridGain® for HTAP applications
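
    To give a flavor of what serving both workload types from one in-memory platform can look like, here is a minimal, hypothetical sketch using pyignite, Apache Ignite's Python thin client; the host, port, cache name, and data are illustrative assumptions, not material from the session.

    # Hypothetical sketch: one in-memory cache serves transactional-style
    # point reads/writes and analytical-style scans.
    from pyignite import Client

    client = Client()
    client.connect("127.0.0.1", 10800)  # default thin-client port

    trades = client.get_or_create_cache("trades")
    trades.put(1, "AAPL:100@150.25")  # transactional-style write
    print(trades.get(1))              # low-latency point read

    # Analytical-style pass over the same in-memory data.
    for key, value in trades.scan():
        print(key, value)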

    About the speaker: Denis Magda is the Director of Product Management at GridGain Systems and Vice President of the Apache Ignite PMC. He is an expert in distributed systems and platforms who actively contributes to Apache Ignite and helps companies and individuals deploy it for mission-critical applications. You can be sure to come across Denis at conferences, workshops, and other events, sharing his knowledge about use cases, best practices, and implementation tips and tricks for building efficient applications with in-memory data grids, distributed databases, and in-memory computing platforms, including Apache Ignite and GridGain.

    Before joining GridGain and becoming part of the Apache Ignite community, Denis worked for Oracle, where he led the Java ME Embedded Porting Team, helping bring Java to IoT.
  • Scalable Monitoring for the Growing CERN Infrastructure
    Daniel Lanza Garcia, Big Data Engineer, CERN. Recorded: Mar 28 2018, 45 mins
    When monitoring an increasing number of machines, the infrastructure and tools need to be rethought. A new tool, ExDeMon, has been developed to detect anomalies and trigger actions while performing well on this growing infrastructure. Considerations from its development and implementation will be shared.

    Daniel has been working at CERN for more than three years as a Big Data developer, implementing different tools for monitoring the organisation's computing infrastructure.
  • The Data Lake for Agile Ingest, Discovery, & Analytics in Big Data Environments
    Kirk Borne, Principal Data Scientist, Booz Allen Hamilton. Recorded: Mar 27 2018, 58 mins
    As data analytics becomes more embedded within organizations as an enterprise business practice, the methods and principles of agile processes must also be employed.

    Agile includes DataOps, which refers to the tight coupling of data science model-building and model deployment. Agile can also refer to the rapid integration of new data sets into your big data environment for "zero-day" discovery, insights, and actionable intelligence.

    The Data Lake is an advantageous approach to implementing an agile data environment, primarily because of its focus on "schema-on-read", which skips the laborious, time-consuming, and fragile process of database modeling, refactoring, and re-indexing every time a new data set is ingested.
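
    To make "schema-on-read" concrete, here is a minimal PySpark sketch; the session setup and the path are illustrative assumptions. The structure of a new data set is inferred at read time, so no table modeling or migration has to happen before the first query.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("schema-on-read-demo").getOrCreate()

    # No CREATE TABLE, no refactoring: Spark infers the schema as it reads.
    events = spark.read.json("s3://example-lake/raw/events/")
    events.printSchema()  # discover the shape of the newly ingested data
    events.createOrReplaceTempView("events")
    spark.sql("SELECT COUNT(*) FROM events").show()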

    Another huge advantage of the data lake approach is the ability to annotate data sets and data granules with intelligent, searchable, reusable, flexible, user-generated, semantic, and contextual metatags. This tag layer makes your data "smart" -- and that makes your agile big data environment smart also!
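
    One hypothetical way to picture such a tag layer is a small searchable index of user-generated metatags over data sets; the paths and tags below are invented for illustration.

    # Each data set carries semantic, reusable, user-generated metatags.
    catalog = {
        "s3://example-lake/raw/clickstream/": {"domain": "web", "pii": False},
        "s3://example-lake/curated/customers/": {"domain": "crm", "pii": True},
    }

    def find(**tags):
        """Return every data set whose metatags match all given pairs."""
        return [path for path, meta in catalog.items()
                if all(meta.get(k) == v for k, v in tags.items())]

    print(find(pii=False))  # discover data by meaning, not by storage path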
  • Is the Traditional Data Warehouse Dead?
    James Serra, Data Platform Solution Architect, Microsoft. Recorded: Mar 27 2018, 61 mins
    With new technologies such as Hive LLAP or Spark SQL, do you still need a data warehouse, or can you just put everything in a data lake and report off of that? No! In this presentation, James will discuss why you still need a relational data warehouse and how to use a data lake and an RDBMS data warehouse together to get the best of both worlds.

    James will go into detail on the characteristics of a data lake and its benefits, and explain why data governance tasks are still needed in a data lake. He'll also discuss using Hadoop as the data lake, data virtualization, and the need for OLAP in a big data solution, and he'll put it all together by showing common big data architectures.
  • HDFS on Kubernetes: Lessons Learned
    Kimoon Kim, Software Engineer, Pepperdata. Recorded: Sep 19 2017, 46 mins
    HDFS on Kubernetes: Lessons Learned is a webinar intended for software engineers, developers, and technical leads who develop Spark applications and are interested in running Spark on Kubernetes. Pepperdata has been exploring Kubernetes as a potential Big Data platform with several other companies as part of a joint open source project.

    In this webinar, Kimoon Kim will show you how to: 

    –Run Spark applications natively on Kubernetes
    –Enable Spark on Kubernetes to read and write data securely on HDFS protected by Kerberos (a rough sketch follows this list)
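
    The sketch below gives a rough idea of the first point; the cluster URL, container image, and HDFS path are illustrative assumptions, exact configuration keys vary across Spark releases, and the Kerberos-protected read additionally requires valid credentials or delegation tokens on the driver and executor pods.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("hdfs-on-k8s-demo")
        # Point Spark's scheduler at the Kubernetes API server.
        .master("k8s://https://kubernetes.example.com:6443")
        .config("spark.kubernetes.container.image", "example/spark:latest")
        .getOrCreate()
    )

    # Read a Kerberos-protected HDFS path; this only succeeds if the pods
    # hold valid Kerberos credentials (e.g. from a keytab or tokens).
    df = spark.read.text("hdfs://namenode.example.com:8020/data/events")
    print(df.count())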
  • Data scientists: Can't live with them, can't live without them.
    Wyatt Benno, CEO, DataHero. Recorded: Aug 24 2017, 45 mins
    There has been a flood of publicity around big data, data processing, and the role of predictive analytics in businesses of the future.
    As business operators, how do we get access to these valuable business insights, even when there is no data analyst around to walk us through the results?

    - Should your software emulate a data scientist?
    - Learn about the power of data visualizations.
    - Learn about creating value from disparate data sets.
  • Hunting Criminals with Hybrid Analytics, Semi-supervised Learning, & Feedback
    David Talby, CTO, Pacific AI. Recorded: Aug 23 2017, 62 mins
    Fraud detection is a classic adversarial analytics challenge: As soon as an automated system successfully learns to stop one scheme, fraudsters move on to attack another way. Each scheme requires looking for different signals (i.e. features) to catch; is relatively rare (one in millions for finance or e-commerce); and may take months to investigate a single case (in healthcare or tax, for example) – making quality training data scarce.

    This talk will cover, through a code walk-through, the key lessons learned while building such real-world software systems over the past few years. We'll look for fraud signals in public email datasets, using IPython and popular open-source libraries (scikit-learn, statsmodels, NLTK, etc.) for data science, and Apache Spark as the compute engine for scalable parallel processing.

    David will iteratively build a machine-learned hybrid model, combining features from different data sources and algorithmic approaches to catch diverse aspects of suspect behavior:

    - Natural language processing: finding keywords in relevant context within unstructured text
    - Statistical NLP: sentiment analysis via supervised machine learning
    - Time series analysis: understanding daily/weekly cycles and changes in habitual behavior
    - Graph analysis: finding actions outside the usual or expected network of people
    - Heuristic rules: finding suspect actions based on past schemes or external datasets
    - Topic modeling: highlighting use of keywords outside an expected context
    - Anomaly detection: fully unsupervised ranking of unusual behavior

    Apache Spark is used to run these models at scale – in batch mode for model training and with Spark Streaming for production use. We’ll discuss the data model, computation, and feedback workflows, as well as some tools and libraries built on top of the open-source components to enable faster experimentation, optimization, and productization of the models.
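
    As a toy illustration of the hybrid idea (not the speaker's actual code; the messages and labels are invented), one can combine a text feature with an unsupervised anomaly score and train a single classifier on both:

    import numpy as np
    from sklearn.ensemble import IsolationForest
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    emails = ["wire the funds today", "lunch at noon?", "urgent offshore transfer"]
    labels = [1, 0, 1]  # 1 = suspicious (hypothetical training labels)

    # NLP feature: TF-IDF vectors over the message text.
    text_features = TfidfVectorizer().fit_transform(emails).toarray()

    # Anomaly feature: unsupervised outlier score on the same vectors.
    scores = IsolationForest(random_state=0).fit(text_features).score_samples(text_features)

    # Hybrid model: text features plus the anomaly score as one extra column.
    X = np.hstack([text_features, scores.reshape(-1, 1)])
    clf = LogisticRegression().fit(X, labels)
    print(clf.predict(X))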
  • Toward Internet of Everything: Architectures, Standards, & Interoperability
    Ram D. Sriram, Chief of the Software and Systems Division, IT Lab, National Institute of Standards and Technology. Recorded: Jun 21 2017, 63 mins
    In this talk, Ram will provide a unified framework for Internet of Things, Cyber-Physical Systems, and Smart Networked Systems and Societies, and then discuss the role of ontologies for interoperability.

    The Internet, which spans many networks in a wide variety of domains, is having a significant impact on every aspect of our lives. These networks are currently being extended to have significant sensing capabilities, with the evolution of the Internet of Things (IoT). With additional control, we are entering the era of Cyber-Physical Systems (CPS). In the near future, these networks will go beyond physically linked computers to include multimodal information from biological, cognitive, semantic, and social networks.

    This paradigm shift will involve symbiotic networks of people (social networks), smart devices, and smartphones or mobile personal computing and communication devices that will form smart net-centric systems and societies (SNSS), or the Internet of Everything. These devices, and the network itself, will be constantly sensing, monitoring, interpreting, and controlling the environment.

    A key technical challenge for realizing SNSS/IoE is that the network consists of things (both devices & humans) which are heterogeneous, yet need to be interoperable. In other words, devices and people need to interoperate in a seamless manner. This requires the development of standard terminologies (or ontologies) which capture the meaning and relations of objects and events. Creating and testing such terminologies will aid in effective recognition and reaction in a network-centric situation awareness environment.
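
    For a taste of what such machine-readable terminologies look like, here is a tiny hypothetical sketch using rdflib; the namespace and terms are invented. Ontology-style triples assert what a device is and how it relates to an event, so any party that shares the vocabulary can interpret the data.

    from rdflib import Graph, Literal, Namespace, RDF

    EX = Namespace("http://example.org/iot#")
    g = Graph()

    # "sensor42 is a TemperatureSensor that observed event e1, value 21.5."
    g.add((EX.sensor42, RDF.type, EX.TemperatureSensor))
    g.add((EX.sensor42, EX.observed, EX.e1))
    g.add((EX.e1, EX.value, Literal(21.5)))

    # Any device or service that shares the ontology can query the graph.
    for subject, _, _ in g.triples((None, RDF.type, EX.TemperatureSensor)):
        print(subject)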

    Before joining the Software and Systems Division (his current position), Ram was the leader of the Design and Process group in the Manufacturing Systems Integration Division, Manufacturing Engineering Lab, where he conducted research on standards for interoperability of computer-aided design systems.
  • The Ways Machine Learning and AI Can Fail
    Brian Lange, Partner and Data Scientist, Datascope. Recorded: Apr 13 2017, 48 mins
    Good applications of machine learning and AI can be difficult to pull off. Join Brian Lange, Partner and Data Scientist at data science firm Datascope, as he discusses a variety of ways machine learning and AI can fail (from technical to human factors) so that you can avoid repeating them yourself.
