Escape the Data Torture! Finding the Purpose of Big Data
At present, we fit hypotheses to data: a flawed approach designed to obtain a result regardless of its veracity. We have all heard the saying: if you torture the data long enough, it will tell you what you want to hear. What we need to do is inject scientific rigour back into the data analysis process. Finding spurious correlations and short-term patterns does little to build and develop value. In this session, we will look at the requirements around testing and causal statements, and how to ensure that we obtain valuable information from ever-increasing volumes of data. We will also examine the main flaws and irregular processes used to make data fit an assumption and produce a result, however little this actually helps in the long run.
Following this session, you will know the basic statistical requirements for making sense of data. We will detail the scientific process for testing data, so you can design tests that find causal, not merely correlational, relationships. In short, we teach you to take the where, what and when, and use them to discover the how and why.
Recorded Sep 10, 2013 · 46 mins
Hélène Lyon, IBM, Distinguished Engineer, IBM Z Solutions Architect
IT is a key player in the digital and cognitive transformation of business processes, delivering solutions for improved business value with analytics. This session will explain, step by step, the journey to secure production while adopting new analytics technologies that leverage core mainframe business assets.
Data Scientists are rare and highly valued individuals, and for good reason: making sense of data and using machine learning libraries requires an unusual blend of advanced skills. Why is it, then, that Data Scientists spend the majority of their time getting data ready for models, and only a fraction doing the high-value work?
In this talk we introduce the concept of Data Fabric, a new way to provide a self-service model for data, where data scientists can easily discover, curate, share, and accelerate data analysis using Python, R, and visualization tools, no matter where the data is managed, no matter the structure, and no matter the size.
We will talk through the role of Apache Arrow, the in-memory columnar data standard that is accelerating analytics for GPU-based processing, as well as the role of Pandas and Arrow in providing unprecedented speed in accessing datasets from Python.
Jim Jagielski, Sr Distinguished Engineer at Capital One and Vice Chairman and Founder of the Apache Software Foundation
Just as Enterprise companies are leveraging Open Source technologies, they are also learning how Open Source drives Innovation. The software development paradigms of Open Source can be used in-house or in Enterprise companies to gain all the advantages of Open Source.
Join this presentation to learn how corporations are using Inner Source, and the lessons-learned of successful open source project management.
Jim Jagielski is a world-recognized figure in the Open Source community, with insight, understanding, experience and expertise in Open Source development, foundations, governance and licensing.
Public cloud deployments have become irresistible in terms of flexibility, low barriers to entry, security, and developer friendliness. But the sheer inertia of traditional data lakes makes them difficult to transition to the cloud. In this talk we'll look at examples of how leading companies have made the transition using open source technologies and hybrid strategies.
Instead of following a "lift and shift" strategy for moving data lake workloads to the cloud, there are considerations unique to the cloud that should be weighed alongside traditional approaches to compute (e.g., GPU, FPGA), storage (object store vs. file store), integrations, and security.
Viewers will take away techniques they can immediately apply to their own projects.
Maloy Manna, PM Engineering, AXA Data Innovation Lab, Paris
The concept of the data lake evolved to address the challenges and opportunities of managing big data.
Organizations are investing massive amounts of time and money to upgrade existing data infrastructures and build data lakes whether on-premises or in the cloud.
This talk will discuss architectures and design options for implementing data lakes with open source tools. Also covered are the challenges of upgrading and migrating from existing data warehouses, metadata management, supporting self-service, and managing production deployments.
Hélène Lyon, IBM, Distinguished Engineer, IBM Z Solutions Architect
As an Enterprise customer, you are potentially using IBM Z in a hybrid cloud implementation. Let's understand how to benefit from cloud access to mainframe data without moving it outside Z, thereby improving security, reducing integration challenges and answering your GDPR auditor's needs.
Iver van de Zand will present and demo the latest SAP innovations for analytics in the cloud. Key themes are live connectivity and the closed loop of combined business intelligence, planning and predictive analytics, all in one environment and fully prepared for big data.
David Siegel, Blockchain, decentralization and business agility expert
Still confused about this whole Blockchain thing? Interested in investing in digital currencies, but not sure where to start? Want to get a better idea of the threats and opportunities?
David Siegel is a Blockchain, decentralization and business agility expert who has been a high-level management and strategy consultant to companies like Sony, Hewlett Packard, Amazon, NASA, Intel, and many start-ups. David has been praised for explaining Blockchain in the simplest and most engaging way.
What you will learn:
-What is Bitcoin?
-What is the blockchain?
-What is Ethereum? What is Ether?
-What is a distributed application?
-What is a smart contract?
-What is a triple ledger?
-What about identity and security?
-What business models are at risk?
-What are the opportunities?
-What should we do?
Vivek Bajaj, Global VP of Solutions for IBM Financial Services
Today the payments industry faces a rebirth by necessity. Financial institutions process massive volumes of customer and payments transaction data, much of it unstructured and untapped.
Cognitive systems have the ability to understand, reason and learn. In financial services, applying cognitive capabilities to real-world payments issues, such as safer and faster payments, is yielding significant results. Furthermore, risk and compliance and segment-of-one engagement are areas where the ROI is tremendous when advanced analytics and artificial intelligence are applied together.
Learn from real world use cases of how financial institutions globally have gained significant competitive advantage by becoming a truly Cognitive Bank.
HDFS on Kubernetes: Lessons Learned is a webinar presentation intended for software engineers, developers, and technical leads who develop Spark applications and are interested in running Spark on Kubernetes. Pepperdata has been exploring Kubernetes as a potential Big Data platform with several other companies as part of a joint open source project.
In this webinar, Kimoon Kim will show you how to:
–Run Spark applications natively on Kubernetes
–Enable Spark on Kubernetes to read and write data securely on HDFS protected by Kerberos
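A minimal sketch of such a native submission, assuming a reachable Kubernetes API server and a pre-built Spark container image; `<k8s-apiserver>` and `<spark-image>` are placeholders, not real values from the webinar:

```shell
# Sketch: submitting a Spark application natively to Kubernetes.
# The driver and executors run as pods created by Spark itself.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<spark-image> \
  local:///opt/spark/examples/jars/spark-examples.jar
```

Secure HDFS access additionally requires distributing Kerberos credentials to the driver and executor pods, which is the harder part the session covers.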
Dr. Umesh Hodeghatta Rao, CTO, Nu-Sigma Analytics Labs
Data visualization must be intuitive in order for non-IT business leaders to see data patterns. Representing data in a graphical or pictorial format is easy, but constructing the data in the best and most logical way can be tricky.
In this session, Umesh will talk about how to represent data simply so you can make quicker and better business decisions. He will walk through several data visualization techniques using business cases and examples. By the end of the session, you will not only know different data visualization techniques, but also understand the circumstances under which each technique should be used and the best way to represent particular data sets for different business cases.
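As a toy illustration of matching chart type to question (the regions and revenue figures below are invented), a comparison across categories is usually clearest as a bar chart:

```python
# Toy sketch: a categorical comparison rendered as a bar chart.
# The data is invented for the example.
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

regions = ["North", "South", "East", "West"]
revenue = [120, 95, 143, 88]

fig, ax = plt.subplots()
ax.bar(regions, revenue)            # comparison across categories -> bars
ax.set_ylabel("Revenue ($k)")
ax.set_title("Quarterly revenue by region")
```

A trend over time would instead call for a line chart, and a part-to-whole question for a stacked bar: the question drives the chart, not the other way around.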
Predictive Analytics - everyone is talking about it and many organisations claim to be doing it. But are they? And what insights do they gain to then make tactical or strategic changes? How can analysts work with decision makers by sharing results in a visually effective and meaningful way while also informing them about possible courses of action?
This webinar is presented by Andy Kriebel, Head Coach at the Data School and Eva Murray, Tableau Evangelist at Exasol. Our guest speaker on Predictive Analytics is Benedetta Tagliaferri, Consulting Analyst at The Information Lab.
The webinar will look at some examples of predictive analysis and will show data visualization examples that are actionable and can drive further questions and discussions in an organisation.
Carl Edwards, BI Consultant, Brett Churchill, BI Consultant
Looking to take your graphs to the next level? Want to make sure you choose the right visualization? Plagued by the challenges of geospatial heat maps?
Get your questions ready and join this session where data experts Carl and Brett will go over the common questions they get asked and answer all the data visualization issues you've been plagued with, including how to:
-Use location-based data to put your visualization on the map
-Uncover new relationships, patterns and opportunities
-Identify emerging trends
-Answer comparative business questions with set analysis
-Understand best practices for creating an aesthetically-pleasing and useful visualization
When analysis needs to be used by decision makers who didn't create it, the communication of the information and the message it conveys becomes critical. There is a plethora of ways to lay out reports and dashboards, even within a single company.
Enter the SUCCESS formula, that “lightbulb” moment.
Introduced by the IBCS Association (International Business Communication Standards), the SUCCESS formula provides conceptual, perceptual and semantic rules that enable faster, better, and less costly results in all stages of business communication and decision-making.
This webinar will introduce the 7 Rules of SUCCESS, a toolkit that helps analysts design visualisations for greater reach and better decisions among their target audience.
The webinar will also cover Philips' journey to implementing IBCS principles in its global "Accelerate!" initiative.
Marwa Ayad Mohamed (Founder of YourChildCode, Team Lead Software Engineer, Women Techmakers Cairo Lead)
TensorFlow is an open source software library for numerical computation and machine learning.
Join this session where Marwa will discuss:
-Introduction to Artificial intelligence, machine learning and deep learning
-Sample of machine learning applications
-The TensorFlow story, its model, and Windows installation steps, with an object recognition demo
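The agenda above can be previewed with a minimal sketch of TensorFlow as a numerical-computation library (the matrices are invented for illustration; real sessions build far larger models):

```python
# Minimal sketch: TensorFlow as a numerical-computation library.
# Multiplying a matrix by the identity leaves it unchanged.
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
identity = tf.constant([[1.0, 0.0], [0.0, 1.0]])
product = tf.matmul(a, identity)   # a tensor operation, evaluated eagerly
```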
Arinze Akutekwe, PhD Data Scientist, BAS EMEIA – Intelligent Enterprise - Analytics at Fujitsu
Artificial intelligence has greatly changed the way we live since the 20th century. It involves the science and engineering of making machines intelligent and autonomous using computer programs.
The processing power of computers has been increasing exponentially while the cost of processors and storage keeps falling. This has made possible research and development efforts in AI areas, such as deep learning, that were once thought to be impossible.
In this webinar, we will examine current methods, application domains of specific methods, their impacts on our daily lives and try to answer questions on ethics of applying these technologies.
Fraud detection is a classic adversarial analytics challenge: As soon as an automated system successfully learns to stop one scheme, fraudsters move on to attack another way. Each scheme requires looking for different signals (i.e. features) to catch; is relatively rare (one in millions for finance or e-commerce); and may take months to investigate a single case (in healthcare or tax, for example) – making quality training data scarce.
This talk is a code walk-through covering the key lessons learned while building such real-world software systems over the past few years. We'll look for fraud signals in public email datasets, using IPython and popular open-source libraries (scikit-learn, statsmodels, nltk, etc.) for data science and Apache Spark as the compute engine for scalable parallel processing.
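A minimal sketch of one such signal, assuming scikit-learn and a tiny hand-labeled toy corpus (the emails and labels below are invented; real training data would be far larger):

```python
# Toy sketch: keyword-driven classification of suspect emails.
# The corpus and labels are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = [
    "quarterly report attached for review",
    "lunch meeting moved to noon",
    "wire the funds offshore before the audit",
    "delete these records before the compliance check",
]
labels = [0, 0, 1, 1]  # 1 = suspect

# TF-IDF features feeding a linear classifier
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(emails, labels)

pred = clf.predict(["move the funds offshore and delete the audit records"])
```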
David will iteratively build a machine-learned hybrid model – combining features from different data sources and algorithmic approaches, to catch diverse aspects of suspect behavior:
- Natural language processing: finding keywords in relevant context within unstructured text
- Statistical NLP: sentiment analysis via supervised machine learning
- Time series analysis: understanding daily/weekly cycles and changes in habitual behavior
- Graph analysis: finding actions outside the usual or expected network of people
- Heuristic rules: finding suspect actions based on past schemes or external datasets
- Topic modeling: highlighting use of keywords outside an expected context
- Anomaly detection: Fully unsupervised ranking of unusual behavior
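The last item in the list above can be sketched with scikit-learn and synthetic two-dimensional "behavior" data (everything below is invented for illustration):

```python
# Toy sketch: fully unsupervised ranking of unusual behavior.
# Synthetic data: 200 "habitual" points plus two extreme outliers.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
habitual = rng.normal(0.0, 1.0, size=(200, 2))
unusual = np.array([[8.0, 8.0], [9.0, -7.0]])
X = np.vstack([habitual, unusual])

model = IsolationForest(random_state=0).fit(X)
scores = model.score_samples(X)   # lower score = more anomalous
ranking = np.argsort(scores)      # indices, most anomalous first
```

No labels are needed: the forest isolates rare points quickly, which is why this signal works when quality training data is scarce.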
Apache Spark is used to run these models at scale – in batch mode for model training and with Spark Streaming for production use. We’ll discuss the data model, computation, and feedback workflows, as well as some tools and libraries built on top of the open-source components to enable faster experimentation, optimization, and productization of the models.
Prof. Dr. Michael Feindt, Founder & Chief Scientific Officer, Blue Yonder
Artificial Intelligence (AI) is not a technology for the future; it’s a huge business opportunity for today. But how can your organisation become a trailblazer for AI innovation, transforming the way you work to deliver immediate – and lasting – bottom line value?
Former CERN scientist, Prof. Dr. Michael Feindt, is one of the brightest minds in Machine Learning. Join him for a 30-minute masterclass in how to apply AI to your business.
You’ll learn how AI can:
•Make sense of market and customer complexity, to deliver quick and effective decisions every single day
•Increase workforce productivity to improve output and staff morale
•Enhance decision-making and forecasting accuracy, for operational efficiency and improved productivity
•Be implemented into your business quickly and easily, with minimal disruption
Michael will also share real-life examples of how international businesses are using AI as a transformation tool, from his experience as founder of market-leading AI solution provider, Blue Yonder.