An insight into the ever-growing data management tasks IT departments face today. This webinar will highlight the challenges that big data and explosive data growth pose for IT departments. It will focus on the problems faced by organizations that now store, process and retain exponentially more data than before, examine how well legacy systems can cope, and offer solutions to these ongoing challenges.
Recorded: Sep 18 2012 | 46 mins
This talk tells the story of the implementation and optimization of a sparse logistic regression algorithm in Spark. I would like to share the lessons I learned and the steps I had to take to improve the speed of execution and convergence of my initial naive implementation. The aim isn’t to convince the audience that logistic regression is great and my implementation is awesome. Rather, the talk will give details about how it works under the hood, along with general tips for implementing an iterative parallel machine learning algorithm in Spark.
The talk is structured as a sequence of “lessons learned” that are shown in the form of code examples building on the initial naive implementation. The performance impact of each “lesson” on execution time and speed of convergence is measured on benchmark datasets.
You will see how to formulate logistic regression in a parallel setting, how to avoid data shuffles, when to use a custom partitioner, how to use the ‘aggregate’ and ‘treeAggregate’ functions, how momentum can accelerate the convergence of gradient descent, and much more. I will assume a basic understanding of machine learning and some prior knowledge of Spark. The code examples are written in Scala, and the code will be made available for each step in the walkthrough.
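To make the parallel formulation concrete, here is a minimal sketch in plain Python (not the speaker's Scala code, and not using Spark itself). It assumes that plain lists stand in for data partitions, that a per-partition gradient sum followed by a merge mimics what an aggregate-style operation would compute, and that a simple momentum update is applied to the summed gradient. All function names, the toy data and the hyperparameters are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, features):
    # Sparse dot product: features and weights are dicts keyed by feature index.
    return sigmoid(sum(w.get(i, 0.0) * x for i, x in features.items()))

def partition_gradient(rows, w):
    # Sum of log-loss gradients over one partition (the "local" step).
    grad = {}
    for features, label in rows:
        err = predict(w, features) - label
        for i, x in features.items():
            grad[i] = grad.get(i, 0.0) + err * x
    return grad

def merge(a, b):
    # Combine two partial gradient sums (the "combine" step).
    out = dict(a)
    for i, g in b.items():
        out[i] = out.get(i, 0.0) + g
    return out

def train(partitions, steps=200, lr=0.5, momentum=0.9):
    n = sum(len(rows) for rows in partitions)
    w, v = {}, {}
    for _ in range(steps):
        total = {}
        for rows in partitions:  # in a cluster, this loop would run in parallel
            total = merge(total, partition_gradient(rows, w))
        for i in set(v) | set(total):
            # Momentum: keep a decaying running velocity of past gradient steps.
            v[i] = momentum * v.get(i, 0.0) - lr * total.get(i, 0.0) / n
            w[i] = w.get(i, 0.0) + v[i]
    return w

# Hypothetical toy data: feature 0 is a bias, feature 1 marks positives, feature 2 negatives.
partitions = [
    [({0: 1.0, 1: 1.0}, 1), ({0: 1.0, 2: 1.0}, 0)],
    [({0: 1.0, 1: 1.0}, 1), ({0: 1.0, 2: 1.0}, 0)],
]
w = train(partitions)
```

In this sketch, `partition_gradient` and `merge` play roughly the roles of the per-partition and combine functions passed to an aggregate-style call. Only the small gradient dictionaries cross partition boundaries, never the rows themselves, which is the property that lets the parallel version avoid shuffling data.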
Lorand is a data scientist working on risk management and fraud prevention for the payment processing system of Zalando, the leading fashion platform in Europe. Previously, Lorand developed highly scalable low-latency machine learning algorithms for real-time bidding in online advertising.
Jean-Frederic Clere, Manager, Software Engineering, Red Hat
You can do a lot with a Raspberry and ASF projects, from a tiny object connected to the internet to a small server application. The presentation will explain and demo the following:
- Raspberry as a small server and captive portal using httpd/tomcat.
- Raspberry as an IoT sensor collecting data and sending it to ActiveMQ.
- Raspberry as a Modbus supervisor controlling an Industruino (Industrial Arduino) and connected to ActiveMQ.
Denis Magda, Director of Product Management, GridGain Systems
The 10x growth in transaction volumes, the 50x growth in data volumes and the drive for real-time visibility and responsiveness over the last decade have pushed traditional technologies, including databases, beyond their limits. Your choices are either to buy expensive hardware to accelerate the wrong architecture, or to do what other companies have started to do and invest in technologies being used for modern hybrid transactional analytical applications (HTAP).
Learn some of the current best practices in building HTAP applications, and the differences between two of the more common technologies companies use: Apache® Cassandra™ and Apache® Ignite™. This session will cover:
- The requirements for real-time, high volume HTAP applications
- Architectural best practices, including how in-memory computing fits in and has eliminated tradeoffs between consistency, speed and scale
- A detailed comparison of Apache Ignite and GridGain® for HTAP applications
About the speaker: Denis Magda is the Director of Product Management at GridGain Systems, and Vice President of the Apache Ignite PMC. He is an expert in distributed systems and platforms who actively contributes to Apache Ignite and helps companies and individuals deploy it for mission-critical applications. You can be sure to come across Denis at conferences, workshops and other events, sharing his knowledge about use cases, best practices, and implementation tips and tricks on how to build efficient applications with in-memory data grids, distributed databases and in-memory computing platforms including Apache Ignite and GridGain.
Before joining GridGain and becoming a part of the Apache Ignite community, Denis worked for Oracle, where he led the Java ME Embedded Porting Team, helping bring Java to IoT.
Subscription businesses can lose the happiest of subscribers because of involuntary churn—that deadly form of attrition that comes from card declines and invoice failures.
Even slight variations in a subscription business’s churn rate can have a significant impact on revenue, so it’s critical to address involuntary churn, and doing so is easier than ever. The latest subscription technology leverages machine learning, which can improve transaction success rates and billing continuity, helping to automatically reduce involuntary churn and boost monthly recurring revenue by an average of 9 percent.
Want to know more about how subscription businesses are making a positive impact on revenue? How can you optimize decline management and revenue recovery strategies based on your own unique business needs? Join our latest VB Live event and you’ll learn how and where to start, plus get a first look at the latest Revenue Recovery Benchmarks, which reveal the powerful impact of machine learning.
Don’t miss out!
Registration is free.
In this webinar, you’ll learn...
* The power of dynamic retry logic, optimized for each individual invoice
* The incremental lift that a well-designed dunning strategy can have on revenue
* The key metrics every subscription business should understand to prevent churn
* How to develop a comprehensive decline management and revenue recovery plan using proven strategies for successful transactions.
* Emma Clark, Director of Product, Recurly
* Devin Brady, Data Scientist, Recurly
* Stewart Rogers, Analyst-at-Large, VentureBeat
* Rachael Brownell, Moderator, VentureBeat
Akmal Chaudhri, Technology Evangelist, GridGain Systems
Attend this session to learn how to easily share state in-memory across multiple Spark jobs, either within the same application or between different Spark applications, using an implementation of the Spark RDD abstraction provided in Apache Ignite. During the talk, attendees will learn in detail how IgniteRDD – an implementation of native Spark RDD and DataFrame APIs – shares the state of the RDD across other Spark jobs, applications and workers. Examples will show how IgniteRDD, with its advanced in-memory indexing capabilities, allows execution of SQL queries many times faster than native Spark RDDs or DataFrames.
Akmal Chaudhri has over 25 years of experience in IT and has previously held roles as a developer, consultant, product strategist and technical trainer. He has worked for several blue-chip companies such as Reuters and IBM, and also the Big Data startups Hortonworks (Hadoop) and DataStax (Cassandra NoSQL Database). He holds a BSc (1st Class Hons.) in Computing and Information Systems, an MSc in Business Systems Analysis and Design and a PhD in Computer Science. He is a Member of the British Computer Society (MBCS) and a Chartered IT Professional (CITP).
When monitoring an increasing number of machines, the infrastructure and tools need to be rethought. A new tool, ExDeMon, for detecting anomalies and raising actions, has been developed to perform well on this growing infrastructure. Considerations from its development and implementation will be shared.
Daniel has been working at CERN for more than 3 years as a Big Data developer, implementing different tools for monitoring the organisation's computing infrastructure.
Kirk Borne, Principal Data Scientist, Booz Allen Hamilton
As data analytics becomes more embedded within organizations, as an enterprise business practice, the methods and principles of agile processes must also be employed.
Agile includes DataOps, which refers to the tight coupling of data science model-building and model deployment. Agile can also refer to the rapid integration of new data sets into your big data environment for "zero-day" discovery, insights, and actionable intelligence.
The Data Lake is an advantageous approach to implementing an agile data environment, primarily because of its focus on "schema-on-read", thereby skipping the laborious, time-consuming, and fragile process of database modeling, refactoring, and re-indexing every time a new data set is ingested.
Another huge advantage of the data lake approach is the ability to annotate data sets and data granules with intelligent, searchable, reusable, flexible, user-generated, semantic, and contextual metatags. This tag layer makes your data "smart" -- and that makes your agile big data environment smart also!
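As a rough illustration of the "schema-on-read" idea described above, the following Python sketch stores raw records untouched and applies a schema only when the data is read. The records, field names and the `read_with_schema` helper are hypothetical and not taken from any particular data lake product.

```python
import json

# Raw records are kept as ingested; no table schema is declared up front.
raw_lake = [
    '{"user": "a1", "amount": "19.90", "country": "DE"}',
    '{"user": "b2", "amount": "5", "device": "mobile"}',  # newer data with different fields
]

def read_with_schema(lines, schema):
    """Apply a schema (field name -> converter) at read time.

    Fields missing from a record become None instead of failing the read.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }

# Two different "views" over the same raw data, each with its own schema.
amounts = list(read_with_schema(raw_lake, {"user": str, "amount": float}))
devices = list(read_with_schema(raw_lake, {"user": str, "device": str}))
```

Because the schema lives with the reader rather than the store, the second record's new `device` field could be ingested without remodelling anything. The cost is that each reader has to handle missing or inconsistent fields itself.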
James Serra, Data Platform Solution Architect, Microsoft
With new technologies such as Hive LLAP or Spark SQL, do you still need a data warehouse or can you just put everything in a data lake and report off of that? No! In the presentation, James will discuss why you still need a relational data warehouse and how to use a data lake and an RDBMS data warehouse to get the best of both worlds.
James will go into detail on the characteristics of a data lake and its benefits and why you still need data governance tasks in a data lake. He'll also discuss using Hadoop as the data lake, data virtualization, and the need for OLAP in a big data solution, and he will put it all together by showing common big data architectures.
Robin Marcenac, Sr. Managing Consultant, IBM, Ross Ackerman, Dir. Digital Support Strategy, NetApp, Alex McDonald, SNIA CSI
Watson is a computer system capable of answering questions posed in natural language. Watson was named after IBM's first CEO, Thomas J. Watson. The computer system was specifically developed to answer questions on the quiz show Jeopardy! (where it beat its human competitors) and was then used in commercial applications, the first of which was helping with lung cancer treatment.
NetApp is now using IBM Watson in Elio, a virtual support assistant that responds to queries in natural language. Elio is built using Watson’s cognitive computing capabilities. These enable Elio to analyze unstructured data by using natural language processing to understand grammar and context, understand complex questions, and evaluate all possible meanings to determine what is being asked. Elio then reasons and identifies the best answers to questions with help from experts who monitor the quality of answers and continue to train Elio on more subjects.
Elio and Watson represent an innovative and novel use of large quantities of unstructured data to help solve problems, on average, four times faster than traditional methods. Join us at this webcast, where we’ll discuss:
• The challenges of utilizing large quantities of valuable yet unstructured data
• How Watson and Elio continuously learn as more data arrives, and navigate an ever-growing volume of technical information
• How Watson understands customer language and provides understandable responses
Learn how these new and exciting technologies are changing the way we look at and interact with large volumes of traditionally hard-to-analyze data.
After the webcast, check out the Q&A blog: http://www.sniacloud.com/?p=296
Dr. Umesh Hodeghatta Rao, CTO, Nu-Sigma Analytics Labs
AI is changing the way organizations do business and how they interact with customers. AI continues to drive the change. Deep Learning and Natural Language Processing will become standards in AI solutions. Deep Learning is based on brain simulations and uses deep neural networks. AlphaGo is the first AI system to defeat a professional human Go player, the first program to defeat a Go world champion, and arguably the strongest Go player in history. Baidu improved speech recognition from 89% to 99% using Deep Learning. Every AI and machine learning scientist now needs to know Deep Learning tools in their current job.
In this session, we will discuss what Deep Learning is and why it is gaining popularity. We will explain AI solutions using Deep Learning with a practical example. Deep Learning has an edge over other machine learning techniques because its performance keeps improving as the volume of data increases. Further, Deep Learning enables hierarchical feature learning, i.e. learning feature hierarchies.
Jen Stirrup, Gordon Tredgold, Joanna Schloss, & Lyndsay Wise
Join Jen Stirrup (Director, DataRelish), Gordon Tredgold (CEO & Founder, Leadership Principles LLC), Joanna Schloss (Data Expert) and Lyndsay Wise (Solution Director, Information Builders) as they discuss what it takes to move a business from needing analytics to leveraging analytics successfully.
In this talk we will see how, whether we are building our first product or revamping an existing one, Embedded Analytics can help us solve real customer problems, which builds product value and creates a competitive differentiator to propel our business forward.
Additionally, we'll look closely at how Embedded Analytics differs from Traditional Business Intelligence and at the factors and trends driving Embedded Analytics.
Charalampos Xanthopoulakis, Data Visualizations Architect
Selling your house in financial crisis-stricken Greece is to this day a great ordeal. When faced with this challenge, I was baffled by the scarcity of conclusive data on land value in my birthplace, Thessaloniki. Embarking on a personal mission, I collected and processed more than 10K online housing ads together with open data, and managed to render an insightful interactive visualization of actual real estate values at borough and city block level, which was published through the Greek media. Join me on this thought process journey to find out how to
o Gather vast online data with simple scripting
o Combine your data with open data into meaningful structures
o Create interactive data visualizations that have an actual impact @ infographeo.com
This will be an interactive session, so please feel free to bring your thoughts and questions to share during the session.
Dan Sommer, Senior Director, Market Intelligence Lead at Qlik
It can be hard to keep up with the rapidly changing BI landscape. But it doesn't have to be. Reserve your spot at Qlik's annual BI Trends Webinar.
In this global webinar live replay, we’ll reveal the top BI Trends for the coming year and how they can help you transform your data. Join Qlik’s Global Market Intelligence lead and former Gartner analyst Dan Sommer to learn why 2018 is the year for the “desilofication of data.”
Recent events like the Equifax data leak and new regulations like the EU's General Data Protection Regulation have increased the urgency for further change in the BI landscape and to move data out of silos.
What is the right strategy and framework?
How can you easily move from "all data," to "combinations of data," to "data insights"?
Can data literacy and augmented intelligence create a data-driven culture?
The volume of data available to decision makers continues to be massive, and is growing faster than our ability to consume it. Learn how to move your data out of silos and turn your data into insights.
RIDE is an all-in-one, multi-user, multi-tenant, secure and scalable platform for developing and sharing Data Science and Analytics, Machine Learning (ML) and Artificial Intelligence (AI) solutions in R, Python and SQL.
RIDE supports developing in notebooks, an editor, RMarkdown, Shiny apps, Bokeh and other frameworks. Supported by R-Brain’s optimized kernels, R and Python 3 have full language support, IntelliSense, a debugger and a data view. Autocomplete and content assistant are available for SQL and Python 2 kernels. Spark (standalone) and TensorFlow images are also provided.
Using Docker to manage workspaces, this platform provides a more secure and stable development environment for users, with powerful admin controls over resources and level of access, including memory usage, CPU usage, and idle time.
The latest stable version of the IDE is always available to all users without any need for upgrades or additional DevOps work. R-Brain also delivers customized development environments for organizations, which can set up their own Docker registry to use their customized images.
The RIDE Platform is a turnkey solution that increases efficiency in your data science projects by enabling data science teams to work collaboratively without a need to switch between tools. Explore and visualize data, share analyses, all in one IDE with root access, connection to git repositories and databases.
Andy Kriebel, Eva Murray, Paul Banoub, Emma Whyte, Josh Tapley, Simon Beaumont
Hear from our expert panel how they built a strong analytics culture in their organisations to enable data-driven decision-making.
We have invited Paul Banoub (UBS, United Kingdom), Emma Whyte (The Information Lab, United Kingdom), Simon Beaumont (NHS), and Josh Tapley (Comcast, United States) to discuss the following topics with us:
- How to find talented people and how to keep them engaged, challenged and motivated
- How to establish the right environment with processes and systems that foster innovation, learning, collaboration and analytical excellence
- How to set up best practices and governance while staying responsive to the organisation's need for information and insights RIGHT NOW
- How to make self-service analytics a success
During the panel discussion you will have the chance to ask questions and get answers from our experts.
Presenters: Andy Kriebel, Head Coach at The Data School & Eva Murray, Head of BI and Tableau Evangelist at Exasol
Panel: Paul Banoub (Director, Analytics as a Service at UBS Investment Bank), Emma Whyte (Head of Centre of Excellence and Customer Advocacy at The Information Lab), Josh Tapley (Director, Data Visualization at Comcast)
Hélène Lyon, IBM, Distinguished Engineer, IBM Z Solutions Architect
IT is a key player in the digital and cognitive transformation of business processes, delivering solutions for improved business value with analytics. This session will explain, step by step, the journey to secure production while adopting new analytics technologies that leverage mainframe core business assets.
Rob Anderson, Head of Field Operations (Privitar), Tim Hickman, Associate (White & Case)
Today's modern businesses gain a competitive edge and remain innovative by using advanced analytics and machine learning. Utilising big data can build customer loyalty by improving personalised marketing campaigns, optimise fraud detection, and improve products and services through advanced testing. However, the data sets required for advanced analytics are often sensitive, containing personal customer information, and therefore come with an inherent set of privacy risks and concerns.
This roundtable will cover a few key questions on data utility and privacy:
- In what ways can advanced analytics help businesses gain a competitive edge?
- What is defined as sensitive data?
- Will GDPR affect the way you're allowed to use customer data?
- What opportunities are there to utilise sensitive data?
Unlocking the data’s true value is a challenge, but there is a range of tools and techniques that can help. This live discussion will focus on the data analytics landscape, compliance considerations, and opportunities for improving data utility in 2018 and beyond.
- A view of the data protection landscape
- How to remain compliant with GDPR when using customer data
- Use cases for advanced analytics and machine learning
- Opportunities for maximising data utility in 2018
Mixed reality is the result of blending the physical world with the digital world. Though it is a relatively new technology and its adoption is still in the initial stages, Mixed Reality devices and applications are projected to be the next technological era after smartphones.
The webinar will give a brief overview of potential Mixed Reality use cases that provide not only an immersive experience but also revenue streams for their creators.
Managing and analyzing data to inform business decisions
Data is the foundation of any organization, and it is therefore paramount that it is managed and maintained as a valuable resource.
Subscribe to this channel to learn best practices and emerging trends in a variety of topics including data governance, analysis, quality management, warehousing, business intelligence, ERP, CRM, big data and more.