How to Get Started With NLP

Natural Language Processing (NLP), the branch of machine learning and AI that bridges the gap between human language and computer understanding, is all the rage right now. NLP was once a relatively niche topic, but in the past few years landmark new models and applications have brought it to the center stage of real-world enterprise data science and AI.

This webinar will give data scientists a framework for getting started with NLP projects. It will go over:
• What exactly NLP is and how it’s used
• How to clean and pre-process text for machine learning projects
• An overview of some of the main NLP algorithms and how they work
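
As a rough illustration of the second point, the sketch below shows one common way to clean and pre-process raw text before feeding it to a model. It is not taken from the webinar: the function names `clean_text` and `bag_of_words`, the tiny `STOP_WORDS` set, and the sample sentence are all assumptions made for this example, and a real project would likely use a fuller stop-word list and a dedicated tokenizer.

```python
import re
import string
from collections import Counter
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "in", "this"}


def clean_text(text: str) -> List[str]:
    """Lowercase, drop URLs and ASCII punctuation, split on whitespace, remove stop words."""
    text = text.lower()
    # Replace anything that looks like a URL with a space.
    text = re.sub(r"https?://\S+", " ", text)
    # Delete ASCII punctuation characters.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    return [token for token in tokens if token not in STOP_WORDS]


def bag_of_words(tokens: List[str]) -> Counter:
    """Count token occurrences, a simple numeric representation of a document."""
    return Counter(tokens)


sample = "This webinar is GREAT: see https://example.com, it covers NLP!"
tokens = clean_text(sample)
counts = bag_of_words(tokens)
print(tokens)
```

The resulting token counts are the kind of simple feature representation that many classic NLP algorithms take as input.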

Katie Gross is a Lead Data Scientist at Dataiku, where she helps clients across industries develop AI solutions using Dataiku DSS. Previously, she worked as a data scientist at a marketing science firm, Schireson, and did freelance data science work for IBM and a dating app, Radiate. Prior to her data science life, Katie spent three years as a CPG consultant at Nielsen. Katie holds a BA in Economics from Colgate University.
Recorded May 19 2020 66 mins
Presented by
Katie Gross, Lead Data Scientist @ Dataiku

  • Portfolio Construction and Optimization Dec 16 2020 7:00 pm UTC 60 mins
    Suresh Vadakath, Financial Services Sales Engineer at Dataiku
    In this talk, Suresh Vadakath, financial services sales engineer at Dataiku, will demonstrate how he leverages Dataiku DSS to mine data using Principal Component Analysis (PCA) to evaluate equities for portfolio construction. From there, he will review how to balance risk versus return while being ESG-conscious in portfolio optimization.

    Dataiku is a leading end-to-end, collaborative data science platform that enables technical and non-technical users to collaborate on building data science and analytics projects to aid data-driven decision making across the enterprise.
  • AI-fuelled Social Listening Dec 16 2020 9:00 am UTC 60 mins
    Dhiman Dey, Clearpeaks, John Savio, Clearpeaks, Sid Bhatia, Dataiku
    Social media platforms see millions of messages posted every day. How can marketing professionals track and analyse this deluge of information to monitor their company's brand equity?

    Join our webinar with leading AI firm Clearpeaks, who will explain how AI can fuel your Social Listening efforts with a concrete use case on Twitter!
  • Standardization and Improvements of Data Analytics Projects Dec 8 2020 10:00 am UTC 60 mins
    Xavier Maréchal (Reacfin), Samuel Mahy (Reacfin), Julien Antunes Mendes (Reacfin)
    Standardization and Improvements of Data Analytics Projects for Financial Institutions

    Data Analytics is a hot topic for many financial institutions. Making the most of their data and becoming data-driven companies is a strategic differentiator.

    In this webinar, we identify practical difficulties in running relevant data analytics projects in financial institutions. Starting from typical projects (e.g. product pricing and behavioral modeling in banks and insurance), we explore some of these difficulties and provide practical solutions to implement a relevant data science pipeline in financial institutions. We build a standardized approach along with the following steps:
    - Business problem framing
    - Data Management

    We also place additional focus on communication around data analytics projects and on governance aspects. We show how data science platforms can help make this process more robust, decreasing risks and increasing the efficiency and added value of data analytics projects.

    - Xavier Maréchal, CEO at Reacfin
    - Samuel Mahy, Director at Reacfin
    - Julien Antunes Mendes, Manager at Reacfin

    Please be aware that by registering for this webinar, you agree to have your personal information shared with both partners Dataiku and Reacfin. They may contact you with information that could be of interest to you.
  • Navigating the Tricky Process of Becoming A Data Scientist Dec 4 2020 7:00 pm UTC 75 mins
    Dr. Natalie Morse (Data Scientist @ BMW)
    Tentative Schedule: (EST)

    2:00pm: Dataiku x Bots & AI Intro
    2:05pm: Navigating the Tricky Process of Becoming A Data Scientist w/ BMW
    2:45pm: Q&A

    Talk Abstract:

    So you want to be a data scientist? Seems like everyone is typing that into Indeed. What does it actually mean to land your first role? How do you transition to data science from a different field? What does it take to be successful once you’re hired?
    I’m not going to bore you with the tech stack requirements (although we’ll touch on this), but I’ll mainly focus on what I’ve learned through my own experience breaking into the field, and now as a hiring manager for data science interns.

    Speaker Bio:

    Dr. Natalie Morse is a data scientist at BMW in South Carolina, USA. She works within their innovation and research group to develop technology for the BMW Group. In addition to this role, she works as a graduate coach helping others navigate the tricky grad school world. Her goals are to bring more diverse voices to academia and tech.

    Disclaimer: All views, thoughts, & opinions expressed in the webinar belong solely to the panelists, & not to the panelists’ employer, organization, committee, other group or individual.
  • Demo webinar: 4 pillars of advanced analytics Recorded: Dec 2 2020 32 mins
    Sofiane Fessi, Sales Engineering Director, Dataiku
    Get one step further in advanced analytics with a short overview of 4 features in DSS: data exploration, data preparation, analytics automation, and a bit of AutoML, all in 30 minutes, with an in-depth demo in our solution, DSS.
  • Migrating Away From Excel: Use Case Deep Dive & Q&A Recorded: Nov 24 2020 31 mins
    Will Nowak, Senior Solutions Engineer, Dataiku
    It's no secret that Excel poses massive limitations for organizations as they look to most effectively utilize data, with its lack of functions for size and volume, auditability, and automated workflows. In this session, we'll provide you with a solution to bypass the frustration. Enter Dataiku as an alternative, and learn how to work on big data as easily as small data. In this session, our financial services expert and solutions engineer, Will Nowak, will address this key challenge through use cases, notable examples, and industry observations from our customers in the space.
  • [Dataiku x Bots & AI] Data Science in Compliance and Fraud Detection Recorded: Nov 18 2020 49 mins
    Harry Lu (Data Analytics Manager @ Spotify)
    Tentative Schedule: (EST)

    2:00pm: Intro
    2:05pm: Data Science in Compliance and Fraud Detection w/ Spotify
    2:45pm: Q&A

    Talk Abstract:

    Data Science is an emerging function in a variety of industries and a greater number of data scientists have begun working on personalization, recommendations, or sales optimizations.
    The cost of compliance has also been expanding in most industries and especially in the technology sector. This is a result of the public’s shift in attention from traditional frauds to antitrust, conflict of interest, and data privacy. While there are plenty of opportunities to leverage DS/ML to solve such problems, the complex nature of such compliance and fraud detection issues stymies data practitioners from being able to grow the data science practice. The “language barrier” in communication between the company’s DS, business, and compliance teams appears to hide the low-hanging fruit.
    In this talk, Harry will share his experience in connecting data science with compliance, examples of the DS/ML use cases, and key takeaways for our two audience groups (Data Scientists and Business Stakeholders).
    • Hidden Data Science Opportunity in Compliance and Finance
    • Career Journey: from a fraud investigator to a data scientist
    • Opportunities that often get overlooked in compliance/fraud functions
    • How to speak two languages: recommendations on connecting data science resources with compliance business

    Disclaimer: All views, thoughts, & opinions expressed in the webinar belong solely to the panelists, & not to the panelists’ employer, organization, committee, other group or individual.
  • Dataiku Demo Days Ep.4: Optimizing Preventative Maintenance Recorded: Nov 17 2020 48 mins
    Aashish Majethia, Solutions Engineer, Dataiku
    Leveraging AI is an efficient way to provide real-time visibility into the production process to reduce downtime for maintenance and costs for efficient operations. Join us as we walk through how sensor data can be transformed into timely insights via predictive maintenance with automated insight improvement.

    During this webinar, we will:
    - Determine when purchased equipment might fail, in order to deploy resources to service customers
    - Look at root cause analysis and model drift
  • The Human Role in AutoML Recorded: Nov 12 2020 44 mins
    Timm Grosser, Senior Analyst, Business Applications Research Center (BARC)
    Will AI replace the Data Scientist?

    AI promises a lot of potential for speeding up processes and making tools, functions, even entire workflows easier to use. This is valid throughout the entire analytics process, from model creation to model operation.

    During the webinar, we will cover:
    1) The potential and impacts of AI
    2) An illustration of these using the example of the Advanced Analytics process
    3) An outlook on future application scenarios of AI
  • ML-based Fraud Detection and Prevention in Healthcare Recorded: Nov 6 2020 66 mins
    Grant Case, Solution Engineering Director
    Fraud in the healthcare industry is on the rise globally and APAC is no exception. There has been a considerable increase in fraud/deceptive activities in the APAC region owing to the growing penetration of the internet and an increase in the use of mobile internet.
    Preventing fraud has the potential to make medicine better, more affordable, and more accessible.

    During the webinar, we will cover the following:

    - Define what healthcare fraud is and evaluate what can be done to detect and prevent it
    - Take a deep dive into 4 different options to combat fraud
    - Finally, see how to combine the traditional and machine-learning-based methods within a Machine Learning Framework

    Participate in a data science project showcase followed by Q&A with one of our Sales Engineering directors.
  • Hype and Reality of AI and Data Science in Data-Driven Companies Recorded: Nov 5 2020 121 mins
    Manuel Nitzsche/ Dataiku, Clarissa Vogelbacher/ ITM Predictive GmbH, Frank Oechsle/ Esentri
    What does the transformation into a data-driven company look like? How is a data science project implemented, and for which use cases? After the one-hour presentation with experts on these topics, we will give you an exclusive screening of the film "Data Science Pioneers" about the reality and challenges of a data scientist and the opportunities data science offers for companies and our everyday lives.
  • How Can You Build Lasting Customer Engagement Through the Use of Data? Recorded: Nov 5 2020 44 mins
    Alexia Dumas (LineUp 7), Alexis Kesseler (LineUp 7)
    If we say Customer Capital, Customer Assets, Customer Knowledge, Personalization, Engagement... does that ring a bell?

    Join us at the November 5 webinar to discover how LineUP 7, a data marketing agency, supports its clients in designing personalized, high-performing programs that re-engage your customers for the long term, all thanks to optimal use of data!

    - How to use and capitalize on data to engage your customers
    - Which data levers to activate to optimize your ROI
    - How to consolidate your customer base and optimize your customer experience
    - How the use of data can be a major asset in meeting your business needs

    - Alexia Dumas, Consulting Manager at LineUP 7
    - Alexis Kesseler, Data Scientist at LineUP 7

    By registering for this webinar, you agree that your information will be shared with Dataiku's partner, LineUP 7.
  • Roll out insights to the whole business using Data Science Platform Recorded: Nov 5 2020 53 mins
    Slava Razbash
    You might have an excellent data science team, but how do they productionalize their insights so that they are accessible by all stakeholders, all the time? If your team is emailing spreadsheets with confidential information around the company, it's time for a better solution.
    Slava will present how teams can use Dataiku DSS to securely share insights with stakeholders via a centralized platform.
  • Understand the US Elections with Data Science Feat. Jupiter Asset Management Recorded: Oct 29 2020 24 mins
    Leo Murison, Data Scientist at Jupiter Asset Management
    The leader of the free world will be elected in just under a month’s time. While sophisticated models from the Economist and FiveThirtyEight both put Joe Biden in front, it is by no means a certainty.

    In this talk, we demonstrate an end-to-end project in Data Science Studio (DSS) that can deliver timely insights into the presidential races in each state.
    We show how DSS can be used to:
    + Collect
    + Transform
    + Present data
    Throughout, stakeholders retain the ability to customise specific attributes. Lastly, we illustrate how DSS can be used in conjunction with other common technologies, simplifying how data science teams can interact with other areas of the business.

    About the speaker:
    Leo Murison is a data scientist at Jupiter Asset Management, a leading active fund management house with ~£56 billion of assets under management. Since joining in September 2019, he has helped produce a number of highly predictive machine learning models leveraging alternative data, including the first productionised model using Dataiku. Previously, Leo worked in the data science team at DAZN, a global sports streaming service with millions of global subscribers, where he developed models to predict customer behaviour. Leo holds a degree in Neuroscience from Edinburgh University.
  • Dataiku Demo Days Ep.3: Sales Forecasting and Geocoding Recorded: Oct 28 2020 44 mins
    Emma Irwin, Sales Engineer, Dataiku & Claude Perdigou, Senior Product Manager, Dataiku
    Understanding the location data of sales and having the ability to accurately forecast revenue are critical components for a business’s success. Join us for Demo Days where we’ll show you how you can build predictive models to predict revenue for the coming days or weeks, and understand, optimize, and visualize your data by location in order to optimize business practices and streamline your day-to-day operations.

    Dataiku Demo Days is a series of expert-led demos on various high-value AI use cases, such as driving efficiencies in the data-to-insights process and maximizing campaign impact with AutoML. These digestible sessions are designed to help jumpstart your organizations’ data efforts and inject agility at every step of the process.
  • Fraud Detection: How to Operationalize Your Models? Recorded: Oct 27 2020 49 mins
    Alexandre Hubert, Sales Engineering Director
    We are pleased to bring you the second installment of a two-part webinar series on Fraud Detection. During the first webinar, we addressed why fraud hasn’t been solved yet, the need to move beyond a rule-based approach, and how to navigate the creative thinking of fraudsters.

    During this webinar, we will look at fraud detection from a platform perspective and discover how to integrate ML techniques into the existing rule-based approach within an ML Framework, and how to feed your fraud detection pipeline with the right set of algorithms in order to ease the implementation of your use cases.
  • Fraud Detection: Why It Hasn't Been Solved Yet? Recorded: Oct 20 2020 26 mins
    Alexandre Hubert, Sales Engineering Director
    Global fraud in 2019 was nearly $60 billion, demonstrating how it is a global problem and not siloed to one industry. Fraudsters are always looking for new ways to subvert legitimate transaction systems — traditional rules-based approaches are no longer sufficient (or efficient enough) to combat fraud.

    During the first installment of this two-part webinar series, you will discover why there is a need to move beyond a rules-based approach for fraud detection, how to navigate the creative thinking of fraudsters, and the many factors that go into a machine learning fraud detection model.
  • Foundational Data Science for Personalized Communications w/ Nike Recorded: Oct 19 2020 49 mins
    Ankit Gupta (Sr. Data Scientist @ Nike)
    Tentative Schedule: (EST)

    7:00pm: Intro
    7:05pm: Foundational Data Science for Personalized Communications w/ Nike
    7:45pm: Q&A

    Talk Abstract:

    Email communication is one of the most common activities on PCs and mobile devices and this holds true now more than ever. Nike invests a lot of effort into Email communication. Although the cost of sending one email may be small, the cost builds up as the number of emails aggregates. Also, the user engagement governs the reputation of Nike IP addresses and the KPIs. Therefore, it is important to identify the campaigns relevant to consumers with a high propensity of engagement.
    The goal of the Personalized Communications team is to serve the Nike consumers with the most relevant campaign emails at the right time with the right frequency. In this talk, Ankit will address important questions/topics relevant to the Personalization Communications team. He will also answer questions that explain how Nike is currently handling personalized communications, what’s working well and what’s not, and how to build the data foundation for Personalized Communications. Finally, Ankit will do a deep-dive into some of the data science models that the team is currently working on.

    Speaker bio:

    Ankit has 4+ years of experience informing business decisions through data science and statistical modeling. He is currently working as a Sr. Data Scientist on the Personalization and Data Science team at Nike. His previous experience includes work in the ad tech and finance industries.
  • Machine Learning for Optimizing Cloud Server Spend with Aptitive Recorded: Oct 19 2020 26 mins
    Ryan Lewis - Data Science Consultant at Aptitive
    Calculating your capacity needs for a cloud database server is not for the faint of heart. Underestimate and you’re stuck without enough capacity to run your operations. Overestimate and you could overspend by millions.

    In this session, Ryan Lewis from Aptitive will show you how he used machine learning to not only calculate the capacity needed but also to optimize spend based on the time of day and the day of the week. This solution helped a global financial institution save an estimated $19 million annually by toggling server capacity as needed.
  • What can you do with unstructured text data? w/ PwC Recorded: Oct 15 2020 47 mins
    Abdallah Musmar (Data Science Lead @ PwC)
    Tentative Schedule: (EST)

    2:00pm: Intro
    2:05pm: What can you do with unstructured text data? w/ PwC
    2:45pm: Q&A

    Talk Abstract:

    In this talk we will explore the opportunities that arise from unstructured text data. Then we will take a deep dive into a few concepts that are used in applying Machine Learning to text data and discuss how they can be leveraged using deep learning and other methods.

    Speaker Bio:

    Abdallah Musmar is a Manager at PricewaterhouseCoopers. He is a technical lead for machine learning projects that help the firm improve its internal business processes. He is also a PhD candidate at the University of South Florida, where his research focuses on the use of Machine Learning and Agent Based Modeling in Business. He also manages a Citizen Data Science program for undergraduate business students at the University of South Florida, which he built from scratch. In 2020, one of his projects received an award as one of "The best of AI in PwC" at a worldwide AI summit held internally by the firm.

    Disclaimer: All views, thoughts, & opinions expressed in the webinar belong solely to the panelists, & not to the panelists’ employer, organization, committee, other group or individual.
Your Path to Enterprise AI
Dataiku is the centralized data platform that moves businesses along their data journey from analytics at scale to enterprise AI. By providing a common ground for data experts and explorers, a repository of best practices, shortcuts to machine learning and AI deployment/management, and a centralized, controlled environment, Dataiku is the catalyst for data-powered companies.

Customers like Unilever, GE, BNP Paribas, and Santander use Dataiku to ensure they are moving quickly and growing exponentially along with the amount of data they’re collecting. By removing roadblocks, Dataiku ensures more opportunity for business-impacting models and creative solutions, allowing teams to work faster and smarter.

  • Title: How to Get Started With NLP
  • Live at: May 19 2020 4:30 pm
  • Presented by: Katie Gross, Lead Data Scientist @ Dataiku