HPCC Systems Open Source Big Data Platform

  • The Download: Tech Talks by the HPCC Systems Community, Episode 17
    HPCC Systems | Recorded: Sep 13 2018 | 82 mins
    Speakers and topics for this episode include:

    Farah Al Shanik, Clemson University - Equivalence Terms for Text Search Bundle
    Text Search Bundle (TSB) is an open source project for searching XML text documents. It comprises many subtasks, one of which is equivalence terms, which can be thought of as strong synonyms for TSB. Term equivalences come in several kinds: initialisms, abbreviations, synonyms, and similarity based on context. We used HPCC Systems to develop a text search tool that uses the Moby thesaurus to return a set of synonyms and the word2vec algorithm to return similar words. We then built a dataset of state names and their abbreviations to return the set of related documents, and improved initialism handling so that TSB finds strings with or without punctuation.

    Soukaina Filali, Georgia State University - Fraud Detection on Transactional Data using a Time Series Mining Approach
    The project consists of distinguishing fraudulent pre-paid cards from non-fraudulent ones using patterns mined from their respective historical bank transaction data. There are numerous types of card programs, each of which comes with a different level of fraud risk. Every fraud category has representative patterns that a human monitors manually on a daily basis. The goal here is to combine features engineered by domain experts with time series shapelet mining techniques to provide an automated solution that can potentially help detect fraud early.

    Lili Xu, Clemson University & Gus Reyna, LexisNexis - Using HPCC Systems ML to Map Thousands of Public Records Data Descriptions to Standard Codes
    Incorporating public records data into business processes is challenging because states describe similar events in disparate ways, and it is hard to find standards that give those descriptions a consistent meaning. This session tells the story of how HPCC Systems ML addressed the problem of mapping thousands of disparate public record data descriptions to a corresponding set of standard codes.
  • HPCC Systems Community Focus: 5 Questions with Itauma Itauma
    Itauma Itauma | Recorded: Aug 23 2018 | 13 mins
    In this session, we are highlighting some of the rock stars of the HPCC Systems Community. Today's session is 5 Questions with Itauma Itauma.

    Itauma Itauma is a doctoral candidate at Keiser University and a computer science instructor at Wayne State University. His interests lie in learning analytics and utilizing HPCC Systems for educational research. He has an undergraduate degree in Electrical Engineering from the University of Ilorin and two master's degrees: a Master of Science in Computer Engineering from Istanbul Technical University, majoring in human-robot interaction, and a Master of Science in Computer Science from Wayne State University, where his thesis was based on leveraging HPCC Systems for Big Data analytics.
  • The Download: Tech Talks by the HPCC Systems Community, Episode 16
    HPCC Systems | Recorded: Aug 2 2018 | 106 mins
    This episode will feature our 2018 HPCC Systems summer interns:

    Shah Muhammad Hamdi, PhD student, CS at Georgia State University - Dimensionality Reduction and Feature Selection in ECL-ML

    Hamdi will discuss the parallel implementation of Principal Component Analysis (PCA) using the Parallel Block Basic Linear Algebra Subsystem (PBblas) library and ECL implementations of feature selection algorithms for the HPCC Systems platform.

    Robert Kennedy, PhD student in Computer Science at Florida Atlantic University - Parallel Distributed Deep Learning on HPCC Systems

    Robert will cover what he implemented during his summer internship. Combining HPCC Systems and Google’s TensorFlow, Robert created a parallel stochastic gradient descent algorithm to provide a basis for future deep neural network research and to enhance HPCC Systems’ distributed neural network training capabilities.

    Aramis Tanelus, programmer and senior at American Heritage High School, where he is the lead programmer for the Advanced Robotics Team - Developing HPCC Systems Data Ingestion APIs for Common Robotic Sensors

    Aramis’s project will make it easy for anyone in robotics around the world to ingest data from common robotic sensors into an HPCC Systems platform for use in data analysis. Aramis will speak about his work on the autonomous agricultural robot and about implementing new packages for the Robot Operating System (ROS) to interface with HPCC Systems for big data analysis.

    Saminda Wijeratne, Masters student, Computational Science and Engineering at Georgia Institute of Technology, Atlanta - MPI Proof of Concept

    The built-in "Message Passing" library in HPCC Systems is designed to handle communication among dissimilar components and to support non-trivial communication patterns among them. Saminda will explore how this library currently operates and how a different implementation could be introduced, such as the existing and popular MPI library.
  • The Download: Tech Talks by the HPCC Systems Community, Episode 15
    HPCC Systems | Recorded: Jun 28 2018 | 64 mins
    Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community. This episode will feature three speakers on the following topics:

    Jingqing Zhang, Imperial College London
    Deep Sequence Learning and Text Classification

    Bob Foreman, LexisNexis Risk Solutions
    ECL Summer Code Camp Review
    On May 16th, five HPCC Systems Ambassadors along with Flavio Villanustre met with eight iRISE2 members for a two-hour ECL Code Camp. The event was a great success, and I thought I’d share with the community what we did and some of the ECL ideas that came out of it. Tips from Data Ingestion to ECL to Data Evaluation will be included in this segment.
  • Boost Mobile & Digital Banking Engagement while Reducing Churn With Big Data
    Anirudh Shah, Founder & CEO, 3LOQ Labs and Flavio Villanustre, VP Technology, HPCC Systems | Recorded: May 22 2018 | 50 mins
    Join Anirudh Shah, Founder & CEO, 3LOQ Labs, and Flavio Villanustre, VP Technology, HPCC Systems, to learn how 3LOQ is solving the problem of customer churn with open source big data and machine learning technology. 3LOQ addresses this challenge by deploying proprietary machine learning algorithms to analyze billions of data points and map out dynamic feature recommendations to reinforce repeated usage of a product. The end result? Reduced churn with high customer engagement for businesses.

    3LOQ recently partnered with a leading Indian banking institution to increase adoption of their digital channels. The project yielded impressive results for the client, including a:

    · 45% reduction in customer churn
    · 145% increase in digital banking transactions
    · 75% increase in users who made four or more transactions per month

    In this webcast, Flavio will give an overview of one of the key tech tools that contributes to 3LOQ's success, the completely free, open source HPCC Systems big data platform. Anirudh will share how 3LOQ Labs leverages this platform to:

    • Analyze four terabytes of data combined with built-in analytics libraries to create personalized recommendations
    • Utilize efficient coding in an implicitly parallel platform that allows prototypes to be developed and iterated quickly
    • Enable horizontal scaling on commodity hardware, with the flexibility to deploy both on premises and in the cloud
  • The Download: Tech Talks by the HPCC Systems Community, Episode 14
    HPCC Systems | Recorded: May 17 2018 | 88 mins
    Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community. This episode will feature three speakers on the following topics:

    Tai Donovan, Robotics Director, American Heritage School - High School Autonomous Agricultural Project
    A group of 5-6 students is working on an autonomous agricultural project with the goal of providing time-sensitive data to the owner-operator, farmer, or grower of a production farm. Tai will discuss their challenges and how he is using HPCC Systems.

    Lorraine Chapman, Consulting Business Analyst, LexisNexis Risk Solutions - Meet Our Summer Interns
    By the end of 2018, ten students will have completed projects as part of the HPCC Systems intern program. Find out about these students, including where and what they are studying, the projects they will be working on and the intern experience we provide to help them feel part of the team. Lorraine will also speak about how you can get involved with the program by being a mentor, or contributing a project idea for a new feature or enhancement to the HPCC Systems platform and/or Machine Learning Library.

    Richard Taylor, Chief Trainer, HPCC Systems, LexisNexis Risk Solutions - Current/Longest Event Sequence by Month
    Richard will discuss processing event dates to discover for each event within a given time frame: the current number of sequential months the event occurred, and the longest contiguous month-by-month sequence. This topic is based on questions from one of our Statistical Modelers (new to ECL) regarding how to approach the problem in a non-procedural manner. The example code will make use of the GROUP and HAVING functions.
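The computation described in this abstract can be sketched in a few lines. This is a minimal illustration only, written in Python rather than ECL, and it is not the code from the talk (which uses the ECL GROUP and HAVING functions). The function names `month_index` and `month_streaks`, and the reading of "current" as the unbroken run of months ending in the reference month `as_of`, are assumptions made for this example.

```python
from datetime import date


def month_index(d):
    """Map a date to a running month number so adjacent months differ by 1."""
    return d.year * 12 + (d.month - 1)


def month_streaks(event_dates, as_of):
    """Return (current, longest) month-by-month streaks for one event.

    current: length of the unbroken run of months ending in the month of
             `as_of` (0 if the event did not occur in that month).
             This definition is an assumption of the sketch.
    longest: length of the longest contiguous run of months on or before
             the month of `as_of`.
    """
    cutoff = month_index(as_of)
    # Collapse multiple events in the same month and drop later months.
    months = sorted({month_index(d) for d in event_dates
                     if month_index(d) <= cutoff})
    if not months:
        return 0, 0

    longest = run = 1
    for prev, cur in zip(months, months[1:]):
        # Extend the run when months are adjacent, otherwise restart it.
        run = run + 1 if cur == prev + 1 else 1
        longest = max(longest, run)

    # After the loop, `run` is the length of the run ending at months[-1].
    current = run if months[-1] == cutoff else 0
    return current, longest


# Hypothetical event dates for a single event.
dates = [
    date(2018, 1, 5), date(2018, 2, 10), date(2018, 3, 1),
    date(2018, 5, 3), date(2018, 6, 20), date(2018, 6, 25),
]
print(month_streaks(dates, date(2018, 6, 30)))
```

With these sample dates the event occurs in January through March and again in May and June, so the longest run is three months and the run ending in June is two months. For many events, the same function could be applied once per event after grouping the dates by event.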
  • The Download: Tech Talks by the HPCC Systems Community, Episode 13
    HPCC Systems | Recorded: Apr 19 2018 | 98 mins
    Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.

    Episode 13 includes Tech Talks featuring speakers from our community on the following topics: the Future of Automotive Telemetry: Assessing Autonomous Vehicle Risk Implications using Simulated Data; Developing a Custom, Pluggable HPCC Systems Security Manager; and Understanding the ECL Watch Graphs. View the full details at hpccsystems.com.
  • The Download: Tech Talks by the HPCC Systems Community, Episode 12
    HPCC Systems | Recorded: Mar 15 2018 | 95 mins
    Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.

    Episode 12 includes Tech Talks featuring speakers from our community on topics covering exploratory data analysis, geospatial solutions and ECL Tips leveraging the HPCC Systems platform.


    1) Itauma Itauma, PhD Candidate, Keiser University - Conducting exploratory data analysis in educational research using HPCC Systems®

    2) Ignacio Calvo, LexisNexis Risk Solutions - Big Data and Geospatial with HPCC Systems®

    3) Bob Foreman, Senior Software Engineer, HPCC Systems, LexisNexis Risk Solutions - ECL Tip of the Month
  • No Ordinary Hard Hat: Improving Health & Safety with Open Source Big Data
    Anupam Sengupta, CTO Guardhat & Flavio Villanustre, VP Technology, HPCC Systems | Recorded: Mar 2 2018 | 64 mins
    Over 4,000 U.S. workers die on the job every year. While new wearable technologies are aggressively entering consumer applications, industrial safety equipment has not seen a fundamental innovation in the last decade.

    Join us to learn from Guardhat CTO Anupam Sengupta how Guardhat uses open source big data technology to address this issue with its “smart hard hat ecosystem”, an industrial wearable that uses IoT and wireless communication systems to protect and empower industrial workers.

    In this webcast, Flavio will give an overview of the completely free, open source HPCC Systems big data platform.

    Anupam will share how Guardhat leveraged this platform to:
    • Allow real-time complex event processing of vast amounts of streaming data.
    • Enable horizontal scaling on commodity hardware, with the flexibility to deploy both on premises and in the cloud.
    • Support big data analytics including the ability to analyze, identify, and predict trends.
    • Enable rapid green-field development.
  • The Download: Tech Talks by the HPCC Systems Community, Episode 11
    HPCC Systems | Recorded: Feb 15 2018 | 91 mins
    Join us as we continue this series of webinars specifically designed for the community by the community with the goal to share knowledge, spark innovation and further build and link the relationships within our HPCC Systems community.

    Episode 11 includes Tech Talks featuring speakers from our community on topics covering Big Data solutions, Spark Integration and other ECL Tips leveraging the HPCC Systems platform.

    1) Raj Chandrasekaran, CTO & Co-Founder, ClearFunnel - Scaling Data Science capabilities: Leveraging a homogeneous Big Data ecosystem

    2) James McMullan, Software Engineer III, LexisNexis Risk Solutions - HDFS Connector Preview

    3) Bob Foreman, Senior Software Engineer, LexisNexis Risk Solutions - Building a RELATIONal Dataset - A Valentine’s Day Special!
