The Download: Tech Talks by the HPCC Systems Community, Episode 17

Presented by

HPCC Systems

About this talk

Speakers and topics for this episode include:

Farah Al Shanik, Clemson University - Equivalence Terms for Text Search Bundle
Text Search Bundle (TSB) is an open source project for searching XML text documents. It contains many subtasks, one of which is equivalence terms, which can be thought of as strong synonyms within TSB. Several kinds of term equivalence are covered: initialisms, abbreviations, synonyms and similarity based on context. We used HPCC Systems to develop a text search tool that uses the Moby thesaurus to return a set of synonyms and the word2vec algorithm to return similar words. We then built a dataset of state names and their abbreviations to return the set of related documents, and improved initialism handling in TSB so that it finds strings with or without punctuation.

Soukaina Filali, Georgia State University - Fraud Detection on Transactional Data using a Time Series Mining Approach
The project aims to distinguish fraudulent pre-paid cards from non-fraudulent ones using patterns mined from their historical bank transaction data. There are numerous types of card programs, each with a different level of fraud risk. Every fraud category has representative patterns that a human monitors manually on a daily basis. The goal is to combine features engineered by domain experts with time series shapelet mining techniques to provide an automated fraud detection solution that could help detect fraud early.

Lili Xu, Clemson University & Gus Reyna, LexisNexis - Using HPCC Systems ML to Map Thousands of Public Records Data Descriptions to Standard Codes
Incorporating public records data into business processes is challenging because states describe similar events in disparate ways, and it is hard to find standards that give those descriptions a consistent meaning. This session tells the story of how HPCC Systems ML addressed the problem of mapping thousands of disparate public record data descriptions to a corresponding set of standard codes.

HPCC Systems is an open source Big Data analytics solution for businesses of all sizes, allowing them to improve critical time to results and decisions. Subscribe to our channel to keep informed of the latest HPCC Systems events.