Global Market Intro to EDRM's "CROSS PLATFORM EMAIL DUPLICATE IDENTIFICATION"
Beth Patterson, Craig Ball, Matt Golab and Paul Sirkis
About this talk
You are invited to join EDRM’s Email Duplicate Identification Global Project Team as they share how they have solved the painful and often expensive problem of identifying duplicates across emails processed by multiple vendor platforms. This webinar will be held in multiple time zones over two days so everyone can participate.
During discovery, disclosure or an investigation, it is often useful to identify duplicate emails in data exchanged between parties. This can deliver many benefits, including the ability for legal teams to rapidly triage emails already reviewed that also reside in data received from others. While current approaches effectively identify email duplicates within native datasets processed by a single vendor, they do not enable duplicate identification across emails processed by multiple vendor platforms. Vendors use similar methods to detect email duplicates, but there are nuanced differences in their proprietary algorithms.
Currently, the only way to identify email duplicates across platforms is to reprocess the data on a single vendor platform, often at significant time and cost. The EDRM Duplicate Email Identification project set out to develop a solution to cross platform email duplicate identification. Our solution is a simple but effective approach that uses the hash value of an email's Message ID metadata field, a value we have named the EDRM Message Identification Hash (“MIH”). This new approach need not replace current vendor email deduplication methods; it works alongside them to enable cross platform email duplicate identification. We expect cross platform duplicate identification using the MIH to be applied to email data sets that have already been deduplicated using a vendor’s standard deduplication process.
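To illustrate the idea of hashing the Message ID field, here is a minimal Python sketch. It is not the project's specification. The hash algorithm (MD5 here), the normalization (trimming whitespace and the surrounding angle brackets), and the function names `message_id_hash` and `cross_platform_duplicates` are all assumptions made for illustration; the announcement above does not state these details.

```python
import hashlib
from email import message_from_string


def message_id_hash(raw_email, algorithm="md5"):
    """Return a hex digest of the email's Message-ID header, or None if absent.

    Assumptions (not taken from the announcement): MD5 as the hash algorithm,
    and normalization that trims whitespace and the enclosing angle brackets.
    """
    msg = message_from_string(raw_email)
    message_id = msg.get("Message-ID")  # header lookup is case-insensitive
    if message_id is None:
        return None
    normalized = message_id.strip().strip("<>")
    if not normalized:
        return None
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


def cross_platform_duplicates(set_a, set_b):
    """Return the emails in set_b whose Message-ID hash also appears in set_a."""
    hashes_a = {h for h in map(message_id_hash, set_a) if h}
    return [raw for raw in set_b if message_id_hash(raw) in hashes_a]


# Hypothetical sample data: the same message as exported by two platforms.
from_vendor_a = "Message-ID: <abc123@example.com>\nSubject: Hi\n\nBody"
from_vendor_b = "Subject: Hi\nMessage-ID: <abc123@example.com>\nX-Vendor: other\n\nBody"
unrelated = "Message-ID: <zzz999@example.com>\nSubject: Other\n\nBody"

matches = cross_platform_duplicates([from_vendor_a], [from_vendor_b, unrelated])
print(len(matches))
```

Because the hash depends only on the Message ID value, two platforms that disagree on how they hash bodies, attachments or other headers would still produce the same value for the same message, provided both apply the same normalization and algorithm.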