Seeing the Forest and the Trees: Data Discovery in Billion-Row Datasets