Applying machine learning to unstructured files and data for research