At the core of many AI applications in eDiscovery is Machine Learning (ML). ML algorithms enable computers to learn patterns from data rather than follow hand-written rules. In eDiscovery, this means training a model on a reviewed subset of documents so that it can identify the characteristics of relevant or privileged information in the rest of the collection. Key ML concepts include:
Supervised Learning: Training a model on labeled data (e.g., documents marked as relevant or not relevant) to predict outcomes for new, unlabeled data.
Unsupervised Learning: Identifying patterns and structures in unlabeled data, for example by clustering similar documents so they can be reviewed together.
Feature Engineering: Selecting raw data (e.g., document text and metadata) and transforming it into numeric features that ML algorithms can use.
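
To make these three concepts concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not how any particular eDiscovery platform works: the sample documents, the function names (`featurize`, `train_centroids`, `predict`, `group_similar`), and the similarity threshold are all hypothetical, and a simple nearest-centroid classifier and greedy grouping stand in for the more capable models used in practice.

```python
import math
from collections import Counter


def featurize(text):
    """Feature engineering: turn raw text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_docs):
    """Supervised learning: sum the feature vectors of each label's documents."""
    centroids = {}
    for text, label in labeled_docs:
        centroids.setdefault(label, Counter()).update(featurize(text))
    return centroids


def predict(centroids, text):
    """Assign the label whose centroid is most similar to the new document."""
    features = featurize(text)
    return max(centroids, key=lambda label: cosine(centroids[label], features))


def group_similar(texts, threshold=0.5):
    """Unsupervised learning: greedily group documents without any labels.

    A document joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []
    for text in texts:
        for cluster in clusters:
            if cosine(featurize(cluster[0]), featurize(text)) >= threshold:
                cluster.append(text)
                break
        else:
            clusters.append([text])
    return clusters


# Hypothetical reviewer-coded documents (the labeled training subset).
labeled = [
    ("merger agreement draft attached for review", "relevant"),
    ("revised merger terms and purchase price", "relevant"),
    ("lunch menu for friday team outing", "not relevant"),
    ("office parking garage closed friday", "not relevant"),
]
centroids = train_centroids(labeled)
prediction = predict(centroids, "please review the merger purchase agreement")

# Hypothetical unlabeled documents to be grouped by similarity.
unlabeled = [
    "merger agreement draft",
    "merger agreement final draft",
    "friday lunch menu",
    "lunch menu for friday",
]
clusters = group_similar(unlabeled)

print(prediction)
print(clusters)
```

In this sketch the new document is predicted as relevant because it shares terms such as "merger" and "agreement" with the relevant training examples, and the unlabeled documents fall into two groups, one about the merger and one about lunch. Word counts are the simplest possible features; weighting schemes such as TF-IDF and metadata fields are common refinements.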