What is Snorkel?

Build Training Sets Programmatically

Labeling and managing training datasets by hand is one of the biggest bottlenecks in machine learning. In Snorkel, write heuristic functions to do this programmatically instead!

Model Weak Supervision

Programmatic or weak supervision sources can be noisy and correlated. Snorkel uses novel, theoretically-grounded unsupervised modeling techniques to automatically clean and integrate them.

Train Modern ML Models

Snorkel outputs clean, confidence-weighted training datasets that easily plug into any modern machine learning framework.