Our paper on Systematic Error Analysis and Labeling (SEAL) 🦭 has been accepted to the EMNLP demo track 🎉

Problem: How can we help users find systematic bugs in their models?

E.g., an image classification model on low-light images, or a sentiment classifier on gym reviews.

Intuition: There are patterns to model failures. Can we extract those and give them natural language descriptions?

1 & 2: Extract the model embeddings and cluster the high-loss points.
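
A rough sketch of what steps 1 & 2 could look like, assuming you already have one embedding vector and one loss value per example. The helper names (`top_loss_indices`, `kmeans`, `cluster_high_loss`), the `fraction` cutoff, and the choice of plain k-means are illustrative assumptions, not necessarily what SEAL does; the toy data is made up.

```python
import math
import random


def top_loss_indices(losses, fraction=0.2):
    """Indices of the highest-loss points (top `fraction` of the data)."""
    n_keep = max(1, round(fraction * len(losses)))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(ranked[:n_keep])


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed stand-in); returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def cluster_high_loss(embeddings, losses, k=2, fraction=0.2):
    """Map cluster id -> original indices of the high-loss points in it."""
    idx = top_loss_indices(losses, fraction)
    labels = kmeans([embeddings[i] for i in idx], k)
    clusters = {}
    for i, lab in zip(idx, labels):
        clusters.setdefault(lab, []).append(i)
    return clusters


# Toy data: points 0-5 have high loss and form two groups in embedding space;
# points 6-9 have low loss and should be ignored.
embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [2.0, 2.0], [2.5, 2.5], [3.0, 2.0], [2.0, 3.0],
]
losses = [2.0, 2.1, 1.9, 2.2, 2.3, 2.0, 0.1, 0.2, 0.1, 0.3]

clusters = cluster_high_loss(embeddings, losses, k=2, fraction=0.6)
print(clusters)
```

Each resulting cluster is a candidate "systematic bug": a group of examples that sit close together in embedding space and that the model gets wrong. In practice you would likely swap the pure-Python k-means for a library implementation and tune `k` and `fraction` for your data.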