Our paper on Systematic Error Analysis and Labeling (SEAL) 🦠 has been accepted to the EMNLP demo track 🎉
Problem: How can we help users find systematic bugs in their models?
E.g.: an image classification model that fails on low-light images, or a sentiment classifier that fails on gym reviews
Intuition: There are patterns to model failures. Can we extract those and give them natural language descriptions?
1 & 2: Extract the model embeddings and cluster the high-loss points.
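
Rough sketch of steps 1 & 2 on toy data, to make the idea concrete. Assumptions (not from the paper): embeddings and per-example losses are already computed, "high loss" means at or above a loss quantile, and the clustering is a tiny hand-rolled k-means. The names `select_high_loss` and `kmeans` are hypothetical helpers for illustration, not SEAL's API.

```python
import random


def select_high_loss(losses, quantile=0.8):
    """Return indices of examples whose loss is at or above the given quantile."""
    ranked = sorted(losses)
    threshold = ranked[int(quantile * (len(ranked) - 1))]
    return [i for i, loss in enumerate(losses) if loss >= threshold]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Minimal k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments


# Toy stand-ins for model embeddings and per-example losses (made-up values).
embeddings = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [9.0, 0.0], [9.1, 0.1]]
losses = [0.05, 0.04, 2.1, 1.9, 2.5, 2.3]

# Keep only the high-loss examples, then cluster them in embedding space.
high_loss_idx = select_high_loss(losses, quantile=0.5)
high_loss_points = [embeddings[i] for i in high_loss_idx]
labels = kmeans(high_loss_points, k=2)

# Group original example indices by cluster label.
clusters = {}
for idx, label in zip(high_loss_idx, labels):
    clusters.setdefault(label, []).append(idx)
```

Each group in `clusters` is a candidate systematic failure: examples the model gets wrong that also sit close together in embedding space. In practice you would likely swap the toy k-means for a library implementation and tune the loss threshold and the number of clusters.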