What is a Golden Set?
A golden set is a curated dataset of inputs paired with trusted reference outputs, used as a fixed benchmark for evaluating model outputs.
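To make the definition concrete, here is a minimal sketch of a golden set and a scoring function, assuming each entry is a prompt with a single reference answer and that exact string match is an acceptable metric. The names GoldenExample, exact_match_accuracy, and toy_model are hypothetical and chosen only for illustration; a real evaluation would likely use a softer metric and an actual model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenExample:
    """One golden-set entry: an input paired with its trusted reference output."""
    example_id: str
    prompt: str
    expected: str


def exact_match_accuracy(golden_set: list[GoldenExample],
                         model: Callable[[str], str]) -> float:
    """Return the fraction of examples where the model output equals the reference."""
    if not golden_set:
        raise ValueError("golden set is empty")
    hits = sum(
        1 for ex in golden_set
        if model(ex.prompt).strip() == ex.expected.strip()
    )
    return hits / len(golden_set)


# Illustrative data only; a real golden set would be curated by reviewers.
GOLDEN_SET = [
    GoldenExample("g1", "2 + 2", "4"),
    GoldenExample("g2", "capital of France", "Paris"),
    GoldenExample("g3", "3 * 3", "9"),
    GoldenExample("g4", "opposite of hot", "cold"),
]


def toy_model(prompt: str) -> str:
    """Stand-in for a real model: a lookup table that gets one answer wrong."""
    answers = {
        "2 + 2": "4",
        "capital of France": "Paris",
        "3 * 3": "6",  # deliberate mistake
        "opposite of hot": "cold",
    }
    return answers.get(prompt, "")


accuracy = exact_match_accuracy(GOLDEN_SET, toy_model)
print(accuracy)  # 3 of 4 answers match, so this prints 0.75
```

Keeping the entries immutable (frozen=True) is one way to signal that the reference data should not change during an evaluation run.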
Design Principles
- Make the dataset diverse enough to cover the range of scenarios the model is expected to handle.
- Include examples of both correct and incorrect outputs, so the evaluation tests what should be rejected as well as what should be accepted.
- Update the golden set regularly as the model improves, so the benchmark stays relevant.
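The principles above can be turned into a small automated check that is rerun whenever the golden set is updated. The sketch below is one possible approach, not a standard tool: it assumes each entry carries a scenario tag and a human-assigned correctness label, and the names LabeledOutput and audit_golden_set, the required scenario list, and the 0.2 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledOutput:
    scenario: str      # hypothetical scenario tag, e.g. "math"
    output: str        # a candidate model output
    is_correct: bool   # human-assigned label


def audit_golden_set(entries, required_scenarios, min_incorrect_share=0.2):
    """Return a list of problems found; an empty list means all checks passed."""
    if not entries:
        return ["golden set is empty"]
    problems = []

    # Diversity: every required scenario must appear at least once.
    seen = {e.scenario for e in entries}
    missing = sorted(set(required_scenarios) - seen)
    if missing:
        problems.append("missing scenarios: " + ", ".join(missing))

    # Balance: the set needs both correct and incorrect outputs.
    incorrect = sum(1 for e in entries if not e.is_correct)
    if incorrect == len(entries):
        problems.append("no correct outputs")
    elif incorrect / len(entries) < min_incorrect_share:
        problems.append("too few incorrect outputs")

    return problems


# Illustrative data only.
entries = [
    LabeledOutput("math", "2 + 2 = 4", True),
    LabeledOutput("math", "2 + 2 = 5", False),
    LabeledOutput("geography", "The capital of France is Paris", True),
    LabeledOutput("geography", "The capital of France is Rome", False),
]

problems = audit_golden_set(entries, required_scenarios=["math", "geography", "code"])
print(problems)  # ['missing scenarios: code']
```

Returning a list of problems rather than raising on the first failure lets a reviewer see every gap in one pass before the next update.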