SDR Evaluation Approach
In the TREC tradition:
- Create a realistic but doable application task
- Increase realism (and difficulty) each year
- NIST creates:
  - infrastructure: test collection, queries, task definition, relevance judgements
  - task includes several different control conditions: recognizer, boundaries, etc.
- Sites submit:
  - speech recognizer transcripts for benchmarking and sharing
  - rank-ordered retrieval lists for scoring
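
To make "rank-ordered retrieval lists for scoring" concrete, here is a minimal sketch of scoring one submitted list against relevance judgements. The slide does not name a scoring measure; average precision is used here only as an assumption, since it is a common TREC-style measure. The function names, document IDs, and query IDs are hypothetical, and each ranked list is assumed to contain no duplicate documents.

```python
from typing import Dict, List, Set


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked list against one set of judged-relevant docs."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            # Precision at the rank where this relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant_ids)


def mean_average_precision(
    runs: Dict[str, List[str]], judgements: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries."""
    if not judgements:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in judgements.items()
    )
    return total / len(judgements)


# Hypothetical data: one site's ranked lists and the judgements for two queries.
site_run = {
    "q1": ["d3", "d1", "d7", "d2"],
    "q2": ["d5", "d4"],
}
relevance = {
    "q1": {"d1", "d2", "d9"},
    "q2": {"d5"},
}

ap_q1 = average_precision(site_run["q1"], relevance["q1"])
map_score = mean_average_precision(site_run, relevance)
```

Because every site is scored against the same judgements, runs produced under different control conditions (for example, different recognizer transcripts) can be compared on a single number per condition.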