Interactive Retrieval Evaluation
Very difficult to do well
Two particular problems
- modern systems are too good:
- effectiveness measures limited by user agreement with relevance judgments
- usually assumes naïve users
- variation among user performance enormous
- isn't realistic