Cranfield Tradition
Test collections are abstractions, but laboratory tests are useful nonetheless
- evaluation technology is predictive (i.e., results transfer to operational settings)
- different sets of relevance judgments almost always produce the same relative ranking of systems
- adequate pools allow unbiased evaluation of unjudged runs (i.e., runs that did not contribute documents to the pool)
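
The claim about comparative results can be checked mechanically: score every system under two sets of relevance judgments and measure how well the two system orderings agree, for example with Kendall's tau. The following is a minimal sketch of that check, not a reference implementation. The runs, document ids, and judgment sets are invented toy data for a single topic, the function names are hypothetical, and average precision is just one possible effectiveness measure.

```python
from itertools import combinations


def average_precision(ranking, relevant):
    """Average precision of one ranked list of doc ids against a set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} dicts over the same systems.

    Tied pairs count as neither concordant nor discordant.
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data (assumption): three systems, one topic, five documents.
runs = {
    "sysA": ["d1", "d2", "d3", "d4", "d5"],
    "sysB": ["d3", "d1", "d5", "d2", "d4"],
    "sysC": ["d5", "d4", "d3", "d2", "d1"],
}

# Two hypothetical assessors who disagree about document d3.
qrels_1 = {"d1", "d2"}
qrels_2 = {"d1", "d2", "d3"}

scores_1 = {name: average_precision(run, qrels_1) for name, run in runs.items()}
scores_2 = {name: average_precision(run, qrels_2) for name, run in runs.items()}

tau = kendall_tau(scores_1, scores_2)
print(scores_1)
print(scores_2)
print("Kendall tau between the two system rankings:", tau)
```

In this toy case the absolute scores change between the two judgment sets while the ordering of the systems does not, which is the sense in which comparative results are stable. A real study would average over many topics and many systems before comparing the rankings.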