The document collection used in the TREC-9 QA task was the
set of newspaper/newswire documents on the TIPSTER and TREC disks.
In particular, this includes:
AP newswire (Disks 1-3)
Wall Street Journal (Disks 1-2)
San Jose Mercury News (Disk 3)
Financial Times (Disk 4)
Los Angeles Times (Disk 5)
Foreign Broadcast Information Service (FBIS) (Disk 5)
The questions were taken from a log of questions submitted to
the Encarta system made available by Microsoft plus questions
derived from an Excite query log. Eleven questions were eliminated
from the test when final scores were computed:
questions 322, 333, 339, 443, 591, 598, 656, 663, 794, 796, and 811.
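When rescoring a run against this data, the eleven question numbers listed above presumably need to be left out before averaging. The following is a minimal sketch of that filtering step, not part of the official evaluation tools; the names `ELIMINATED` and `scored_questions` are hypothetical, and it assumes question numbers are available as plain integers.

```python
# Question numbers eliminated from TREC-9 final scoring, as listed above.
ELIMINATED = frozenset({322, 333, 339, 443, 591, 598, 656, 663, 794, 796, 811})


def scored_questions(question_ids):
    """Return the question numbers that remain after dropping eliminated ones.

    Input order is preserved. `question_ids` is assumed to be an iterable
    of integers (a hypothetical representation, not a TREC file format).
    """
    return [qid for qid in question_ids if qid not in ELIMINATED]


if __name__ == "__main__":
    # Illustrative input only: three consecutive question numbers.
    print(scored_questions([321, 322, 323]))
```

Keeping the eliminated numbers in one constant makes it easy to apply the same exclusion to the judgment set and to the pattern-based scoring.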
TREC-9 test questions
TREC-9 top docs ranked list (.gz)
TREC-9 judgment set (.gz)
TREC-9 pattern set
TREC-9 pattern evaluation Perl script
TREC-9 question variants key
The TREC-9 test set contains questions that
were created by the assessors to be semantically identical to,
but syntactically different from, a question already
in the test set. This file lists each original question
and the set of variants for that original.
TREC-9 original answers
This file lists the answers the assessors found during question
development when they sought answers in the document collection.