This directory contains the evaluation data from the TREC 2002 novelty track. In the TREC 2002 novelty track, systems were given a set of relevant documents in a particular order and were to return two sets of sentences. The first set was to be all of the RELEVANT sentences, that is, sentences that contained relevant information. The second set, a subset of the first, was to be the set of NEW sentences: sentences that contained information not already present in the relevant sentences that preceded them.

The data consists of the following files:

* the text of the topics

  These are the original texts of the topics as released in earlier TRECs, plus new "description" fields where the novelty assessor judged the topic differently.

* the document text files

  The document text for each topic contains up to 25 relevant (as determined by the original TREC assessor) documents, ordered by the retrieval rank assigned by a particular search engine. Since the document text files contain document text :-), they must be password protected. To obtain access to the document texts, send email to Lori Buckland, lori.buckland@nist.gov, requesting the password. We must have a signed TREC data use release form on file from you before we can give you the password. The document text is segmented at sentence boundaries such that each sentence is assigned a sentence number within its document.

* four different judgment ("qrels") files

  A judgment file for the novelty track contains, for each topic, a list of sentence identifiers of the form doc-id:sentence-num. If the name of the judgment file ends in ".relevant", the sentences are the set of RELEVANT sentences as determined by the human assessor. If the judgment file ends in ".new", the sentences are the set of NEW sentences as determined by the assessor. To check how often humans agree on the sets of RELEVANT and NEW sentences, we had two assessors independently judge each topic.
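As a rough illustration, qrels entries of this shape can be grouped by topic as sketched below. The one-pair-per-line layout, the whitespace separator, and the document IDs in the sample are assumptions for the example, not a specification of the real files; adjust the parsing if the actual files differ.

```python
from collections import defaultdict

def parse_qrels(lines):
    """Group doc-id:sentence-num identifiers by topic.

    Assumes one 'topic-num doc-id:sentence-num' pair per line; this layout
    is a guess for illustration, not taken from the track documentation.
    """
    qrels = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        topic, ident = line.split()
        qrels[topic].add(ident)
    return dict(qrels)

# Hypothetical sample data (doc IDs invented for the example).
sample = """\
303 FBIS3-1023:4
303 FBIS3-1023:7
310 FT921-3120:2
"""
qrels = parse_qrels(sample.splitlines())
```

Each topic then maps to a set of sentence identifiers, which makes the set comparisons used in evaluation straightforward.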
  The judge who selected the smaller number of RELEVANT sentences was called the minimum judge; the other judge was the maximum judge. Ties were broken arbitrarily. The minimum judgments were used for the official TREC 2002 novelty track results. The four judgment files result from the cross product of two judges (min, max) and two set types (relevant, new). Note that the minimum assessor could not find any relevant sentences for topic 310, so the min_qrels files contain no sentences for topic 310.

  IMPORTANT NOTE: This version of the min_qrels files should have been equivalent to the qrels files released during TREC 2002, but it is not. We found some errors in the data after TREC 2002 and corrected them in this version of the qrels. In the original TREC 2002 version of the qrels, the "new" and the "relevant" sentences for two topics (382 and 397) were switched such that the relevant set was a subset of the new set, instead of the reverse. Thus evaluation results will not be identical to those shown in the TREC 2002 results. If you have the qrels file released during TREC 2002, you should download this version to replace it.

* a script, eval_novelty_run.pl

  This Perl script takes the type of sentences to evaluate, a judgment file, and a result file as arguments, and prints an evaluation of the result file. The type of sentences is either "relevant" or "new". The judgment file is one of the files described above and should match the specified type. The format of the result file is the same as a TREC 2002 novelty track submission file:

      topic-num  relevant|new  sequence-no  doc-id  sentence-num  tag

  where the second field is either the literal "relevant" or the literal "new". The sequence-no must be present but is otherwise ignored, so it can be anything. The evaluation scores printed are per-topic precision, recall, precision*recall, and F, plus averages of each measure over the entire topic set.
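The measures above can be sketched for a single topic as follows. This is an illustration of the measures, not the script itself: F is taken to be the standard balanced F, 2PR/(P+R), and the zero-denominator handling here is a plausible choice that may differ from the official script's edge-case behavior.

```python
def novelty_scores(retrieved, judged):
    """Per-topic precision, recall, precision*recall, and balanced F.

    `retrieved` and `judged` are sets of doc-id:sentence-num identifiers.
    F is computed as 2PR/(P+R); zero-denominator cases score 0.0, which is
    an assumption for this sketch rather than the script's documented rule.
    """
    if not retrieved or not judged:
        return 0.0, 0.0, 0.0, 0.0
    hits = len(retrieved & judged)
    p = hits / len(retrieved)
    r = hits / len(judged)
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, p * r, f

# Hypothetical identifiers: 3 sentences returned, 4 judged, 2 in common.
p, r, pr, f = novelty_scores(
    {"d1:1", "d1:2", "d2:5"},
    {"d1:2", "d2:5", "d2:6", "d3:1"},
)
# p = 2/3, r = 1/2, p*r = 1/3, f = 4/7
```

Averaging each of the four values over all evaluated topics gives the topic-set averages the script reports.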
  Since the minimum assessor had no sentences for topic 310, this script ignores topic 310 even when given the max_qrels file. To include topic 310, add 310 to the array of topic numbers at the top of the script.
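Reading a result file of the format described above and mirroring that default (keep only rows of the requested type, skip topic 310) might look like this sketch. The sample rows and document IDs are invented for illustration.

```python
SKIP_TOPICS = {"310"}  # the script ignores topic 310 by default

def load_results(lines, sent_type):
    """Collect doc-id:sentence-num identifiers per topic from result rows.

    Expects 'topic-num relevant|new sequence-no doc-id sentence-num tag'
    fields per line; keeps only rows whose second field matches sent_type
    and skips topics in SKIP_TOPICS. A sketch, not the official script.
    """
    results = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip malformed or blank lines
        topic, kind, _seq, doc, sent, _tag = fields[:6]
        if kind != sent_type or topic in SKIP_TOPICS:
            continue
        results.setdefault(topic, set()).add(f"{doc}:{sent}")
    return results

# Hypothetical submission rows (doc IDs and tag invented).
sample = [
    "303 relevant 1 FBIS3-1023 4 run1",
    "303 new 2 FBIS3-1023 4 run1",
    "310 relevant 3 FT921-3120 2 run1",
]
relevant = load_results(sample, "relevant")
# topic 310 is skipped and the "new" row is filtered out
```

The resulting per-topic sets can be compared directly against the qrels sets when computing the evaluation measures.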