- TREC 2023
- Deep Learning track web page
(See track web page for links to test and training corpora.)
NIST judgments for the Passage Ranking task. This qrels file includes judgments for near-duplicates in MSMARCO v2; see the note below about the duplicate IDs.
NIST judgments for the Passage Ranking task, with "1" judgments mapped to "0" because related passages are not relevant. Use this qrels file with trec_eval for all measures
except NDCG variants. Use the full qrels file with trec_eval for NDCG-related
measures.
Equivalence classes of duplicate passages. This file has three fields: an equivalence class ID, a passage ID, and the beginning of the paragraph text. The equivalence class ID is the passage ID that was chosen as the class representative, so when the first two fields don't match, the passage ID in the second field is a duplicate. In the "withDupes" qrels files above, the class representative was judged, and the judgment was then propagated to all duplicates in that class.
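As an illustration of the rule above (a passage is a duplicate when the first two fields differ), here is a minimal Python sketch. It assumes the fields are whitespace-separated, with the text snippet making up the remainder of each line; the actual delimiter is not stated here, so check the file before relying on this. The function name split_duplicates is hypothetical.

```python
def split_duplicates(lines):
    """Separate class representatives from duplicates.

    Assumption: each line is "<class_id> <passage_id> <text snippet>",
    separated by whitespace. Adjust the split if the file uses tabs only.
    """
    representatives = set()
    duplicates = {}  # duplicate passage ID -> class representative ID
    for line in lines:
        parts = line.split(None, 2)  # keep the text snippet in one piece
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        class_id, passage_id = parts[0], parts[1]
        if class_id == passage_id:
            representatives.add(passage_id)
        else:
            duplicates[passage_id] = class_id
    return representatives, duplicates
```

The returned dictionary can be used to look up which judged representative a duplicate passage inherited its judgment from.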
Document-level judgments, inferred from the passages. If a passage was relevant, the document containing it was relevant.
Document-level judgments, inferred from the passages, with "1" judgments mapped to "0" as with passages.
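The inference from passage judgments to document judgments can be sketched as follows. This is only an illustration under stated assumptions: the passage-to-document mapping (passage_to_doc) is a hypothetical dictionary you would build from the passage corpus, and taking the maximum passage grade as the document grade is an assumption. The description above says only that a document containing a relevant passage is relevant.

```python
def infer_document_qrels(passage_qrels, passage_to_doc):
    """Infer document-level grades from passage-level grades.

    passage_qrels: iterable of (topic, passage_id, grade) tuples.
    passage_to_doc: dict mapping passage ID -> containing document ID
                    (hypothetical; built from the corpus by the caller).
    Assumption: a document's grade is the highest grade among its passages.
    """
    doc_qrels = {}
    for topic, passage_id, grade in passage_qrels:
        doc_id = passage_to_doc.get(passage_id)
        if doc_id is None:
            continue  # passage with no known containing document
        key = (topic, doc_id)
        doc_qrels[key] = max(doc_qrels.get(key, 0), grade)
    return doc_qrels
```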
Note: Documents were judged on a four-point scale of Not Relevant (0), Relevant (1),
Highly Relevant (2) and Perfect (3). Levels 1--3 are considered to be relevant for
measures that use binary relevance judgments.
Passages were judged on a four-point scale of Not Relevant (0), Related (1),
Highly Relevant (2), and Perfect (3), where 'Related' is actually NOT
Relevant---it means that the passage was on the same general topic, but did not
answer the question. Thus, for Passage Ranking task runs (only), computing
evaluation measures that use binary relevance judgments with trec_eval
requires either using trec_eval's -l option [trec_eval -l 2 qrelsfile runfile]
or modifying the qrels file to change all 1 judgments to 0.
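For the second option, changing all 1 judgments to 0, a minimal Python sketch is shown below. It assumes the standard four-column TREC qrels layout (topic, iteration, document or passage ID, relevance) separated by whitespace; the function name map_related_to_nonrelevant is hypothetical.

```python
def map_related_to_nonrelevant(qrels_lines):
    """Rewrite qrels lines so that a judgment of 1 (Related) becomes 0.

    Assumption: each line is "<topic> <iteration> <id> <relevance>".
    Lines that do not have four fields are passed through unchanged.
    """
    out = []
    for line in qrels_lines:
        fields = line.split()
        if len(fields) != 4:
            out.append(line.rstrip("\n"))
            continue
        topic, iteration, item_id, rel = fields
        if rel == "1":
            rel = "0"  # 'Related' is not relevant for binary measures
        out.append(f"{topic} {iteration} {item_id} {rel}")
    return out
```

The mapped file is suitable for binary measures only; NDCG-related measures should still be computed against the original graded judgments.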