rm3_negations — Retrieval Task

Submission Details

Organization: UAmsterdam
Track: Tip-of-the-Tongue Search
Task: Retrieval Task
Date: 2025-09-10

Run Description

Please describe in detail how this run was generated:
Corpus and index: TREC ToT 2025 Wikipedia JSONL; PyTerrier/Terrier index over title + full text.
Software/config: PyTerrier 0.10.0, Terrier 5.11, terrier-prf plugin; parse=false.
Query processing: Parser-safe normalization only; no hedge removal in this run.
Negation detection: We analyze the normalized query for negation cues (single-token cues: not/no/never/without/cannot; two-token not; split contractions like “don t”, “isn t”). After a cue, we capture up to 4 subsequent tokens and keep a span only if it contains an attribute head from data/neg_heads.txt (version/remake/year/language/color/cut/etc.).
Retrieval (pseudo-relevance feedback): BM25 with feedback depth 50 → RM3 (fb_docs=10, fb_terms=20) → BM25 final retrieval with 1000 results per query.
Negation-aware re-scoring: After the final BM25 stage, we penalize candidates whose title/lead contains a negated span: −2.0 if matched within ~first 128 chars; −1.0 if matched within ~first 400 chars. We do not remove terms or filter documents.
Ranking/output: Sort by adjusted score; ensure exactly 1000 results per query; TREC format with run_id rm3_negpen.
External resources/baselines: No LLMs or official baseline runfiles used.
Run type: Automatic.
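The negation detection and re-scoring steps described above can be sketched roughly as follows. This is a minimal illustration, not the run's actual code. The head list is a small stand-in for data/neg_heads.txt, the contraction stems are an assumed set, and the function names (negated_spans, negation_penalty, rescore) are hypothetical. The sketch also assumes that a span matches by case-insensitive substring search, that the position of a match is its start offset, and that only the strongest penalty tier applies once per document; the description does not specify these details.

```python
import re

# Stand-in for data/neg_heads.txt (assumption: the real file is not shown).
NEG_HEADS = {"version", "remake", "year", "language", "color", "cut"}
SINGLE_CUES = {"not", "no", "never", "without", "cannot"}
# Stems of split contractions such as "don t", "isn t" (assumed set).
CONTRACTION_STEMS = {"don", "doesn", "didn", "isn", "wasn", "aren", "weren"}
WINDOW = 4  # capture up to 4 tokens after a cue


def negated_spans(query):
    """Return token spans that follow a negation cue and contain an attribute head."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    spans = []
    i = 0
    while i < len(tokens):
        cue_len = 0
        if tokens[i] in SINGLE_CUES:
            cue_len = 1
        elif (tokens[i] in CONTRACTION_STEMS
              and i + 1 < len(tokens) and tokens[i + 1] == "t"):
            cue_len = 2
        if cue_len:
            window = tokens[i + cue_len:i + cue_len + WINDOW]
            # Keep the span only if it mentions an attribute head.
            if any(tok in NEG_HEADS for tok in window):
                spans.append(" ".join(window))
            i += cue_len
        else:
            i += 1
    return spans


def negation_penalty(text, spans):
    """Penalty for the earliest span match in the title/lead text (0.0 if none)."""
    lowered = text.lower()
    hits = [pos for pos in (lowered.find(s) for s in spans) if pos != -1]
    if not hits:
        return 0.0
    first = min(hits)
    if first < 128:
        return -2.0
    if first < 400:
        return -1.0
    return 0.0


def rescore(candidates, spans):
    """candidates: iterable of (docno, bm25_score, title_and_lead_text)."""
    adjusted = [(docno, score + negation_penalty(text, spans))
                for docno, score, text in candidates]
    # Sort by adjusted score, highest first; nothing is filtered out.
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)
```

Because the penalty is additive and no document is dropped, a penalized candidate can still appear in the final 1000 results, only at a lower rank.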
Specify datasets used in this run.: ['Other']
(if you checked "other", describe here): none
Are you 100% confident that no data from https://github.com/microsoft/Tip-of-the-Tongue-Known-Item-Retrieval-Dataset-for-Movie-Identification or iRememberThisMovie.com (besides the training data provided as part of this year's track) was used for producing this run (including any data used for pretraining models that you are building on top of)?: Yes I am confident that no data from those sources except the official track training data was used to produce this run
Did you use any of the official baseline runs in any way to produce this run?: no
If you did use any of the official baseline runs in any way to produce this run, please describe how below in sufficient detail (e.g., as reranking candidates or in ensemble with other approaches).

Evaluation Files

rm3_negations.trec_eval (trec_eval)

Paper