TREC 2025 Proceedings
bm25-porterblk-test
Submission Details
- Organization
- DUTH
- Track
- Tip-of-the-Tongue Search
- Task
- Retrieval Task
- Date
- 2025-08-31
Run Description
- Please describe in detail how this run was generated
- Automatic BM25 baseline using PyTerrier/Terrier over the official TREC ToT 2025 Wikipedia corpus. Index: Terrier with the Stopwords + PorterStemmer term pipeline, EnglishTokeniser, and blocks (positions) enabled.
Query processing: remove control characters and punctuation; keep at most 128 tokens (Terrier further truncates long queries). Retrieval: BM25, top 1000 results per query. No manual intervention on the test topics; parameters were verified on the provided dev splits.
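The query-cleaning step described above can be sketched in plain Python as follows. This is a minimal illustration, not the run's actual code: the function name `clean_query`, the exact control-character range, and the space-for-punctuation substitution are our assumptions; only the overall behavior (strip control characters and punctuation, cap at 128 tokens) comes from the description.

```python
import re
import string

MAX_TOKENS = 128  # cap stated in the run description; Terrier may truncate further

def clean_query(text: str) -> str:
    """Strip control characters and punctuation, then keep at most MAX_TOKENS tokens."""
    # Replace ASCII control characters with spaces (assumed range).
    text = re.sub(r"[\x00-\x1f\x7f]", " ", text)
    # Replace punctuation with spaces so adjacent tokens stay separated.
    text = text.translate(str.maketrans({c: " " for c in string.punctuation}))
    tokens = text.split()
    return " ".join(tokens[:MAX_TOKENS])
```

The cleaned string would then be passed to a standard PyTerrier BM25 retriever over the block index, requesting 1000 results per query.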
- Specify datasets used in this run.
- ["This year's TREC TOT training data"]
- (if you checked "other", describe here)
- Are you 100% confident that no data from https://github.com/microsoft/Tip-of-the-Tongue-Known-Item-Retrieval-Dataset-for-Movie-Identification or iRememberThisMovie.com (besides the training data provided as part of this year's track) was used for producing this run (including any data used for pretraining models that you are building on top of)?
- Yes I am confident that no data from those sources except the official track training data was used to produce this run
- Did you use any of the official baseline runs in any way to produce this run?
- No
- If you did use any of the official baseline runs in any way to produce this run, please describe how below in sufficient detail (e.g., as reranking candidates or in ensemble with other approaches).
- We did not use any official baseline run files in any way. We used only the official ToT-2025 corpus and the train/dev topics for tuning; no external run outputs were used.
Evaluation Files
Paper