TREC 2025 Proceedings

scrb-tot-04

Submission Details

Organization
SRCB
Track
Tip-of-the-Tongue Search
Task
Retrieval Task
Date
2025-09-08

Run Description

Please describe in detail how this run was generated
A pipeline composed of a Dense Retriever, a Reranker, and an LLM Reranker.

Query processing: all queries are converted to a list of cues by DeepSeek-V3.

Dense Retriever, based on Qwen/Qwen3-Embedding-8B:
- Movie domain: finetuned on movie data (augmented data based on train, dev1, and 5000 samples from the tomt-kis dataset); the index covers 500k+ movie docs filtered by Wikidata properties.
- Other domains: the original Qwen3-Embedding-8B is used to build the index over all docs.

Reranker: reranks the top 2000 results from the retriever.
- Movie domain: Qwen3-Reranker-8B finetuned on augmented data based on train, dev1, dev2, and 300 samples from the tomt-kis dataset.
- Other domains: Qwen3-Reranker-8B finetuned on augmented data based on train, dev1, dev2, samples from the tomt-kis dataset, and 1766 synthetic examples created by GLM-4-Plus. The queries of the synthetic data were created from the top-5 docs retrieved by our baseline system.

LLM Retriever: DeepSeek-R1 retrieves up to 10 Wikipedia entities, which are then aligned with doc ids in the corpus.

Listwise Reranker using DeepSeek-V3: for the movie domain only, we design a three-stage ranking pipeline. First, the LLM retrieval results are inserted into the candidate list starting from rank 6, while ranks 1–5 are preserved from the baseline ranking. Second, we apply DeepSeek-V3 in a listwise ranking setting to reorder candidates from rank 2 through rank 10. Third, from the resulting ranking, we select the top four titles and conduct a fine-grained reranking using GPT-5 with the analyze-ranking strategy; the final output is obtained from this refined ranking. For other domains, we replace the docs at ranks 11–20 from the reranker with the docs retrieved by the LLM Retriever, and then rerank the top-20 results (deduplicated).
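The movie-domain merging and three-stage reranking described above can be sketched as follows. This is a minimal illustration of the list manipulation only; `listwise_rerank` and `finegrained_rerank` are hypothetical callables standing in for the DeepSeek-V3 listwise and GPT-5 analyze-ranking calls, which are not shown.

```python
def merge_llm_candidates(baseline, llm_results, insert_at=5):
    """Insert LLM-retrieved doc ids starting at rank 6 (index 5),
    preserving ranks 1-5 from the baseline ranking and deduplicating."""
    head = baseline[:insert_at]
    merged = head + [d for d in llm_results if d not in head]
    # Append the remaining baseline docs, skipping any duplicates.
    merged += [d for d in baseline[insert_at:] if d not in merged]
    return merged

def three_stage_rerank(baseline, llm_results, listwise_rerank, finegrained_rerank):
    """Movie-domain pipeline sketch: merge LLM candidates, listwise-reorder
    ranks 2-10, then fine-grained rerank of the top four titles.
    The two reranking callables are placeholders for the LLM calls."""
    merged = merge_llm_candidates(baseline, llm_results)
    # Stage 2: reorder ranks 2..10 (indices 1..9); rank 1 stays fixed.
    middle = listwise_rerank(merged[1:10])
    ranking = [merged[0]] + middle + merged[10:]
    # Stage 3: fine-grained rerank of the top four titles.
    top4 = finegrained_rerank(ranking[:4])
    return top4 + ranking[4:]
```

The key invariants are that ranks 1–5 of the baseline are never displaced by the LLM candidates, and rank 1 is untouched by the listwise pass.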
Specify datasets used in this run.
["This year's TREC TOT training data", 'Other']
(if you checked "other", describe here)
Additional movie data from webis/tip-of-my-tongue-known-item-search-triplets. Wikidata dumps were used to filter documents in the movie domain. Synthetic data (docs and queries) was created based on the top-5 docs from the run of our baseline system.
Are you 100% confident that no data from https://github.com/microsoft/Tip-of-the-Tongue-Known-Item-Retrieval-Dataset-for-Movie-Identification or iRememberThisMovie.com (besides the training data provided as part of this year's track) was used for producing this run (including any data used for pretraining models that you are building on top of)?
no
Did you use any of the official baseline runs in any way to produce this run?
no
If you did use any of the official baseline runs in any way to produce this run, please describe how below in sufficient detail (e.g., as reranking candidates or in ensemble with other approaches).

Evaluation Files

Paper