TREC 2025 Proceedings

garamp_zephyr7b_t2

Submission Details

Organization
DUTH
Track
Detection, Retrieval, and Generation for Understanding News
Task
Report Generation Task
Date
2025-08-23

Run Description

Is this run manual or automatic?
automatic
Is this run based on the provided starter kit?
no
Briefly describe this run
BM25 retrieval with Pyserini over the MS MARCO V2.1 (Segmented) Lucene index. For each topic we retrieve k=40 segments and keep up to 10 evidence passages after de-dup/length filtering. A single LLM pass (Zephyr-7B-Beta) produces a ≤250-word report in ~4 sentences; each sentence cites up to 3 MS MARCO segment docids. Post-processing validates JSON, clips citations to ≤3, and aligns outputs 1:1 with the official topics list.
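The evidence-selection step described above (retrieve k=40 segments, then de-dup and length-filter down to at most 10 passages) can be sketched as follows. This is an illustrative reconstruction, not the run's actual code: the exact dedup key, the minimum-length threshold, and the function name `select_evidence` are assumptions.

```python
# Hypothetical sketch of the evidence-selection step: from the k=40
# BM25-ranked segments, drop exact duplicates and overly short passages,
# and keep up to 10. Thresholds are illustrative assumptions.

def select_evidence(hits, max_keep=10, min_words=20):
    """hits: list of (docid, text) pairs, ranked by BM25 score."""
    seen_texts = set()
    kept = []
    for docid, text in hits:
        norm = " ".join(text.lower().split())  # normalize whitespace/case
        if norm in seen_texts:                 # duplicate filter
            continue
        if len(norm.split()) < min_words:      # length filter
            continue
        seen_texts.add(norm)
        kept.append((docid, text))
        if len(kept) == max_keep:
            break
    return kept
```

Because the input is already rank-ordered, truncating at `max_keep` keeps the highest-scoring distinct passages.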
What other datasets or services (e.g. Google/Bing web search, ChatGPT, Perplexity, etc.) were used in producing the run?
MS MARCO V2.1 (Segmented) prebuilt Lucene index; Pyserini/Anserini (Lucene, Java 21); Hugging Face Transformers; local GPU inference. No external web data beyond the MS MARCO collection.
Briefly describe LLMs used for this run (optional)
HuggingFaceH4/zephyr-7b-beta via HF Transformers (temperature≈0.2, top_p≈0.9, max_new_tokens capped to fit the context window). The instruction asks the model to write a well-attributed trustworthiness report grounded only in the provided MS MARCO segments: discuss source bias/motivation, assess the cited evidence, and present alternative viewpoints. Hard constraints: 3–5 sentences totaling ≤250 words; up to 3 MS MARCO segment docids as citations per sentence; skip unsupported points rather than speculate. Outputs are validated and aligned to the topic list.
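The post-processing pass (validate JSON, clip citations to ≤3 per sentence, enforce the 250-word cap) can be sketched as below. The JSON field names `sentences`, `text`, and `citations` are assumptions about the output schema, not the run's actual format.

```python
import json

# Hypothetical sketch of the post-processing pass: parse the model's JSON
# report, clip each sentence's citation list to at most 3 docids, and
# verify the whole report stays within 250 words. Field names are assumed.

MAX_CITES = 3
MAX_WORDS = 250

def postprocess(raw_json):
    report = json.loads(raw_json)  # raises if the model emitted bad JSON
    for sent in report["sentences"]:
        sent["citations"] = sent["citations"][:MAX_CITES]  # clip to <=3
    total_words = sum(len(s["text"].split()) for s in report["sentences"])
    if total_words > MAX_WORDS:
        raise ValueError(f"report too long: {total_words} words")
    return report
```

A report failing either check would be regenerated or truncated before alignment with the official topics list.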
Please give this run a priority for inclusion in manual assessments.
5 (bottom)

Evaluation Files

Paper