TREC 2025 Proceedings

bm25_NITA_JH

Submission Details

Organization
NITATREC
Track
Retrieval-Augmented Generation
Task
Retrieval Only Task
Date
2025-08-17

Run Description

Is this a manual (human intervention) or automatic run?
automatic
Does this run leverage neural networks?
no
Does this run leverage proprietary models in any step of the retrieval pipeline?
no
Does this run leverage open-weight LLMs (> 5B parameters) in any step of the retrieval pipeline?
no
Does this run leverage smaller open-weight language models in any step of the retrieval pipeline?
no
Was this run padded with results from a baseline run?
no
What would you categorize this run as?
Traditional Only
Please provide a short description of this run
This run applies a BM25 retrieval pipeline using Pyserini over the MS MARCO v2.1 segmented corpus. A Lucene index was built with positional information, document vectors, and raw text storage enabled, and queries were converted to TSV format for compatibility. Retrieval used BM25 (k1=1.2, b=0.75), returning the top 100 ranked segments per query, and the output was written in the standard TREC run file format.
Please give this run a priority for inclusion in manual assessments.
2
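
Illustrative sketch (not the submitted code)

The run description names BM25 scoring with k1=1.2 and b=0.75, a top-100 cutoff, and TREC run file output. The following is a minimal, self-contained sketch of those three steps in plain Python, not the Pyserini/Lucene pipeline the run actually used. It assumes whitespace tokenization and a textbook BM25 variant; Lucene's implementation may differ in details such as analysis and length normalization. The function names, the tag "bm25_sketch", and the toy documents are hypothetical.

```python
import math
from collections import Counter

K1 = 1.2   # term-frequency saturation, as stated in the run description
B = 0.75   # length normalization, as stated in the run description


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for a real analyzer.
    return text.lower().split()


def build_stats(docs):
    """Compute per-document tokens, document frequencies, and average length."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))  # count each term once per document
    avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
    return tokens, doc_freq, avg_len


def bm25_score(query, doc_tokens, doc_freq, num_docs, avg_len, k1=K1, b=B):
    """Textbook BM25 score of one document for one query."""
    tf = Counter(doc_tokens)
    norm = k1 * (1 - b + b * len(doc_tokens) / avg_len)
    score = 0.0
    for term in tokenize(query):
        if term not in tf:
            continue
        n = doc_freq[term]
        idf = math.log(1 + (num_docs - n + 0.5) / (n + 0.5))
        score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
    return score


def search(query, docs, k=100):
    """Return up to k (doc_id, score) pairs, best first, ties broken by doc_id."""
    tokens, doc_freq, avg_len = build_stats(docs)
    scored = [
        (doc_id, bm25_score(query, toks, doc_freq, len(docs), avg_len))
        for doc_id, toks in tokens.items()
    ]
    scored = [(d, s) for d, s in scored if s > 0]  # keep matching docs only
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def trec_run_lines(qid, hits, tag="bm25_sketch"):
    """Format hits as TREC run lines: qid Q0 docid rank score tag."""
    return [
        f"{qid} Q0 {doc_id} {rank} {score:.4f} {tag}"
        for rank, (doc_id, score) in enumerate(hits, start=1)
    ]


if __name__ == "__main__":
    toy_docs = {
        "d1": "bm25 ranking function",
        "d2": "neural ranking model",
        "d3": "cooking recipes",
    }
    for line in trec_run_lines("q1", search("bm25 ranking", toy_docs)):
        print(line)
```

The default k=100 mirrors the top-100 cutoff in the description. Each output line carries the six whitespace-separated fields of a TREC run file: query ID, the literal Q0, document ID, rank, score, and run tag.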

Evaluation Files