TREC 2025 Proceedings
NITA_R_DPR
Submission Details
- Organization
- NITATREC
- Track
- Retrieval-Augmented Generation
- Task
- Retrieval Only Task
- Date
- 2025-08-17
Run Description
- Is this a manual (human intervention) or automatic run?
- automatic
- Does this run leverage neural networks?
- yes
- Does this run leverage proprietary models in any step of the retrieval pipeline?
- no
- Does this run leverage open-weight LLMs (> 5B parameters) in any step of the retrieval pipeline?
- no
- Does this run leverage smaller open-weight language models in any step of the retrieval pipeline?
- yes
- Was this run padded with results from a baseline run?
- no
- What would you categorize this run as?
- Multi-Stage Pipeline pointwise
- Please provide a short description of this run
- This submission implements a three-stage hybrid retrieval pipeline for TREC RAG 2025. Stage 1 uses pre-computed BM25 results to select the top 500 lexical candidates per query. Stage 2 applies DPR semantic filtering, using Facebook's dpr-question_encoder-single-nq-base and dpr-ctx_encoder-single-nq-base models, to reduce the candidates to the top 200 by cosine similarity of their dense embeddings. Stage 3 reranks the remaining candidates with cross-encoder/ms-marco-MiniLM-L-12-v2 to produce the final relevance scores. The system processes 105 queries with batched GPU inference (batch size 256) and returns the top 100 results per query, combining lexical matching with transformer-based semantic scoring.
- Please give this run a priority for inclusion in manual assessments.
- 3
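The three stages in the run description can be sketched as follows. This is a minimal illustration, not the submitted system: the function and parameter names (three_stage_rank, embed_query, embed_doc, cross_score) are hypothetical, and the toy scoring functions are stand-ins for the DPR encoders and the MiniLM cross-encoder named above. Only the cutoffs (500, 200, 100) come from the description.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[str, float, str]  # (doc_id, bm25_score, passage_text)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def three_stage_rank(
    query: str,
    bm25_hits: List[Hit],
    embed_query: Callable[[str], Vector],      # stand-in for the DPR question encoder
    embed_doc: Callable[[str], Vector],        # stand-in for the DPR context encoder
    cross_score: Callable[[str, str], float],  # stand-in for the cross-encoder
    k_lexical: int = 500,
    k_dense: int = 200,
    k_final: int = 100,
) -> List[Tuple[str, float]]:
    # Stage 1: keep the top lexical candidates by pre-computed BM25 score.
    stage1 = sorted(bm25_hits, key=lambda h: h[1], reverse=True)[:k_lexical]

    # Stage 2: filter by cosine similarity between query and passage embeddings.
    q_vec = embed_query(query)
    stage2 = sorted(
        stage1, key=lambda h: cosine(q_vec, embed_doc(h[2])), reverse=True
    )[:k_dense]

    # Stage 3: pointwise reranking; each (query, passage) pair is scored alone.
    scored = [(doc_id, cross_score(query, text)) for doc_id, _, text in stage2]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k_final]


# Toy stand-ins (assumptions for illustration only, not the real models).
VOCAB = ["cat", "dog", "fish"]


def toy_embed(text: str) -> Vector:
    """Bag-of-words counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def toy_cross(query: str, text: str) -> float:
    """Jaccard overlap between query tokens and passage tokens."""
    q, d = set(query.lower().split()), set(text.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0


hits = [
    ("d1", 3.0, "cat cat dog"),
    ("d2", 2.5, "fish fish"),
    ("d3", 2.0, "cat"),
    ("d4", 0.1, "dog dog"),
]
ranking = three_stage_rank(
    "cat", hits, toy_embed, toy_embed, toy_cross,
    k_lexical=3, k_dense=2, k_final=2,
)
print(ranking)
```

Passing the scorers in as callables keeps the cascade logic separate from model loading, so the same skeleton would apply whether the scores come from toy functions or from batched GPU inference.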
Evaluation Files