TREC 2025 Proceedings

NITA_R_DPR

Submission Details

Organization
NITATREC
Track
Retrieval-Augmented Generation
Task
Retrieval Only Task
Date
2025-08-17

Run Description

Is this a manual (human intervention) or automatic run?
automatic
Does this run leverage neural networks?
yes
Does this run leverage proprietary models in any step of the retrieval pipeline?
no
Does this run leverage open-weight LLMs (> 5B parameters) in any step of the retrieval pipeline?
no
Does this run leverage smaller open-weight language models in any step of the retrieval pipeline?
yes
Was this run padded with results from a baseline run?
no
What would you categorize this run as?
Multi-Stage Pipeline pointwise
Please provide a short description of this run
This submission implements a three-stage hybrid retrieval pipeline for TREC RAG 2025. Stage 1 takes the top 500 lexical candidates per query from pre-computed BM25 results. Stage 2 applies DPR semantic filtering: queries are encoded with Facebook's dpr-question_encoder-single-nq-base and passages with dpr-ctx_encoder-single-nq-base, and the top 200 candidates by cosine similarity of the dense embeddings are kept. Stage 3 reranks these candidates with cross-encoder/ms-marco-MiniLM-L-12-v2 for final relevance scoring and returns the top 100 results per query. The system processes 105 queries using GPU batch inference (batch size 256).
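
The three-stage cascade described above can be sketched as follows. This is a minimal illustration, not the run's actual code: the function names (cosine, cascade), the parameter names, and the toy data are all hypothetical, and the DPR encoders and the cross-encoder are replaced by plain scoring callables so that the control flow (500 to 200 to 100) is visible without loading any model.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two dense vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cascade(
    bm25_ranked: Sequence[str],            # doc ids, already sorted by BM25 score
    dense_score: Callable[[str], float],   # stand-in for DPR cosine similarity
    rerank_score: Callable[[str], float],  # stand-in for the cross-encoder score
    k_bm25: int = 500,
    k_dense: int = 200,
    k_final: int = 100,
) -> List[Tuple[str, float]]:
    # Stage 1: keep the top lexical candidates from the pre-computed ranking.
    stage1 = list(bm25_ranked)[:k_bm25]
    # Stage 2: keep the candidates with the highest dense similarity.
    stage2 = sorted(stage1, key=dense_score, reverse=True)[:k_dense]
    # Stage 3: score each survivor once with the reranker, then sort and cut.
    scored = [(doc_id, rerank_score(doc_id)) for doc_id in stage2]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k_final]


# Toy data (assumed for illustration only): 2-d "embeddings" and fixed scores.
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.8, 0.6], "d3": [0.0, 1.0], "d4": [-1.0, 0.0]}
query_vec = [1.0, 0.0]
toy_rerank = {"d1": 0.2, "d3": 0.9}

result = cascade(
    bm25_ranked=["d3", "d1", "d4", "d2"],
    dense_score=lambda d: cosine(query_vec, doc_vecs[d]),
    rerank_score=lambda d: toy_rerank.get(d, 0.0),
    k_bm25=3,
    k_dense=2,
    k_final=1,
)
```

In a real run the two callables would wrap batched GPU inference over the encoder and cross-encoder outputs; each stage only ever scores the candidates that survived the previous one, which is what keeps the more expensive cross-encoder limited to 200 passages per query.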
Please give this run a priority for inclusion in manual assessments.
3

Evaluation Files