TREC 2025 Proceedings

infolab_UD_run2

Submission Details

Organization
UDInfo
Track
Million LLM
Task
LLM Ranking Task
Date
2025-09-22

Run Description

Is the run manual or automatic?
automatic
Did you use the response metadata?
no
Did you use any additional data or external knowledge?
yes
Did you use the development set?
yes
Did you train on the development set?
no
Provide a description of this run, including details about your answers above.
Run 2: Hybrid ranking model with five main components:
1. Weak labeling: the discovery datasets were labeled using an LLM as a judge, scoring how well each answer addressed its query.
2. Global expertise: a global reference expertise score was computed for every LLM.
3. Query encoding: queries were embedded into a vector space.
4. Local similarity (KNN): for a given query, retrieve the top-k most similar training queries.
5. Domain expertise (clusters): all training queries are clustered in the embedding space.
At inference: a new query is encoded; local similarity (KNN) over training queries yields a local score; similarity to cluster centroids yields a cluster score; the final score is a weighted combination of the local and cluster scores.
Run type: automatic and external. No manual intervention was performed after seeing the test queries, and the run is external because an LLM was used for weak labeling.
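The inference-time scoring described above can be sketched as follows. This is a minimal illustration, not the submitted system: the function name `rank_llms`, the use of cosine similarity for the KNN step, Euclidean distance to centroids, mean aggregation over neighbors, and the weight `alpha` are all assumptions introduced here for clarity.

```python
import numpy as np

def rank_llms(query_emb, train_embs, train_llm_scores,
              centroids, centroid_llm_scores, k=3, alpha=0.6):
    """Score each LLM for a query by combining a local KNN score with a
    cluster-centroid (domain expertise) score, then rank LLMs by it.

    train_llm_scores[i, j]    = weak-label score of LLM j on training query i
    centroid_llm_scores[c, j] = expertise score of LLM j for cluster c
    """
    # Local similarity: cosine similarity of the query to every training query
    sims = train_embs @ query_emb / (
        np.linalg.norm(train_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    topk = np.argsort(sims)[-k:]                 # indices of the k nearest queries
    local = train_llm_scores[topk].mean(axis=0)  # mean LLM score over neighbors

    # Domain expertise: LLM scores of the nearest cluster centroid
    dists = np.linalg.norm(centroids - query_emb, axis=1)
    cluster = centroid_llm_scores[np.argmin(dists)]

    # Final score: weighted combination of local and cluster scores
    final = alpha * local + (1 - alpha) * cluster
    return np.argsort(final)[::-1], final        # ranked LLM indices, scores
```

A weighted sum keeps the two signals independently tunable: `alpha` near 1 trusts nearby training queries, while lower values lean on the coarser per-cluster expertise.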
Priority for pooling
2

Evaluation Files

Paper