TREC 2025 Proceedings

infolab_UD_run1

Submission Details

Organization
UDInfo
Track
Million LLM
Task
LLM Ranking Task
Date
2025-09-21

Run Description

Is the run manual or automatic?
automatic
Did you use the response metadata?
no
Did you use any additional data or external knowledge?
no
Did you use the development set?
yes
Did you train on the development set?
yes
Provide a description of this run, including details about your answers above.
Run 1: Hierarchical single-index ranking model with two main components.

1. Cluster Profiles (broad scoring): answers provided on the discovery datasets were embedded, and the embeddings were clustered; the resulting centroids represent each LLM's areas of expertise.
2. Answer Embedding Index (refine scoring): each LLM's answers were stored in an index for fast nearest-neighbor retrieval.

At inference, a new query is encoded and compared to each LLM's centroids; the maximum similarity gives the broad score. The query is also searched against each LLM's answer index, and the average similarity of the top-k retrieved answers gives the refine score. The final score is a weighted combination of the broad and refine scores.

Run type: automatic and internal. No manual intervention was performed after seeing the test queries, and only the discovery data was used to build profiles.
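The two-stage scoring described above can be sketched as follows. This is an illustrative reconstruction, not the submitted system: the embedding dimensionality, the number of clusters, the top-k value, and the combination weight `alpha` are all assumptions, and a simple k-means loop stands in for whatever clustering and indexing tools were actually used.

```python
# Hypothetical sketch of the hierarchical broad + refine scoring pipeline.
# All parameter values (n_clusters, k, alpha) are illustrative assumptions.
import numpy as np

def cosine(vec, mat):
    # Cosine similarity between one vector and each row of a matrix.
    vec = vec / np.linalg.norm(vec)
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    return mat @ vec

def build_profiles(answer_embs, n_clusters=3, seed=0):
    # Broad component: cluster one LLM's answer embeddings into centroids
    # (plain k-means with cosine assignment; any clustering library works).
    rng = np.random.default_rng(seed)
    cents = answer_embs[rng.choice(len(answer_embs), n_clusters, replace=False)]
    for _ in range(10):
        assign = np.argmax(np.stack([cosine(c, answer_embs) for c in cents]), axis=0)
        for j in range(n_clusters):
            if np.any(assign == j):
                cents[j] = answer_embs[assign == j].mean(axis=0)
    return cents

def score_llm(query_emb, centroids, answer_index, k=5, alpha=0.5):
    # Broad score: max similarity of the query to any cluster centroid.
    broad = cosine(query_emb, centroids).max()
    # Refine score: mean similarity of the top-k nearest stored answers.
    sims = cosine(query_emb, answer_index)
    refine = np.sort(sims)[-k:].mean()
    # Final score: weighted combination of broad and refine components.
    return alpha * broad + (1 - alpha) * refine
```

Scoring every candidate LLM with `score_llm` and sorting descending would yield the ranking; in practice the answer index would be an approximate nearest-neighbor structure rather than a brute-force matrix scan.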
Priority for pooling
1 (top)

Evaluation Files

Paper