TREC 2025 Proceedings
infolab_UD_run1
Submission Details
- Organization
- UDInfo
- Track
- Million LLM
- Task
- LLM Ranking Task
- Date
- 2025-09-21
Run Description
- Is the run manual or automatic?
- automatic
- Did you use the response metadata?
- no
- Did you use any additional data or external knowledge?
- no
- Did you use the development set?
- yes
- Did you train on the development set?
- yes
- Provide a description of this run, including details about your answers above.
- Run 1: Hierarchical single-index ranking model. Consists of two main components:
1. Cluster Profiles (Broad Scoring): Answers provided on the discovery datasets were embedded, and the embeddings were clustered to create centroids representing the expertise areas of each LLM.
2. Answer Embedding Index (Refine Scoring): Each LLM's answers were stored in an index for fast nearest-neighbor retrieval.
At inference: a new query is encoded and compared to each LLM's cluster centroids; the maximum similarity gives the broad score. The query is also searched against each LLM's answer index; the top-k most similar answers are retrieved and their similarities averaged to give the refine score. The final score is a weighted combination of the broad and refine scores.
*Run Type: Automatic and internal; no manual intervention was performed after seeing the test queries, and only the discovery data was used to build profiles.
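The scoring pipeline described above can be sketched as follows. This is a minimal illustration, not the submitted implementation: the encoder is replaced by random stand-in embeddings, and the combination weight `alpha` and neighborhood size `k` are assumed parameters not specified in the run description.

```python
import numpy as np

def cosine_sims(query_vec, matrix):
    # Cosine similarity between one query vector and each row of a matrix.
    q = query_vec / np.linalg.norm(query_vec)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return m @ q

def score_llm(query_vec, centroids, answer_index, k=5, alpha=0.5):
    """Score one LLM for a query.

    centroids:     (n_clusters, d) cluster-profile centroids (broad component).
    answer_index:  (n_answers, d) stored answer embeddings (refine component).
    k, alpha:      assumed hyperparameters; the run description does not give values.
    """
    # Broad score: max similarity to any expertise-cluster centroid.
    broad = cosine_sims(query_vec, centroids).max()
    # Refine score: mean similarity of the top-k nearest stored answers.
    sims = cosine_sims(query_vec, answer_index)
    refine = np.sort(sims)[-k:].mean()
    # Final score: weighted combination of both components.
    return alpha * broad + (1 - alpha) * refine

# Toy demo with random vectors standing in for a real text encoder.
rng = np.random.default_rng(0)
d = 64
centroids = rng.normal(size=(8, d))      # one LLM's cluster profile
answers = rng.normal(size=(200, d))      # that LLM's answer index
query = rng.normal(size=d)
final_score = score_llm(query, centroids, answers)
```

In a full system, `score_llm` would be evaluated once per LLM and the LLMs ranked by `final_score`; a real deployment would replace the brute-force similarity search over `answer_index` with an approximate nearest-neighbor index for speed.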
- Priority for pooling
- 1 (top)
Evaluation Files
Paper