TREC 2025 Proceedings

lambdamart_profiles

Submission Details

Organization
paulphys
Track
Million LLM
Task
LLM Ranking Task
Date
2025-09-22

Run Description

Is the run manual or automatic?
automatic
Did you use the response metadata?
yes
Did you use any additional data or external knowledge?
yes
Did you use the development set?
yes
Did you train on the development set?
yes
Provide a description of this run, including details about your answers above.
This run uses listwise learning-to-rank with LambdaMART to predict query-LLM relevance. Each LLM is represented by a behavioral profile, built with the Qwen3-Embedding-8B model, that encodes domain expertise (10 categories), task capabilities (7 types), and response metrics (log-probs, refusal rate, confidence). For each query, a 64-dimensional feature vector is constructed by combining profile features with query and statistical descriptors.
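For readers unfamiliar with LambdaMART, the following is a minimal sketch of the listwise signal it optimizes, not the code used for this run. It computes LambdaRank-style gradients for the candidate LLMs of a single query, weighting each pairwise preference by the NDCG change a swap would cause; LambdaMART fits gradient-boosted trees to these values. The function names, the `sigma` parameter, the graded relevance labels, and the example scores are all assumptions made for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of labels listed in ranked order."""
    return sum((2 ** r - 1) / math.log2(pos + 2) for pos, r in enumerate(relevances))


def lambda_gradients(scores, relevances, sigma=1.0):
    """LambdaRank-style gradients for the candidates of one query.

    scores:     current model scores, one per candidate (hypothetical values).
    relevances: graded relevance labels, one per candidate (hypothetical values).
    A positive gradient means the candidate's score should be pushed up.
    """
    n = len(scores)
    lambdas = [0.0] * n
    ideal = dcg(sorted(relevances, reverse=True))
    if ideal == 0:
        return lambdas  # no relevant candidate, nothing to learn from this query

    # 0-based rank of every candidate under the current scores.
    order = sorted(range(n), key=lambda i: -scores[i])
    rank = {doc: pos for pos, doc in enumerate(order)}

    for i in range(n):
        for j in range(n):
            if relevances[i] <= relevances[j]:
                continue  # only pairs where i should be ranked above j
            # Pairwise logistic term: large when j currently outscores i.
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            # |Delta NDCG| if the two candidates swapped rank positions.
            gain_diff = 2 ** relevances[i] - 2 ** relevances[j]
            disc_diff = abs(1.0 / math.log2(rank[i] + 2) - 1.0 / math.log2(rank[j] + 2))
            delta_ndcg = gain_diff * disc_diff / ideal
            lambdas[i] += sigma * rho * delta_ndcg
            lambdas[j] -= sigma * rho * delta_ndcg
    return lambdas


if __name__ == "__main__":
    # Three hypothetical candidate LLMs for one query.
    print(lambda_gradients(scores=[0.1, 0.9, 0.5], relevances=[2, 0, 1]))
```

In this example the most relevant candidate currently has the lowest score, so it receives a positive gradient and the irrelevant top-scored candidate a negative one. In practice an off-the-shelf gradient-boosting library with a LambdaMART objective would compute these internally; the run description does not say which implementation was used.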
Priority for pooling
1 (top)

Evaluation Files