TREC 2025 Proceedings

infolab_UD_run2

Submission Details

Organization
UDInfo
Track
Million LLM
Task
LLM Ranking Task
Date
2025-09-22

Run Description

Is the run manual or automatic?
automatic
Did you use the response metadata?
no
Did you use any additional data or external knowledge?
yes
Did you use the development set?
yes
Did you train on the development set?
no
Provide a description of this run, including details about your answers above.
Run 2: Hybrid ranking model with five main components:
1. Weak labeling: the discovery datasets were labeled using an LLM as a judge, scoring how well each answer addressed its query.
2. Global expertise: a global reference expertise score was computed for every LLM.
3. Query encoding: queries were embedded into a vector space.
4. Local similarity (KNN): for a given query, retrieve the top-k most similar training queries.
5. Domain expertise (clusters): all training queries are clustered in the embedding space.
At inference: a new query is encoded; local similarity (KNN) over training queries yields a local score; similarity to cluster centroids yields a cluster score; the final score is a weighted combination of the local and cluster scores.
Run type: automatic and external. No manual intervention was performed after seeing the test queries, and the run is external because an LLM was used for weak labeling.
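The inference-time scoring described above can be sketched as follows. This is a minimal illustration, not the submitted system: the function name `rank_llms`, the use of cosine similarity for the KNN step, Euclidean distance to centroids, mean aggregation over neighbors, and the weight `alpha` are all assumptions introduced here for clarity.

```python
import numpy as np

def rank_llms(query_emb, train_embs, train_llm_scores,
              centroids, centroid_llm_scores, k=3, alpha=0.6):
    """Score each LLM for a query by combining a local KNN score with a
    cluster-centroid (domain expertise) score, then rank LLMs by it.

    train_llm_scores[i, j]    = weak-label score of LLM j on training query i
    centroid_llm_scores[c, j] = expertise score of LLM j for cluster c
    """
    # Local similarity: cosine similarity of the query to every training query
    sims = train_embs @ query_emb / (
        np.linalg.norm(train_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    topk = np.argsort(sims)[-k:]                 # indices of the k nearest queries
    local = train_llm_scores[topk].mean(axis=0)  # mean LLM score over neighbors

    # Domain expertise: LLM scores of the nearest cluster centroid
    dists = np.linalg.norm(centroids - query_emb, axis=1)
    cluster = centroid_llm_scores[np.argmin(dists)]

    # Final score: weighted combination of local and cluster scores
    final = alpha * local + (1 - alpha) * cluster
    return np.argsort(final)[::-1], final        # ranked LLM indices, scores
```

A weighted sum keeps the two signals independently tunable: `alpha` near 1 trusts nearby training queries, while lower values lean on the coarser per-cluster expertise.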
Priority for pooling
2

Evaluation Files

Paper