non-neural — LLM Ranking Task

Is the run manual or automatic?: automatic
Did you use the response metadata?: no
Did you use any additional data or external knowledge?: no
Did you use the development set?: yes
Did you train on the development set?: no
Provide a description of this run, including details about your answers above.: This run is a non-neural ensemble that combines three ranking components. (1) Bradley-Terry model: estimates a global skill score for each LLM from ~15,000 discovery queries by scoring response quality (length, structure, reasoning, examples) and computing pairwise win/loss ratios. (2) Enhanced Random Forest: predicts query-specific relevance using 150 trees trained on 250k examples, with features that include TF-IDF query vectors, LLM embeddings, query complexity metrics, and discovery data profiles. (3) LightGBM meta-ranker: learns ranking patterns from 17 meta-features that combine judge scores, Bradley-Terry skills, query characteristics, and cross-component interactions.
Priority for pooling: 5 (bottom)
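
Illustrative note on the Bradley-Terry component named in the run description: the sketch below shows one common way to turn pairwise win/loss counts into global skill scores, using the standard minorization-maximization (MM) update. It is not the run's actual code. The function name `fit_bradley_terry`, the input format (a dict of `(winner, loser)` counts), and the toy counts are all assumptions made for illustration; how the run derives wins from response quality is not reproduced here.

```python
from collections import defaultdict


def fit_bradley_terry(wins, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry skills from pairwise win counts.

    wins: dict mapping (winner, loser) -> number of wins (hypothetical format).
    Returns a dict item -> skill, scaled so the skills sum to the item count.
    """
    items = sorted({x for pair in wins for x in pair})
    if not items:
        return {}

    total_wins = defaultdict(float)  # item -> total wins
    games = defaultdict(float)       # unordered pair -> comparisons played
    for (winner, loser), count in wins.items():
        total_wins[winner] += count
        games[frozenset((winner, loser))] += count

    skill = {i: 1.0 for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            # MM update: wins_i / sum_j n_ij / (skill_i + skill_j)
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = games.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (skill[i] + skill[j])
            updated[i] = total_wins[i] / denom if denom else skill[i]

        # Rescale: skills are only identified up to a constant factor.
        scale = len(items) / sum(updated.values())
        updated = {i: v * scale for i, v in updated.items()}

        delta = max(abs(updated[i] - skill[i]) for i in items)
        skill = updated
        if delta < tol:
            break
    return skill


# Toy counts for three hypothetical systems (made-up numbers).
toy_wins = {
    ("A", "B"): 8, ("B", "A"): 2,
    ("B", "C"): 7, ("C", "B"): 3,
    ("A", "C"): 9, ("C", "A"): 1,
}
toy_skills = fit_bradley_terry(toy_wins)
toy_ranking = sorted(toy_skills, key=toy_skills.get, reverse=True)
```

Under the model, the probability that system i beats system j is skill_i / (skill_i + skill_j), so the fitted skills can be used directly as a query-independent prior or passed on as a feature to a downstream ranker.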