TREC 2025 Proceedings
jcru-ansR
Submission Details
- Organization
- HLTCOE
- Track
- Retrieval-Augmented Generation
- Task
- Relevance Judgment subtask
- Date
- 2025-08-18
Run Description
- Is this a manual (human intervention) or automatic run?
- automatic
- Does this run leverage neural networks?
- yes
- Does this run leverage proprietary models in any step of the retrieval pipeline?
- yes
- Does this run leverage open-weight LLMs (> 5B parameters) in any step of the retrieval pipeline?
- yes
- Does this run leverage smaller open-weight language models in any step of the retrieval pipeline?
- no
- What would you categorize this run as?
- Generation-in-the-loop Pipeline
- Please provide a short description of this run
- Crucible@rag25
Original run tag: filtered-covered-covextr-crucible-retrieved_docs-rag25_qwen3_merged_questions-retrieved-qwen3_32b.retrieved_docs.jsonl-SupportedAnswerExtractorRequest
Question-answering prompt. Filtering with argue_eval.
Crucible report generation.
Guiding nuggets: most_common
Document source: nugget citations.
Nugget extraction prompt "SupportedAnswerExtractorAll" on collection "ragtime-mt"
LLM: llama3.3-70b-instruct
Sentences are retained when their citations are supported, at least one nugget covers the summary sentence, and at least one nugget covers the extracted document segment, according to argue_eval.
The frequency with which a cited document is cited across retained sentences is used as its relevance score.
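The frequency-based scoring step can be sketched as follows. This is a minimal illustration, not the run's actual code; the function name and the input representation (each retained sentence as a list of cited document IDs) are assumptions.

```python
from collections import Counter

def relevance_scores(retained_sentences):
    """Score each cited document by how many retained sentences cite it.

    `retained_sentences`: a list of retained summary sentences, each
    represented as a list of cited document IDs (assumed representation).
    Returns (doc_id, score) pairs, highest citation count first.
    """
    counts = Counter(
        doc_id
        for citations in retained_sentences
        for doc_id in citations
    )
    # Break ties deterministically by document ID.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Documents cited by many retained sentences then rank above documents cited only once, which matches using citation frequency as the relevance score.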
- Please give this run a priority for inclusion in manual assessments.
- 1 (top)
Evaluation Files
Paper