TREC 2025 Proceedings

jcru-ansR-all

Submission Details

Organization
HLTCOE
Track
Retrieval-Augmented Generation
Task
Relevance Judgment subtask
Date
2025-08-18

Run Description

Is this a manual (human intervention) or automatic run?
automatic
Does this run leverage neural networks?
yes
Does this run leverage proprietary models in any step of the retrieval pipeline?
yes
Does this run leverage open-weight LLMs (> 5B parameters) in any step of the retrieval pipeline?
yes
Does this run leverage smaller open-weight language models in any step of the retrieval pipeline?
no
What would you categorize this run as?
Generation-in-the-loop Pipeline
Please provide a short description of this run
Crucible@rag25. Original run tag: crucible-retrieved_docs-rag25_qwen3_merged_questions-retrieved-qwen3_32b.retrieved_docs.jsonl-SupportedAnswerExtractorRequest. Question-answering prompt; no filtering with argue_eval. Crucible report generation. Guiding nuggets: most_common. Document source: nugget citations. Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection "ragtime-mt". LLM: llama3.3-70b-instruct. The frequency with which a document is cited by answer sentences is used as its relevance score.
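The final scoring step described above (citation frequency as relevance score) can be sketched as follows. This is a minimal illustration only; the field name "citations" and the record shape are hypothetical, since the run's actual JSONL schema is not given in this submission.

```python
from collections import Counter

def citation_frequency_scores(sentences):
    """Score each cited document by how many answer sentences cite it.

    `sentences` is a list of dicts, each with a "citations" list of
    document IDs (an assumed record shape, not the run's real schema).
    Returns (doc_id, score) pairs, highest citation frequency first.
    """
    counts = Counter()
    for sent in sentences:
        for doc_id in sent.get("citations", []):
            counts[doc_id] += 1
    # Higher citation frequency => higher relevance score.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

# Example: doc1 is cited by two sentences, so it ranks first.
sentences = [
    {"citations": ["doc1", "doc2"]},
    {"citations": ["doc1"]},
    {"citations": ["doc3"]},
]
ranked = citation_frequency_scores(sentences)
```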
Please give this run a priority for inclusion in manual assessments.
2

Evaluation Files

Paper