TREC 2025 Proceedings
cru-ablR-conf_
Submission Details
- Organization
- HLTCOE
- Track
- Detection, Retrieval, and Generation for Understanding News (DRAGUN)
- Task
- Report Generation Task
- Date
- 2025-08-20
Run Description
- Is this run manual or automatic?
- automatic
- Is this run based on the provided starter kit?
- yes
- Briefly describe this run
- Crucible@dragun
Original run tag: strict-filtered-crucible-retrieved_docs-most_common-retrieved-reranker.retrieved_docs.jsonl-SupportedAnswerabilityExtractorRequest
Answerability prompt: only check citation support, then rely on extraction confidence.
Crucible report generation.
Guiding nuggets: most_common
Document source: nugget citations.
Nugget extraction prompt 'SupportedAnswerExtractorAll' on collection 'ragtime-mt'.
LLM: llama3.3-70b-instruct
Sentences are retained when their citations are supported according to argue_eval.
Abstractive summarization is used.
Only retain sentences that have an extraction confidence value >= 0.5, are not already selected (according to a stopped-and-stemmed match), and do not contain the expression 'source document'.
For each nugget, among the remaining sentence candidates, select the sentence with the highest extraction confidence (a minimal sketch of this selection logic follows this description).
Truncate the report to 250 words.
Created on 2025-08-20
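A minimal sketch of the sentence filtering and per-nugget selection steps above, assuming hypothetical Candidate records carrying an extraction confidence and a citation-support verdict (the stopword list and stemmer are crude stand-ins; this is not the actual Crucible code):

    from dataclasses import dataclass

    STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are"}

    def stem(token: str) -> str:
        # Crude stand-in for a real stemmer (e.g. Porter).
        for suffix in ("ing", "ed", "es", "s"):
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                return token[: -len(suffix)]
        return token

    def sentence_key(sentence: str) -> frozenset:
        # Stopped-and-stemmed bag of words, used to detect duplicates.
        tokens = (t.lower().strip(".,;:!?\"'") for t in sentence.split())
        return frozenset(stem(t) for t in tokens if t and t not in STOPWORDS)

    @dataclass
    class Candidate:
        text: str
        confidence: float          # extraction confidence from the LLM
        citations_supported: bool  # citation-support verdict (argue_eval)

    def build_report(nuggets: list[list[Candidate]], max_words: int = 250) -> str:
        selected: list[str] = []
        seen: set[frozenset] = set()
        for candidates in nuggets:  # one candidate list per guiding nugget
            viable = [
                c for c in candidates
                if c.citations_supported
                and c.confidence >= 0.5
                and "source document" not in c.text.lower()
                and sentence_key(c.text) not in seen
            ]
            if not viable:
                continue
            best = max(viable, key=lambda c: c.confidence)
            selected.append(best.text)
            seen.add(sentence_key(best.text))
        # Chop the concatenated report to the word budget.
        words = " ".join(selected).split()
        return " ".join(words[:max_words])

The dedup key deliberately ignores word order and stopwords, so near-identical sentences extracted from different documents map to the same key and are selected only once.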
- What other datasets or services (e.g. Google/Bing web search, ChatGPT, Perplexity, etc.) were used in producing the run?
- No external dataset was used, except that Claude was accessed via an LLM API. From the starter kit we used only the document ranking provided in the internal data file. We used up to 20 documents from the input document ranking 'llm_selected'.
- Briefly describe LLMs used for this run (optional)
- We primarily used Llama-3.3-70B-Instruct. Runs named 'clod' or 'cloch' used Claude 4 Sonnet for nugget generation.
- Please give this run a priority for inclusion in manual assessments.
- 2