lg_e2_3q5r3l — Report Generation Task

Submission Details

Organization: hltcoe-multiagt
Track: RAG TREC Instrument for Multilingual Evaluation
Task: Report Generation Task
Date: 2025-08-21

Run Description

Document collection: ['English subset', 'Arabic subset', 'Chinese subset', 'Russian subset']
Machine translation of documents: ['Yes we used the organizer-provided machine translations']
Write a short description of your retrieval process: Reciprocal rank fusion over multilingual retrieval using PLAID-X, multilingual LSR, and Qwen3 single dense vectors with ANN search.
Write a short description of your generation process: This run uses the LangGraph framework. In each round, the approach produces a set of 3 queries. Similarity to the query, based on Qwen3 embeddings, is used to select snippets of the machine-translated documents, from which a document-based report is generated. The top 5 documents for each query are included. The snippets from documents retrieved with a single query are used to generate a partial answer. Partial answers from all queries are examined for completeness. If the answer is deemed incomplete, up to 3 new queries are produced to fill knowledge gaps. After at most 3 rounds, an answer is drafted and then shortened to fit the length limit. Each citation is checked against the native document to confirm that it supports the sentence. Unfaithful citations are removed. If a substitute can be found, another document is used instead. Otherwise, the sentence is removed. All generation is based on machine-translated documents.
Which LLM(s) were used by your system?: Retrieval: XLMR and Qwen3-8B-Embedding; Generation: Qwen3-8B-Embedding and Llama-3.3-70B-Instruct
Open repository link: No
Assessing priority: 6
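
Illustration of the retrieval fusion step: the retrieval description names reciprocal rank fusion over three retrievers. The sketch below shows the standard form of that technique on toy ranked lists. It is not the run's actual code. The constant k=60 is an assumption (a commonly used default, not stated in the run description), and the document ids and the three list names are hypothetical placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=None):
    """Fuse ranked lists of document ids.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with ranks starting at 1. k=60 is an assumed default, not a value taken
    from the run description.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document id for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n] if top_n is not None else fused


# Hypothetical ranked lists standing in for the three retrievers.
plaidx_ranking = ["d3", "d1", "d2"]
lsr_ranking = ["d1", "d3", "d4"]
dense_ranking = ["d1", "d2", "d3"]

fused = reciprocal_rank_fusion([plaidx_ranking, lsr_ranking, dense_ranking])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because the fusion uses only ranks, the three retrievers do not need comparable score scales.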

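Illustration of the generation loop: the generation description outlines rounds of query proposal, per-query partial answers, and a completeness check, capped at 3 rounds. The sketch below shows only that control flow in plain Python. It does not use the LangGraph API, and it omits the shortening and citation-checking steps. Every function name and signature here is a hypothetical placeholder; the callables stand in for LLM and retrieval components that are not specified in this record.

```python
from typing import Callable, List


def iterative_report(
    topic: str,
    propose_queries: Callable[[str, List[str], int], List[str]],
    retrieve: Callable[[str, int], List[str]],
    partial_answer: Callable[[str, List[str]], str],
    is_complete: Callable[[str, List[str]], bool],
    draft: Callable[[str, List[str]], str],
    queries_per_round: int = 3,
    docs_per_query: int = 5,
    max_rounds: int = 3,
) -> str:
    """Run up to max_rounds of query -> retrieve -> partial answer, then draft."""
    partials: List[str] = []
    for _ in range(max_rounds):
        # Ask for at most queries_per_round queries, given what is known so far.
        queries = propose_queries(topic, partials, queries_per_round)
        for query in queries[:queries_per_round]:
            # Keep snippets from at most docs_per_query documents per query.
            snippets = retrieve(query, docs_per_query)[:docs_per_query]
            partials.append(partial_answer(query, snippets))
        # Stop early once the partial answers are judged complete.
        if is_complete(topic, partials):
            break
    return draft(topic, partials)


# Toy stand-ins (assumptions) so the control flow can be exercised end to end.
calls = {"rounds": 0}


def propose(topic: str, partials: List[str], n: int) -> List[str]:
    calls["rounds"] += 1
    return [f"{topic} q{calls['rounds']}.{i}" for i in range(n)]


report = iterative_report(
    "topic",
    propose_queries=propose,
    retrieve=lambda q, n: [f"snippet for {q} #{i}" for i in range(n)],
    partial_answer=lambda q, s: f"{q}: {len(s)} snippets",
    is_complete=lambda t, p: len(p) >= 6,
    draft=lambda t, p: " | ".join(p),
)
```

With these toy stand-ins the loop stops after the second round, once six partial answers have accumulated.
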
Evaluation Files

lg_e2_3q5r3l.autoargue (autoargue)
lg_e2_3q5r3l.almost-human-judgments.tsv (almost-human-judgments.tsv)
lg_e2_3q5r3l.almost-human-scores.tsv (almost-human-scores.tsv)
lg_e2_3q5r3l.autoargue-scores.tsv (autoargue-scores.tsv)
lg_e2_3q5r3l.autoargue-judgments.jsonl (autoargue-judgments.jsonl)