TREC 2025 Proceedings

IDACCS_nugget_tb4.1

Submission Details

Organization
IDACCS
Track
RAG TREC Instrument for Multilingual Evaluation
Task
Report Generation Task
Date
2025-08-20

Run Description

Document collection
['English subset', 'Arabic subset', 'Chinese subset', 'Russian subset']
Machine translation of documents
['Yes we used the organizer-provided machine translations']
Write a short description of your retrieval process
\item The organizers' search service was used to retrieve the top 30 documents, using the background and problem statement as the query. \item We reranked these documents to obtain the top 10 using \texttt{mxbai-rerank-large-v1} on 10-sentence chunks with an overlap of 5 sentences, with a query generated by GPT-4o from the title, background, and problem statement.
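The chunk-and-rerank step described above can be sketched as follows. This is a minimal illustration, not the submitted system: the function names, the `overlap_score` stand-in for the reranker, and the choice to score each document by its best-scoring chunk are all assumptions, since the description does not say how chunk scores were aggregated per document.

```python
from typing import Callable, Dict, List, Sequence


def sentence_chunks(sentences: Sequence[str], size: int = 10, overlap: int = 5) -> List[List[str]]:
    """Split a sentence list into windows of `size` sentences sharing `overlap` sentences."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    stride = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(sentences), stride):
        chunks.append(list(sentences[start:start + size]))
        if start + size >= len(sentences):
            break  # the last window already reaches the end of the document
    return chunks


def overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a reranker score: count of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


def rerank(
    docs: Dict[str, List[str]],
    query: str,
    score_fn: Callable[[str, str], float] = overlap_score,
    top_k: int = 10,
) -> List[str]:
    """Rank documents by their best chunk score (assumed aggregation) and keep top_k ids."""
    scored = []
    for doc_id, sentences in docs.items():
        chunks = sentence_chunks(sentences)
        best = max((score_fn(query, " ".join(c)) for c in chunks), default=float("-inf"))
        scored.append((best, doc_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]
```

In a real pipeline, `score_fn` would wrap a cross-encoder reranker such as the one named in the description; the toy scorer only keeps the sketch self-contained.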
Write a short description of your generation process
\item An \texttt{occams} extractive summary of twice the target length was produced, where the target length was 2500 for the 10000-long summaries (as the generation was done per language) and 4000 for the 2000-long summaries. \item GPT-4.1 was used to generate the report, with a prompt to form ``nuggets'' not to exceed the target length. \item Attribution was done using our ``blame'' semantic similarity method.
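The description does not give details of the ``blame'' method, so the following is only a generic sketch of similarity-based attribution, not the authors' implementation. It assumes a bag-of-words cosine similarity in place of a semantic model, and the names `bow`, `cosine`, `attribute`, and the `min_sim` threshold are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional, Tuple


def bow(text: str) -> Counter:
    """Lowercased bag-of-words vector (assumed stand-in for a semantic embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute(
    report_sentences: List[str],
    sources: Dict[str, str],
    min_sim: float = 0.2,  # hypothetical cutoff below which no citation is assigned
) -> List[Tuple[str, Optional[str]]]:
    """Pair each report sentence with the id of its most similar source, or None."""
    vectors = {source_id: bow(text) for source_id, text in sources.items()}
    attributed = []
    for sentence in report_sentences:
        vec = bow(sentence)
        best_id, best_sim = None, 0.0
        for source_id, source_vec in vectors.items():
            sim = cosine(vec, source_vec)
            if sim > best_sim:
                best_id, best_sim = source_id, sim
        attributed.append((sentence, best_id if best_sim >= min_sim else None))
    return attributed
```

A sentence with no sufficiently similar source is left uncited rather than forced onto the nearest document.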
Which LLM(s) were used by your system?
GPT-4o, GPT-4.1
Open repository link
na
Assessing priority
3

Evaluation Files

Paper