TREC 2025 Proceedings
IDACCS_nugget_tb4.1
Submission Details
- Organization
- IDACCS
- Track
- RAG TREC Instrument for Multilingual Evaluation
- Task
- Report Generation Task
- Date
- 2025-08-20
Run Description
- Document collection
- ['English subset', 'Arabic subset', 'Chinese subset', 'Russian subset']
- Machine translation of documents
- ['Yes we used the organizer-provided machine translations']
- Write a short description of your retrieval process
- \item The organizers' search service was used to retrieve the top 30 documents, with the background and problem statement as the query.
\item We reranked these documents and kept the top 10, scoring 10-sentence chunks (with an overlap of 5 sentences) with \texttt{mxbai-rerank-large-v1} against a query generated by GPT-4o from the title, background, and problem statement.
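The chunking and reranking step above can be sketched as follows. This is a minimal illustration, not the submitted system: the function names, the `score` callable (a stand-in for a cross-encoder such as \texttt{mxbai-rerank-large-v1}), and the choice to score a document by its best chunk are all assumptions, since the description does not say how chunk scores were combined.

```python
from typing import Callable, Dict, List


def sentence_chunks(sentences: List[str], size: int = 10, overlap: int = 5) -> List[List[str]]:
    """Split a sentence list into windows of `size` sentences sharing `overlap` sentences."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(sentences), step):
        chunks.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break  # the last window already reaches the end of the document
    return chunks


def rerank_documents(
    query: str,
    docs: Dict[str, List[str]],
    score: Callable[[str, str], float],
    k: int = 10,
) -> List[str]:
    """Return the ids of the k best documents.

    Assumption: a document's score is the maximum score over its chunks.
    `score(query, chunk_text)` is a hypothetical relevance function.
    """
    doc_scores: Dict[str, float] = {}
    for doc_id, sentences in docs.items():
        chunk_scores = [score(query, " ".join(c)) for c in sentence_chunks(sentences)]
        doc_scores[doc_id] = max(chunk_scores, default=float("-inf"))
    return sorted(doc_scores, key=lambda d: doc_scores[d], reverse=True)[:k]
```

With 22 sentences and the defaults, the windows start at sentences 0, 5, 10, and 15, and the last window holds the remaining 7 sentences.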
- Write a short description of your generation process
- \item An \texttt{occams} extractive summary of twice the target length was produced, where the target length was 2500 for the 10000-long summaries (as the generation was done per language) and 4000 for the 2000-long summaries.
\item GPT-4.1, prompted to form ``nuggets'' not exceeding the target length, was used to generate the report.
\item Attribution was done using our ``blame'' semantic similarity method.
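The attribution step is described only as a semantic similarity method, so the following is a generic sketch of similarity-based attribution rather than the ``blame'' method itself. It assigns each report sentence to its most similar source document. The bag-of-words cosine similarity, the `threshold` value, and all function names are assumptions made for illustration; a real system would more likely compare dense sentence embeddings.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a crude stand-in for a sentence embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribute(
    report_sentences: List[str],
    sources: Dict[str, str],
    threshold: float = 0.2,  # hypothetical cutoff, not taken from the run description
) -> List[Optional[str]]:
    """For each report sentence, return the id of the most similar source,
    or None when no source reaches the threshold."""
    source_vecs = {doc_id: bag_of_words(text) for doc_id, text in sources.items()}
    citations: List[Optional[str]] = []
    for sentence in report_sentences:
        vec = bag_of_words(sentence)
        best_id, best_sim = None, 0.0
        for doc_id, source_vec in source_vecs.items():
            sim = cosine(vec, source_vec)
            if sim > best_sim:
                best_id, best_sim = doc_id, sim
        citations.append(best_id if best_sim >= threshold else None)
    return citations
```

Returning `None` below the threshold leaves a sentence uncited instead of forcing a weak match, which is one possible design choice among several.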
- Which LLM(s) were used by your system?
- GPT-4o, GPT-4.1
- Open repository link
- na
- Assessing priority
- 3