IDACCS_nugget_4.1 — Report Generation Task

Document collection: ['English subset', 'Arabic subset', 'Chinese subset', 'Russian subset']
Machine translation of documents: ['Yes we used the organizer-provided machine translations']
Write a short description of your retrieval process: The following steps were done. 1. The organizers' server was used to retrieve the top 30 documents, using the background and problem statement as the query. 2. We reranked the documents to obtain the top 10 using mxbai-rerank-large-v1 on 10-sentence chunks with an overlap of 5, using a query generated by gpt-4o based on the title, background, and problem statement.
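The chunking in step 2 can be illustrated with a minimal sketch. It assumes the overlap of 5 is counted in sentences and that documents are already split into sentences; the function name `chunk_sentences` and its parameters are hypothetical and are not taken from the submitted system.

```python
def chunk_sentences(sentences, size=10, overlap=5):
    """Split a list of sentences into overlapping windows.

    Assumption: `size` and `overlap` are both counted in sentences,
    so consecutive chunks start `size - overlap` sentences apart.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(sentences[start:start + size])
        # Stop once a window has reached the end of the document,
        # so no chunk is fully contained in the previous one.
        if start + size >= len(sentences):
            break
    return chunks
```

Each chunk would then be scored against the query by the reranker, and a document could be ranked by, for example, its best-scoring chunk; that aggregation choice is also an assumption, since the description does not state it.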
Write a short description of your generation process: 3. An occams extractive summary of twice the target length was produced, where the target length was 2500 for the 10000-long summaries, since generation was done per language, and 4000 for the 2000-long summaries. 4. GPT-4.1, with a prompt to form "nuggets" not to exceed the target length, was used to generate the report. 5. Attribution was done using our "blame" semantic similarity method using a t5-base model.
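The general idea behind similarity-based attribution in step 5 can be sketched as follows. This is not the "blame" method itself, whose details are not given here; it is a generic nearest-source assignment by cosine similarity, assuming sentence embeddings (for instance from a t5-base encoder) have already been computed. The names `cosine` and `attribute` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def attribute(report_vecs, source_vecs):
    """For each report sentence vector, return (best source index, similarity).

    Assumption: `source_vecs` is non-empty and every vector comes from the
    same embedding model, so similarities are comparable.
    """
    results = []
    for r in report_vecs:
        scores = [cosine(r, s) for s in source_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results
```

In practice a similarity threshold would likely be needed so that weakly supported sentences are left uncited rather than attributed to the nearest source.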
Which LLM(s) were used by your system?: gpt-4o, gpt-4.1
Open repository link: na
Assessing priority: 1 (highest)