TREC 2025 Proceedings

AMU1ENG

Submission Details

Organization
AMU
Track
RAG TREC Instrument for Multilingual Evaluation
Task
Report Generation Task
Date
2025-08-21

Run Description

Document collection
['English subset', 'Arabic subset', 'Chinese subset', 'Russian subset']
Machine translation of documents
['Yes we used the organizer-provided machine translations']
Write a short description of your retrieval process
Retrieval ran entirely on a locally hosted vector database. Before insertion, each text was split into chunks of roughly 5 to 10 sentences, depending on its length; in edge cases, chunk size was capped at 8k characters. The chunks were embedded with the base BAAI/bge-m3 model using GPU acceleration. Retrieval used a standard similarity search ranked by cosine similarity.
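The chunking and ranking steps described above can be sketched roughly as follows. This is an illustrative outline only, not the submitted system: the regex sentence splitter, the fixed `sentences_per_chunk=8`, and the function names are assumptions, and the plain-Python cosine ranking stands in for the vector database's similarity search. Embedding with BAAI/bge-m3 is left out; `top_k` expects vectors that have already been computed.

```python
import math
import re

MAX_CHARS = 8000  # character cap mentioned in the run description


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_text(text, sentences_per_chunk=8, max_chars=MAX_CHARS):
    """Group sentences into chunks, hard-splitting any chunk longer than max_chars."""
    sentences = split_sentences(text)
    chunks = []
    for i in range(0, len(sentences), sentences_per_chunk):
        chunk = " ".join(sentences[i:i + sentences_per_chunk])
        # Edge case: very long sentences can push a chunk past the cap.
        for start in range(0, len(chunk), max_chars):
            chunks.append(chunk[start:start + max_chars])
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=15):
    """Return (index, score) pairs for the k chunk vectors most similar to the query."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The default `k=15` mirrors the number of chunks the run passed on to generation; a real vector database would typically use an approximate index rather than this exhaustive scan.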
Write a short description of your generation process
Generation used an external closed-source model accessed via API. For the report generation task, we wrote a dedicated prompt that produces the report in JSON format. We also evaluated open-source models ranging from 32B to 235B parameters, which showed promising results; because of time constraints, however, we completed the task with the closed-source model. The top 15 retrieved chunks were provided as input to the generation step.
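One way the generation step could be wired together is sketched below. The run's actual prompt and JSON schema are not public, so the prompt wording, the `{"sentences": [...]}` layout, and the helper names `build_prompt` and `parse_report` are all hypothetical. The API call itself is omitted; `parse_report` takes the raw string the model would return.

```python
import json

TOP_K = 15  # number of retrieved chunks passed to the model, per the run description


def build_prompt(request, chunks, top_k=TOP_K):
    """Assemble a prompt from (doc_id, text) pairs, best match first.

    The wording and output schema here are hypothetical placeholders.
    """
    context = "\n\n".join(f"[{doc_id}] {text}" for doc_id, text in chunks[:top_k])
    return (
        "Write a report answering the request using only the passages below.\n"
        'Reply with JSON: {"sentences": [{"text": "...", "citations": ["doc_id"]}]}\n\n'
        f"Request: {request}\n\nPassages:\n{context}"
    )


def parse_report(raw, allowed_ids):
    """Parse the model's JSON reply and drop citations to documents that were not supplied."""
    report = json.loads(raw)
    for sentence in report["sentences"]:
        sentence["citations"] = [
            c for c in sentence.get("citations", []) if c in allowed_ids
        ]
    return report
```

Filtering citations against the supplied document IDs is a defensive choice in this sketch, not something the run description states; it guards against the model citing a document it was never shown.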
Which LLM(s) were used by your system?
GPT-5 mini
Open repository link
not-public
Assessing priority
2

Evaluation Files

Paper