TREC 2025 Proceedings
AMU1ML
Submission Details
- Organization
- AMU
- Track
- RAG TREC Instrument for Multilingual Evaluation
- Task
- Report Generation Task
- Date
- 2025-08-21
Run Description
- Document collection
- English subset, Arabic subset, Chinese subset, Russian subset
- Machine translation of documents
- None
- Write a short description of your retrieval process
- The entire retrieval phase ran on a locally hosted vector database. When texts were inserted into the database, they were split into chunks of approximately 5 to 10 sentences, depending on their length; in edge cases, the maximum chunk size was capped at 8k characters. The chunks were then embedded with the base model BAAI/bge-m3 using GPU acceleration. Retrieval used a standard cosine-similarity search, with no reranking step.
- Write a short description of your generation process
- Generation used an external closed-source model accessed via API. For the report generation task, we wrote a dedicated prompt that produces a report in JSON format. The top 15 retrieved chunks were provided as input. During development, we also evaluated open-source models ranging from 32B to 235B parameters, which showed promising results; due to time constraints, however, we completed the task with the closed-source model.
- Which LLM(s) were used by your system?
- GPT5-mini
- Open repository link
- not-public
- Assessing priority
- 1 (highest)
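The retrieval process described above (sentence-based chunking with a character cap, followed by cosine-similarity search over the top 15 chunks) can be illustrated with a minimal sketch. This is not the submitted system: the sentence splitter, the value of `SENTENCES_PER_CHUNK`, and all function names are assumptions made for illustration, and `toy_embed` is a character-trigram stand-in for the BAAI/bge-m3 embeddings and the vector database used in the actual run.

```python
import math
import re
from collections import Counter

MAX_CHARS = 8000          # cap from the run description ("8k characters")
SENTENCES_PER_CHUNK = 8   # assumption: one value inside the stated 5-10 range
TOP_K = 15                # number of chunks passed on to the generator


def split_sentences(text):
    """Naive splitter (assumption): the run description does not name one."""
    # Split after Latin/Arabic terminators followed by whitespace,
    # or directly after CJK terminators, which are not followed by spaces.
    parts = re.split(r"(?<=[.!?؟])\s+|(?<=[。！？])", text)
    return [p.strip() for p in parts if p.strip()]


def chunk_text(text, per_chunk=SENTENCES_PER_CHUNK, max_chars=MAX_CHARS):
    """Group sentences into chunks, never exceeding max_chars per chunk."""
    chunks, current = [], []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        # Close the current chunk when it is full or would grow past the cap.
        if current and (len(current) >= per_chunk or len(candidate) > max_chars):
            chunks.append(" ".join(current))
            current = []
        # A single over-long sentence is truncated to the cap (edge case).
        current.append(sentence[:max_chars])
    if current:
        chunks.append(" ".join(current))
    return chunks


def toy_embed(text):
    """Stand-in embedding: sparse character-trigram counts (NOT bge-m3)."""
    text = text.lower()
    return Counter(text[i:i + 3] for i in range(max(len(text) - 2, 1)))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=TOP_K):
    """Rank chunks by cosine similarity to the query; no reranking."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

In a real pipeline, `toy_embed` would be replaced by dense vectors from the embedding model and `retrieve` by the vector database's own similarity search; the chunking and top-k logic would stay structurally the same.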
Evaluation Files
Paper