TEXT RETRIEVAL CONFERENCE (TREC) 2025
February 2025 - November 2025
Conducted by:
National Institute of Standards and Technology (NIST)
The Text Retrieval Conference (TREC) workshop series encourages research in information retrieval and related applications by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. Details about TREC can be found at the TREC web site, http://trec.nist.gov.
You are invited to participate in TREC 2025. TREC 2025 will consist of a set of tasks known as "tracks". Each track focuses on a particular subproblem or variant of the retrieval task as described below. Organizations may choose to participate in any or all of the tracks. Training and test materials are available from NIST for some tracks; other tracks will provide instructions for dataset download.
Dissemination of TREC work and results other than in the (publicly available) conference proceedings is welcomed, but the conditions of participation specifically preclude any advertising claims based on TREC results. All retrieval results submitted to NIST are published in the Proceedings and are archived on the TREC web site with the submitting organization identified.
TREC participants are added to the TREC Slack instance, which is the primary mode of communication for TREC. There is a general mailing list ([email protected]), but it is used for major announcements only. Some tracks have their own mailing lists, which you should follow if you are interested in those tracks.
Schedule
- As soon as possible -- submit your application to participate in TREC 2025. Go to ir.nist.gov/evalbase, create an account, and register your organization.
(The organization structure gives everyone on your team access as a group. You can also participate solo; you just need to create your own organization. If you participated before, you can reuse your organization, but you still need to register.)
Submitting an application will add you to Slack and the active participants' mailing list. On March 15th, NIST will announce a new password for the "active participants" portion of the TREC web site. We accept applications to participate until late May, but applying earlier means you can be involved in track discussions. Processing applications requires some manual effort on our end. Once your application is processed (at most a few business days), the "Welcome to TREC" email message with details about TREC participation will be sent to the email address provided in the application.
- June--August
Results submission deadline for most tracks. Specific deadlines for each track will be included in the track guidelines, which will be finalized in the spring.
- September 30 (estimated)
Relevance judgments and individual evaluation scores due back to participants.
- Nov 17--21
TREC 2025 in-person conference at NIST in Gaithersburg, MD, USA with a remote attendance option
Task Description
Below is a brief summary of the tasks. Complete descriptions of tasks performed in previous years are included in the Overview papers in each of the TREC proceedings (in the Publications section of the web site).
The exact definition of the tasks to be performed in each track for TREC 2025 is still being formulated. Track discussion takes place on the track mailing list (or other communication medium). To join the discussion, follow the instructions for the track as detailed below.
TRECVID, TREC's sister evaluation of multimedia understanding, and TAC, TREC's sister evaluation of NLP, have been folded back into TREC. TREC 2025 has 11 tracks: AVS, BioGen, Change Detection, iKAT, Million LLM, Product, RAG, RAGTIME, ToT, DRAGUN, and VQA.
Change Detection, RAGTIME, Million LLM, and VQA are new tracks. RAGTIME is the next iteration of NeuCLIR, DRAGUN is the 2025 iteration of Lateral Reading, and VQA is the next iteration of Video-to-Text.
Ad-hoc Video Search (AVS)
This track will evaluate video search engines on retrieving relevant video shots satisfying textual queries combining different facets such as people, actions, locations and objects.
The test dataset is V3C2 (Vimeo Creative Commons Collection), with a total of about 1.3 million video shots and an average video duration of about 9 minutes. The task will test systems on last year's queries as a progress set.
Anticipated timeline: submissions due end of July.
Track coordinators:
George Awad, NIST
Track Web Page: https://www-nlpir.nist.gov/projects/tv2025/avs.html
Biomedical Generative Retrieval (BioGen)
The track will evaluate technologies for biomedical document retrieval, specifically those with generative retrieval capabilities.
The document collection consists of literature abstracts from MEDLINE, the U.S. National Library of Medicine's bibliographic database, which contains over 30 million abstracts.
Anticipated timeline: TBD
Track coordinators:
Deepak Gupta, NIH
Dina Demner-Fushman, NIH
Steven Bedrick, Oregon Health & Science University
Bill Hersh, Oregon Health & Science University
Kirk Roberts, UTHealth
Track Web Page: https://dmice.ohsu.edu/trec-biogen/
Change Detection
This track models an expert user following a topic of interest over time. The interaction model follows an "inbox" or reading queue, with the goal of maximizing the importance and novelty of the items in the queue.
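
The exact interaction protocol and evaluation metrics will be defined in the track guidelines. Purely as a hedged illustration of the importance/novelty trade-off described above, the sketch below fills a reading queue with a greedy, MMR-style selection; the importance and similarity functions are hypothetical placeholders, not part of the track definition.

# Illustrative only: a greedy, MMR-style way to fill a reading queue so that
# selected items are both important and novel relative to what is already queued.
# The importance and similarity functions are hypothetical placeholders.

def fill_queue(candidates, importance, similarity, queue_size, lam=0.7):
    """Greedily pick items that balance importance against redundancy.

    candidates: list of documents (opaque objects)
    importance: doc -> float in [0, 1]
    similarity: (doc, doc) -> float in [0, 1]
    lam: weight on importance vs. novelty
    """
    queue = []
    remaining = list(candidates)
    while remaining and len(queue) < queue_size:
        def score(doc):
            redundancy = max((similarity(doc, q) for q in queue), default=0.0)
            return lam * importance(doc) - (1 - lam) * redundancy
        best = max(remaining, key=score)
        queue.append(best)
        remaining.remove(best)
    return queue

if __name__ == "__main__":
    # Toy example with word-overlap similarity and length-based "importance".
    docs = ["flood warning issued", "flood warning update", "new vaccine approved"]
    imp = lambda d: min(len(d) / 40.0, 1.0)
    sim = lambda a, b: len(set(a.split()) & set(b.split())) / len(set(a.split()) | set(b.split()))
    print(fill_queue(docs, imp, sim, queue_size=2))

In this toy run, the second "flood warning" item is skipped in favor of an unrelated item because its redundancy with the already-queued item outweighs its importance.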
Anticipated timeline: runs due end of September
Track coordinators:
Kristine Rogers
David Grossman
John Frank
Peter Gantz
Megan Niemczyk
Track web page: TBD
Detection, Retrieval, and Augmented Generation for Understanding News (DRAGUN)
The goal of this track is to support people's trustworthiness assessment of online news. There are two tasks: (1) Question Generation and (2) Report Generation. The Question Generation task focuses on identifying the critical questions readers should consider during their trustworthiness assessment. Those questions should guide readers' investigation into issues such as the source's bias or motivations and the narratives presented by other sources. Meanwhile, the Report Generation task involves creating a well-attributed and comprehensive report that provides readers with the background and context they need to perform a more informed trustworthiness evaluation. Both tasks run in parallel, with the same submission due date. This track differs from traditional fact-checking by aiming to assist readers in making their trustworthiness assessments from a neutral perspective, helping them to form their own judgments rather than dictating conclusions.
Anticipated timeline: runs due in mid-August.
Track coordinators:
Mark Smucker, University of Waterloo
Charlie Clarke, University of Waterloo
Dake Zhang, University of Waterloo
Track Web Page: https://trec-dragun.github.io
Interactive Knowledge Assistance Track (iKAT)
iKAT is about conversational information-seeking search. It is the successor to the Conversational Assistance Track (CAsT). TREC iKAT evolves CAsT to focus on supporting multi-path, multi-turn, multi-perspective conversations. That is, for a given topic, the direction and the conversation that evolves depend not only on the prior responses but also on the user. Users are modeled with a knowledge base of prior knowledge, preferences, and constraints.
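
The data formats will be specified in the track guidelines. As a purely hypothetical illustration of the kind of user model described above, the sketch below represents a user's knowledge base as free-text statements and selects the statements that look relevant to the current question; the field names and the filtering heuristic are assumptions for illustration only.

import re

# Hypothetical illustration of a per-user knowledge base of prior knowledge,
# preferences, and constraints; the real iKAT data format is defined by the track.
user_model = {
    "user_id": "example-user-001",
    "statements": [
        "I am a vegetarian.",
        "I live in a small apartment with no garden.",
        "I have already read an introductory article on composting.",
    ],
}

def relevant_statements(query, model):
    """Naive overlap filter: keep statements sharing a content word with the query."""
    tokenize = lambda text: set(re.findall(r"[a-z]+", text.lower()))
    stop = {"i", "a", "an", "the", "at", "on", "can", "with", "have", "am"}
    q_terms = tokenize(query) - stop
    return [s for s in model["statements"] if tokenize(s) & q_terms]

print(relevant_statements("Can I start composting at home?", user_model))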
Anticipated timeline: runs due mid-July
Track coordinators:
Mohammed Aliannejadi, University of Amsterdam
Zahra Abbasiantaeb, University of Amsterdam
Simon Lupart, University of Amsterdam
Shubham Chatterjee, Missouri University of Science and Technology
Jeff Dalton, University of Edinburgh
Leif Azzopardi, University of Strathclyde
Track Web Page: https://trecikat.com
Mailing list: Google group, name: trec_ikat
Million LLM
Imagine a future in which LLM-powered generative tools abound, specialized for every kind of use. Given a user's query and a set of LLMs, the task is to rank the LLMs on the basis of their ability to answer the query correctly.
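
The official task definition and submission format will come from the track guidelines. As a hedged sketch of the task interface only, a system must produce, for each query, an ordering over the candidate LLMs; the quality predictor below is an entirely hypothetical placeholder.

# Illustrative sketch of the Million LLM task interface: given a query and a set
# of candidate LLMs, rank the LLMs by their predicted ability to answer the query
# correctly. The predictor used here is a hypothetical placeholder.

from typing import Callable, Dict, List

def rank_llms(query: str,
              llm_ids: List[str],
              predict_quality: Callable[[str, str], float]) -> List[str]:
    """Return llm_ids sorted by descending predicted answer quality for the query."""
    return sorted(llm_ids, key=lambda llm: predict_quality(query, llm), reverse=True)

if __name__ == "__main__":
    # Toy predictor: prefer LLMs whose (hypothetical) domain tag appears in the query.
    domains: Dict[str, str] = {"llm-med": "medicine", "llm-law": "law", "llm-gen": ""}
    toy_predictor = lambda q, llm: 1.0 if domains[llm] and domains[llm] in q.lower() else 0.5
    print(rank_llms("What does case law say about fair use?", list(domains), toy_predictor))

A real system would replace the toy predictor with a learned router or with probes of each LLM's behavior.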
Anticipated timeline: runs due end of September
Track coordinators:
Evangelos Kanoulas, University of Amsterdam
Jamie Callan, Carnegie Mellon University
Panagiotis Eustratiadis, University of Amsterdam
Mark Anderson, RMIT
Track Web Page: TBD
Product Search and Recommendation Track
The product search track focuses on IR tasks in the world of product search and discovery. This track seeks to understand what methods work best for product search, improve evaluation methodology, and provide a reusable dataset which allows easy benchmarking in a public forum.
This year the track is expanding to include a recommendation task.
Anticipated timeline: Runs due end of August
Track coordinators:
Daniel Campos, University of Illinois at Urbana-Champaign
Corby Rosset, Microsoft
Surya Kallumadi, Lowe's
ChengXiang Zhai, University of Illinois at Urbana-Champaign
Sahiti Labhishetty, University of Illinois at Urbana-Champaign
Alessandro Magnani, Walmart
Track Web Page: https://trec-product-search.github.io/
Retrieval Augmented Generation (RAG)
The RAG track aims to enhance retrieval and generation effectiveness with a focus on varied information needs in an evolving world. Data sources will include a large corpus and topics that capture long-form definitional, list, and ambiguous information needs.
The track will involve two subtasks:
1. Retrieval Task: rank passages for the given queries.
2. RAG Task: generate answers with supporting passage attributions.
The second task is the primary focus of the track; a rough sketch of how the two subtasks fit together is given below.
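
The corpus, topics, and submission formats will be defined in the track guidelines. Purely as a hedged sketch of how the two subtasks relate, the pipeline below retrieves passages for a query (subtask 1) and then generates an answer while reporting which passages support it (subtask 2); the retriever, generator, and data structures are hypothetical placeholders rather than the official track format.

# Illustrative sketch only: the two RAG-track subtasks as a retrieve-then-generate
# pipeline. All components and data structures here are hypothetical placeholders.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Passage:
    doc_id: str
    text: str

def retrieval_task(query: str,
                   retriever: Callable[[str, int], List[Passage]],
                   k: int = 100) -> List[Passage]:
    """Subtask 1: return a ranked list of passages for the query."""
    return retriever(query, k)

def rag_task(query: str,
             retriever: Callable[[str, int], List[Passage]],
             generator: Callable[[str, List[Passage]], str],
             k: int = 5) -> dict:
    """Subtask 2: generate an answer and report which passages support it."""
    passages = retrieval_task(query, retriever, k)
    answer = generator(query, passages)
    return {"answer": answer, "supporting_passages": [p.doc_id for p in passages]}

if __name__ == "__main__":
    # Toy components so the sketch runs end to end.
    corpus = [Passage("d1", "TREC is run by NIST."),
              Passage("d2", "RAG combines retrieval and generation.")]
    toy_retriever = lambda q, k: sorted(
        corpus,
        key=lambda p: -len(set(q.lower().split()) & set(p.text.lower().split())))[:k]
    toy_generator = lambda q, ps: ps[0].text if ps else "No answer found."
    print(rag_task("What does RAG combine?", toy_retriever, toy_generator, k=2))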
Anticipated timeline: runs due end of July.
Track coordinators:
Shivani Upadhyay, University of Waterloo
Ronak Pradeep, University of Waterloo
Nandan Thakur, University of Waterloo
Jimmy Lin, University of Waterloo
Nick Craswell, Microsoft
Track Web Page: https://trec-rag.github.io/
RAG TREC Instrument for Multilingual Evaluation (RAGTIME) Track
In 2024, the NeuCLIR track piloted a Report Generation task, which you might think of as RAG for expert users and information analysts, as opposed to web searchers. The main task will be generating a report with citations, based on documents retrieved from a trilingual corpus of Russian, Chinese, and Arabic web news.
Anticipated timeline: Document collection available in February, submissions due in mid-to-late July.
Track coordinators:
Dawn Lawrie, Johns Hopkins University
Sean MacAvaney, University of Glasgow
James Mayfield, Johns Hopkins University
Paul McNamee, Johns Hopkins University
Douglas W. Oard, University of Maryland
Luca Soldaini, Allen Institute for AI
Eugene Yang, Johns Hopkins University
Track Web Page: TBD
Mailing list: Google group, name: neuclir-participants
Tip-of-the-Tongue Track
The Tip-of-the-Tongue (ToT) Track focuses on the known-item identification task where the searcher has previously experienced or consumed the item (e.g., a movie) but cannot recall a reliable identifier (i.e., "It's on the tip of my tongue..."). Unlike traditional ad-hoc keyword-based search, these information requests tend to be verbose, complex natural-language descriptions that employ a wide variety of search strategies (such as multi-hop reasoning), frequently express uncertainty, and often suffer from false memories.
The primary task is ToT known-item search in new domains, with ToT query elicitation (domains include landmarks, celebrities, movies, etc.).
Anticipated timeline: Results due August 31
Track coordinators:
Jaime Arguello, University of North Carolina
Bhaskar Mitra, Microsoft Research
Fernando Diaz, Carnegie Mellon University
To Eun Kim, Carnegie Mellon University
Maik Fröbe, Friedrich-Schiller-Universität Jena
Track Web Page: https://trec-tot.github.io/
Twitter: @TREC_ToT
Mastodon: @[email protected]
Video Question Answering (VQA)
The new Video QA track addresses question answering from multimedia sources. The goals of the new track include pushing multi-modal integration and complex reasoning. There will be a generation subtask and a multiple-choice subtask. The track will use either the V3C collection or a set of YouTube shorts.
Anticipated timeline: TBA
Track coordinators:
George Awad, NIST
Sanjay Purushotham, UMBC
Yvette Graham, UCD
Afzal Godil, NIST
Track Web Page: https://www-nlpir.nist.gov/projects/tv2025/vqa.html
Conference Format
The conference itself will be used as a forum both for presentation of results (including failure analyses and system comparisons), and for more lengthy system presentations describing retrieval techniques used, experiments run using the data, and other issues of interest to researchers in information retrieval. All groups will be invited to present their results in a joint poster session. Some groups may also be selected to present during plenary talk sessions.
Application Details
Organizations wishing to participate in TREC 2025 should respond to this call for participation by submitting an application. Participants in previous TRECs who wish to participate in TREC 2025 must submit a new application.
To apply, use the new Evalbase web app at http://ir.nist.gov/evalbase. First you will need to create an account and profile, then you can register a participating organization from the main Evalbase page.
Any questions about conference participation should be sent to the general TREC email address, trec (at) nist.gov.