Call to TREC 2024


TREC Statement on Product Testing and Advertising



February 2024 — November 2024

Conducted by:
National Institute of Standards and Technology (NIST)

The Text Retrieval Conference (TREC) workshop series encourages research in information retrieval and related applications by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. Details about TREC can be found at the TREC web site,

You are invited to participate in TREC 2024. TREC 2024 will consist of a set of tasks known as "tracks". Each track focuses on a particular subproblem or variant of the retrieval task as described below. Organizations may choose to participate in any or all of the tracks. Training and test materials are available from NIST for some tracks; other tracks will provide instructions for dataset download.

Dissemination of TREC work and results other than in the (publicly available) conference proceedings is welcomed, but the conditions of participation specifically preclude any advertising claims based on TREC results. All retrieval results submitted to NIST are published in the Proceedings and are archived on the TREC web site with the submitting organization identified.

TREC participants are added to the TREC Slack instance, and the primary mode of communication in TREC is Slack. There is also a general mailing list, but it is used for major announcements only. Some tracks have their own mailing lists, which you should follow if you are interested in those tracks.


As soon as possible — submit your application to participate in TREC 2024 as described below.

Submitting an application will add you to the active participants' mailing list. On March 1st, NIST will announce a new password for the "active participants" portion of the TREC web site.

We accept applications to participate until late May, but applying earlier means you can be involved in track discussions. Processing applications requires some manual effort on our end. Once your application is processed (at most a few business days), the "Welcome to TREC" email message with details about TREC participation will be sent to the email address provided in the application.

July — August
Results submission deadline for most tracks. Specific deadlines for each track will be included in the track guidelines, which will be finalized in the spring.

September 30 (estimated)
Relevance judgments and individual evaluation scores due back to participants.

Nov 18 — 22
TREC 2024 conference at NIST in Gaithersburg, MD, USA.

Task Descriptions

Below is a brief summary of the tasks. Complete descriptions of tasks performed in previous years are included in the Overview papers in each of the TREC proceedings (in the Publications section of the web site).

The exact definition of the tasks to be performed in each track for TREC 2024 is still being formulated. Track discussion takes place on the track mailing list (or other communication medium). To join the discussion, follow the instructions for the track as detailed below.

TRECVID, TREC's sister evaluation of multimedia understanding, and TAC, TREC's sister evaluation of NLP, have been folded back into TREC. TREC 2024 thus has fourteen tracks. Eight of the tracks are continuing: AToMiC, AVS, iKAT, MedVidQA, NeuCLIR, Product, ToT, and VTT. The six new tracks this year are BioGen, CCU, Lateral Reading, PLABA, RAG, and RUFEERS.

There are two groups of tracks with a somewhat common focus. AToMiC, AVS, MedVidQA, Product, and VTT are all multimedia tasks. iKAT, NeuCLIR, BioGen, Lateral Reading, PLABA, and RAG all have a generative aspect to the task. (The remaining three tracks, ToT, CCU, and RUFEERS, defy categorization.)

Activities in Extended Video (ActEV)

The main focus of the Activities in Extended Video (ActEV) challenge is human activity detection in multi-camera video streams. Activity detection has been an active research area in computer vision in recent years. The ActEV evaluation is being conducted to assess the robustness of automatic activity detection in a multi-camera streaming video environment. The evaluation is based on a portion of the MEVA Known Facility (KF) datasets and is run as a self-reported leaderboard evaluation covering 20 activities. For both the Activity and Object Detection (AOD, primary task) and Activity Detection (AD) tasks, submitted results are measured by Probability of Missed Detection (Pmiss) at a Rate of Fixed False Alarm (RateFA) of 0.1 (Pmiss@0.1RFA).
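As a rough illustration of how a measure such as Pmiss@0.1RFA can be computed, the Python sketch below sweeps a confidence threshold and reports the lowest miss probability reachable while the false-alarm rate stays at or below the target. It is not the official ActEV scorer. It assumes that every system detection has already been aligned to the reference annotations and marked as correct or as a false alarm, and that the false-alarm rate is counted per minute of video. The function name `pmiss_at_rfa` and its parameters are hypothetical.

```python
from typing import List, Tuple


def pmiss_at_rfa(
    detections: List[Tuple[float, bool]],
    num_reference: int,
    duration_minutes: float,
    target_rfa: float = 0.1,
) -> float:
    """Lowest Pmiss reachable with a false-alarm rate <= target_rfa.

    detections: (confidence, is_correct) pairs, assumed to be already
        aligned to the reference annotations (an assumption of this sketch).
    num_reference: number of reference activity instances.
    duration_minutes: total video duration; the false-alarm rate is
        assumed to be false alarms per minute.
    """
    if num_reference <= 0 or duration_minutes <= 0:
        raise ValueError("num_reference and duration_minutes must be positive")

    max_false_alarms = target_rfa * duration_minutes
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    correct = 0
    false_alarms = 0
    best_correct = 0
    i = 0
    while i < len(ranked):
        # Lowering the threshold to this score admits every detection
        # that shares the score, so tied detections are added together.
        score = ranked[i][0]
        j = i
        while j < len(ranked) and ranked[j][0] == score:
            if ranked[j][1]:
                correct += 1
            else:
                false_alarms += 1
            j += 1
        if false_alarms > max_false_alarms:
            break  # this threshold exceeds the allowed false-alarm rate
        best_correct = correct
        i = j

    return 1.0 - best_correct / num_reference


if __name__ == "__main__":
    # Toy data: five detections over 15 minutes of video, four reference instances.
    toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
    print(pmiss_at_rfa(toy, num_reference=4, duration_minutes=15.0))
```

With the toy data, at most 1.5 false alarms are allowed, so the sweep stops before the second false alarm and three of the four reference instances count as detected.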

Anticipated timeline:
TRECVID ActEV 2024 test dataset release: March 15, 2024
ActEV SRL Challenge Opens: May 10, 2024
Deadline for ActEV SRL Challenge results submission: September 15, 2024, 4:00 PM EST

Track coordinators:
Jonathan Fiscus, NIST
Afzal Godil, NIST

Track Web Page:
Contact Email :

Ad-hoc Video Search (AVS)

This track will evaluate video search engines on retrieving relevant video shots satisfying textual queries combining different facets such as people, actions, locations and objects.

The testing dataset is V3C2 (Vimeo Creative Commons), with a total of 1.3 million video shots and an average duration of about 9 minutes. The task will test systems on 20 new queries and 20 progress queries (fixed from 2022 to 2024), for a total of 40 queries.

Anticipated timeline: submissions due end of July.

Track coordinators:
Georges Quenot, LIG
George Awad, NIST

Track Web Page:

AToMiC Track

The Authoring Tools for Multimedia Content (AToMiC) Track aims to build reliable benchmarks for multimedia search systems. The focus of this track is to develop and evaluate IR techniques for text-to-image and image-to-text search problems.

Anticipated timeline: Document (image and text) collection available in January, evaluation topics in June, final submissions in July.

Track coordinators:
Jheng-Hong (Matt) Yang, University of Waterloo
Carlos Lassance, Naver Labs Europe
Mariya Hendriksen, University of Amsterdam
Thong Nguyen, University of Amsterdam
Andrew Yates, University of Amsterdam
Krishna Srinivasan, Google Research
Miriam Redi, Wikimedia Foundation
Jimmy Lin, University of Waterloo

Track Web Page:
Mailing list: Google group, name: atomic-participants

Biomedical Generative Retrieval (BioGen)

The track will evaluate technologies for biomedical document retrieval, specifically those with generative retrieval capabilities.

The track will use documents including literature abstracts from the U.S. National Library of Medicine (MEDLINE), which holds over 30 million abstracts.

Anticipated timeline: TBD

Track coordinators:
Kirk Roberts, UTHealth
Dina Demner-Fushman, NIH
Steven Bedrick, Oregon Health & Science University
Bill Hersh, Oregon Health & Science University

Track Web Page: TBD

Computational Cultural Understanding (CCU)

This new track focuses on the detection of sociocultural norms in video recordings of naturally occurring interactions between two or more people conversing in Mandarin Chinese. Successful communication entails not only knowing the local language but also understanding the local cultures and customs. Violation of cultural norms may derail a conversation and lead to disastrous consequences. As such, detecting social norms and determining if a speaker is adhering to or violating them are foundational components in dialogue assistance applications to facilitate successful communication between individuals who do not speak a common language and are not familiar with each other’s culture.

The evaluation will employ a set of about 2500 video recordings of people conversing with each other in Mandarin Chinese.

Anticipated timeline: runs due mid-September 2024

Track coordinators:
Audrey Tong, NIST
Jonathan Fiscus, NIST
Leora Morgenstern, SRI
Stephanie Strassel, LDC

Track Web Page:

Interactive Knowledge Assistance Track (iKAT)

iKAT focuses on conversational information seeking. It is the successor to the Conversational Assistance Track (CAsT). TREC iKAT evolves CAsT to focus on supporting multi-path, multi-turn, multi-perspective conversations. That is, for a given topic, the direction a conversation takes depends not only on the prior responses but also on the user. Users are modeled with a knowledge base of prior knowledge, preferences, and constraints.

Anticipated timeline: runs due August 15

Track coordinators:
Mohammad Aliannejadi, University of Amsterdam
Zahra Abbasiantaeb, University of Amsterdam
Shubham Chatterjee, University of Glasgow
Jeff Dalton, University of Glasgow

Track Web Page:
Mailing list: Google group, name: trec-ikat

Lateral Reading

Detection of misinformation and its evaluation is constrained by the need to define truth. Lateral reading is a process people can use to determine the trustworthiness of information through asking questions about document sources and evidence and seeking answers via search engines.

This track's goal is to develop technologies to support and encourage the use of lateral reading. The track will organize 2 subtasks:

  1. Question Generation (given 50 news articles from ClueWeb22B-English, produce 20 questions for each article)
  2. Document Retrieval (given ClueWeb22B-English and 50 news articles with pooled questions from task 1, return top-10 documents/passages as sources of answers)

Anticipated timeline: runs due end of June/early July.

Track coordinators:
Mark Smucker, University of Waterloo
Charlie Clarke, University of Waterloo
Dake Zhang, University of Waterloo

Track Web Page:

Medical Video Question Answering (MedVidQA)

This track uses a corpus of medical instructional videos to evaluate systems on retrieving and localizing visual answers, given a medical query and a collection of videos.

Another subtask involves generating a step-by-step textual summary of the visual instructional segment, given a medical video and a query.

Anticipated timeline: runs due in mid-August.

Track coordinators:
Deepak Gupta, NIH
Dina Demner-Fushman, NIH

Track Web Page:

NeuCLIR Track

Cross-language Information Retrieval (CLIR) has been studied at TREC and subsequent evaluation forums for more than twenty years, but recent advances in neural approaches to information retrieval (IR) warrant a new, large-scale effort that will enable exploration of classical and modern IR techniques for this task.

The task will support 3 languages (Russian, Chinese, and Farsi) and evaluate 3 subtasks:

  1. CLIR/MLIR Tasks: Given an English topic, return a ranking of Russian, Chinese, and Farsi news documents (MT to English provided by organizers).
  2. Technical Docs Task: Given an English topic, return a ranking of Chinese technical abstracts.
  3. Report Generation Pilot (2024): Generate a multi-paragraph report in English based on information supported in the Chinese, Russian, or Farsi document collection.

Anticipated timeline: Document collection available in January, submissions due in August.

Track coordinators:
Dawn Lawrie, Johns Hopkins University
Sean MacAvaney, University of Glasgow
James Mayfield, Johns Hopkins University
Paul McNamee, Johns Hopkins University
Douglas W. Oard, University of Maryland
Luca Soldaini, Allen Institute for AI
Eugene Yang, Johns Hopkins University

Track Web Page:
Mailing list: Google group, name: neuclir-participants

Plain Language Adaptation of Biomedical Abstracts (PLABA)

The goal of the PLABA track is to improve health literacy by adapting biomedical abstracts for the general public using plain language.

Anticipated timeline: TBD

Track coordinators:
Brian Ondov, NIH
Dina Demner-Fushman, NIH
Hoa Dang, NIST

Track Web Page:

Product Search Track

The product search track focuses on IR tasks in the world of product search and discovery. This track seeks to understand what methods work best for product search, improve evaluation methodology, and provide a reusable dataset which allows easy benchmarking in a public forum.

The task for systems will be: given a search query, find relevant products. The documents come from an Amazon product corpus (title, description, etc.). The goal this year is to encourage multimodal queries (image and text).

Anticipated timeline: Runs due end of June/early July

Track coordinators:
Daniel Campos, University of Illinois at Urbana-Champaign
Corby Rosset, Microsoft
Surya Kallumadi, Lowe's
ChengXiang Zhai, University of Illinois at Urbana-Champaign
Sahiti Labhishetty, University of Illinois at Urbana-Champaign
Alessandro Magnani, Walmart

Track Web Page: Product Search track web page

Retrieval Augmented Generation (RAG)

The RAG track aims to enhance retrieval and generation effectiveness to focus on varied information needs in an evolving world. Data sources will include a large corpus and topics that capture long-form definitions, list, and ambiguous information needs.

The track will involve 2 subtasks:

  1. Retrieval Task: Rank passages for a given query
  2. RAG Task: Generate answers with supporting attributed passages

The second task is the primary focus of the track.

Anticipated timeline: runs due end of July.

Track coordinators:
Ronak Pradeep, University of Waterloo
Nandan Thakur, University of Waterloo
Jimmy Lin, University of Waterloo
Nick Craswell, Microsoft

Track Web Page:

Recognizing Ultra Fine-grained Entities, Events, and Relations (RUFEERS)

The RUFEERS track will evaluate IE systems that identify entities, events, and relations of interest, including the roles (if any) that the entities play in the events and relations.

The track will evaluate systems on three tasks. Given an ontology and a set of Washington Post articles:

  • Task 1: Extract one mention of each event, relation, and argument from each document
  • Task 2: Extract all mentions of events, relations, and their arguments from each document
  • Task 3: Extract all mentions of each entity from each document

Anticipated timeline: deadline will likely be mid-September.

Track coordinators:
Shudong Huang, NIST
Hoa Dang, NIST

Track Web Page:

Tip-of-the-Tongue Track

The Tip-of-the-Tongue (ToT) Track focuses on the known-item identification task where the searcher has previously experienced or consumed the item (e.g., a movie) but cannot recall a reliable identifier (i.e., "It's on the tip of my tongue..."). Unlike traditional ad-hoc keyword-based search, these information requests tend to be natural-language, verbose, and complex. They contain a wide variety of search strategies, such as multi-hop reasoning, and frequently express uncertainty and suffer from false memories.

The track will run 2 subtasks:

  1. Movie identification (230k Wikipedia corpus, given a ToT query, rank Wikipedia articles)
  2. ToT known-item search for new domains with ToT query elicitation (domains include landmarks, celebrities, recipes, objects, etc.)

Anticipated timeline: Results due August 31

Track coordinators:
Jaime Arguello, University of North Carolina
Samarth Bhargav, University of Amsterdam
Bhaskar Mitra, Microsoft Research
Fernando Diaz, Google
Evangelos Kanoulas, University of Amsterdam

Track Web Page:
Twitter: @TREC_ToT

Video-to-Text (VTT)

The Video-to-Text track aims to evaluate video captioning systems. Given a short video clip about 10 seconds long, systems should generate a one-sentence description that covers four important facets, as applicable:

  • Who: is in the video
  • What: are they doing
  • Where: are they doing it
  • When: are they doing it

The testing data will comprise about 2000 clips, with a subtask to measure the robustness of systems against real-world noise (e.g., camera shake).

Anticipated timeline: runs due end of May/early June

Track coordinators:
Asad Butt, Johns Hopkins University
George Awad, NIST
Yvette Graham, UCD
Afzal Godil, NIST

Track Web Page:

Conference Format

The conference itself will be used as a forum both for presentation of results (including failure analyses and system comparisons) and for lengthier system presentations describing retrieval techniques used, experiments run using the data, and other issues of interest to researchers in information retrieval. All groups will be invited to present their results in a joint poster session. Some groups may also be selected to present during plenary talk sessions.

Application Details

Organizations wishing to participate in TREC 2024 should respond to this call for participation by submitting an application. Participants in previous TRECs who wish to participate in TREC 2024 must submit a new application.

To apply, use the new Evalbase web app. First, create an account and profile; then register a participating organization from the main Evalbase page.

Any questions about conference participation should be sent to the general TREC email address, trec (at)

The TREC Conference series is sponsored by NIST's
Information Technology Laboratory (ITL)
Information Access Division (IAD)
Retrieval Group

Last updated: Thursday, 22-Feb-2024 07:01:12 MST
Date created: January 31, 2024
