TREC 2024 (33rd Text REtrieval Conference)

Runtag	Org	Is this run manual or automatic?	Briefly describe this run	What other datasets were used in producing the run?	Please give this run a priority for inclusion in manual assessments.
baseline_splade_fq_fd (trec_eval)	h2oloo	automatic	Baseline run using SPLADE CoCondenser EnsembleDistil with full queries and the full database	MSMARCO	3
baseline_splade_sq_fd (trec_eval)	h2oloo	automatic	Baseline run using SPLADE CoCondenser EnsembleDistil with short queries and the full database	MSMARCO	3
baseline_clip_fq_fd (trec_eval)	h2oloo	automatic	Baseline run using the multimodal CLIP (laion/CLIP-ViT-L-14-laion2B-s32B-b82K) with full queries and full database	MSCOCO Flickr_30k	3
baseline_blip_fq_fd (trec_eval)	h2oloo	manual	Baseline run with Salesforce/blip-itm-large-coco using full queries and the full database	COCO	3
baseline_bm25_fq_fd (trec_eval)	h2oloo	automatic	Baseline run using pyserini BM25 with the full query on the full database	none	3
baseline_blip_clip_fq_fd (trec_eval)	h2oloo	automatic	Baseline run combining blip and clip with RRF.	COCO Flickr datasets that trained both clip and blip	3
baseline_bm25_clip_fq_fd (trec_eval)	h2oloo	automatic	Baseline run combining BM25 and CLIP using RRF	Flickr COCO datasets used to train CLIP	3
baseline_splade_clip_fq_fd (trec_eval)	h2oloo	automatic	Baseline run combining SPLADE and CLIP using RRF	MSMARCO FLICKR COCO datasets that trained CLIP	3
baseline_bm25_splade_fq_fd (trec_eval)	h2oloo	automatic	Baseline run combining BM25 and SPLADE using RRF	MSMARCO	3
baseline_bm25_blip_splade_fq_fd (trec_eval)	h2oloo	automatic	Baseline run combining BM25, BLIP and SPLADE with RRF	MSMARCO COCO FLICKR datasets that trained BLIP	3
baseline_jina_clip_v1_fq_fd.trec (trec_eval)	h2oloo	automatic	Baseline run with the Jina CLIP model using full queries and the full dataset	The ones used to train Jina	3
siglip_dense (trec_eval)	IRLab-Amsterdam	automatic	SigLIP dense vector representations of the text and query (page title, section title, section description, categories) following the code snippet at https://huggingface.co/timm/ViT-SO400M-14-SigLIP-384	None; SigLIP was used zero-shot	1 (top)
siglip_lsr (trec_eval)	IRLab-Amsterdam	automatic	SigLIP sparse representations of images (page title, section title, section description, categories)	The text (query) encoder was trained on MS MARCO, and the image encoder was trained on AToMiC data	2
baseline_jina_clip_v1_sq_fq (trec_eval)	h2oloo	automatic	Baseline run with the Jina CLIP V1 model using short queries and the full database	Datasets used to train jina CLIP	3
baseline_blip_sq_fq (trec_eval)	h2oloo	automatic	Baseline run with Salesforce/blip-itm-large-coco using short queries and the full database	COCO	3
baseline_bm25_blip_splade_sq_fd (trec_eval)	h2oloo	automatic	Baseline run combining BM25, BLIP and SPLADE with RRF using short queries for blip and splade	MSMARCO COCO FLICKR datasets that trained BLIP	3
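
Several of the runs above combine rankers "using RRF" (reciprocal rank fusion). The run descriptions do not state the fusion parameters, so the following is only a minimal sketch of the general technique, not the participants' implementation. The constant k=60 is an assumption (it is the value commonly used in the RRF literature), and the function name and image ids are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list of (id, score) pairs.

    Each list contributes 1 / (k + rank) for every id it contains, with
    ranks starting at 1. k=60 is an assumed default, not taken from the runs.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical top-3 results for one query from two rankers.
bm25_run = ["img_a", "img_b", "img_c"]
clip_run = ["img_b", "img_d", "img_a"]

fused = reciprocal_rank_fusion([bm25_run, clip_run])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.6f}")
```

Because RRF uses only rank positions, it can fuse rankers whose raw scores are on different scales (for example BM25 scores and CLIP similarities) without score normalization.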

Authoring Tools for Multimedia Content (AToMiC): AToMiC image suggestion task, Appendix