The Thirty-Third Text REtrieval Conference
(TREC 2024)

Product Search Main task Appendix

Each run entry below gives, in order: the run tag and organization; whether the run is manual or automatic; whether it is text-only, image-only, or multi-modal; a brief description of the run; any other datasets used in producing the run; the LLMs used for the run (optional); and the priority assigned for inclusion in manual assessments (1 = top, 5 = bottom).
BM25 (trec_eval) Lowes-DS
automatic
text-only
BM25
None
1 (top)
BM25-QE (trec_eval) Lowes-DS
automatic
text-only
BM25 with Query Expansion
None
1 (top)
Rerank (trec_eval) Lowes-DS
automatic
text-only
Top 1000 BM25 reranked with TAS-B
None
1 (top)
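As an illustration of the retrieve-then-rerank pattern in the Rerank run above (BM25 candidates rescored with TAS-B), the following is a minimal sketch assuming the rank_bm25 package and the public sentence-transformers TAS-B checkpoint; the toy corpus and candidate handling are assumptions, not details of the submitted run.

```python
# Hedged sketch: BM25 first stage followed by TAS-B bi-encoder rescoring.
# The corpus, query, and candidate depth below are placeholders; the submitted
# run used the official product collection and standard trec_eval formatting.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

corpus = {"P1": "cordless drill 20V with battery",
          "P2": "wood screws assorted pack",
          "P3": "20V battery charger for power tools"}
query = "cordless power drill"

# Stage 1: BM25 over whitespace-tokenized product text, keep the top 1000.
doc_ids = list(corpus)
bm25 = BM25Okapi([corpus[d].lower().split() for d in doc_ids])
scores = bm25.get_scores(query.lower().split())
candidates = [doc_ids[i] for i in np.argsort(scores)[::-1][:1000]]

# Stage 2: rescore the BM25 candidates with the TAS-B dense encoder (dot product).
tasb = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-tas-b")
q_emb = tasb.encode(query, convert_to_tensor=True)
d_emb = tasb.encode([corpus[d] for d in candidates], convert_to_tensor=True)
rerank_scores = util.dot_score(q_emb, d_emb)[0]
print(sorted(zip(candidates, rerank_scores.tolist()), key=lambda x: -x[1]))
```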
TAS-B (trec_eval) Lowes-DS
automatic
text-only
Single-representation bi-encoder dense retrieval method TAS-B
None
1 (top)
SPLADE++ (trec_eval) Lowes-DS
automatic
text-only
Learned Sparse Vector method SPLADE++
None
1 (top)
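The SPLADE++ run above relies on learned sparse vectors. A minimal sketch of how a SPLADE-style encoder produces such vectors is given below; the checkpoint name is an assumption, since the run does not say which SPLADE++ release was used.

```python
# Hedged sketch of SPLADE-style learned sparse encoding. The checkpoint name is
# an assumption: naver/splade-cocondenser-ensembledistil is a public SPLADE++
# release, not necessarily the one used in this run.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "naver/splade-cocondenser-ensembledistil"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

def splade_vector(text: str) -> torch.Tensor:
    """Return a vocabulary-sized term-weight vector for the text."""
    inputs = tok(text, return_tensors="pt", truncation=True)
    logits = model(**inputs).logits                       # (1, seq_len, vocab)
    # SPLADE activation: log(1 + ReLU(logits)), masked by the attention mask,
    # then max-pooled over the sequence positions.
    weights = torch.log1p(torch.relu(logits)) * inputs["attention_mask"].unsqueeze(-1)
    return weights.max(dim=1).values.squeeze(0)           # (vocab,)

q = splade_vector("cordless power drill")
d = splade_vector("20V cordless drill with battery and charger")
print(float(q @ d))   # sparse dot-product relevance score
```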
BM25-TAS-B-fusion (trec_eval) Lowes-DS
automatic
text-only
TAS-B and BM25 fusion
None
1 (top)
BM25-SPLADE++-fusion (trec_eval) Lowes-DS
automatic
text-only
BM25 and SPLADE++ fusion
None
1 (top)
BM25QE-TAS-B-fusion (trec_eval) Lowes-DS
automatic
text-only
BM25 with Query Expansion and TAS-B fusion
None
1 (top)
BM25QE-SPLADE++-fusion (trec_eval) Lowes-DS
automatic
text-only
BM25 with Query Expansion and SPLADE++ fusion
None
1 (top)
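The four fusion runs above combine a lexical and a neural ranking, but the fusion method itself is not stated. Reciprocal rank fusion (RRF) is sketched below purely as one common way such lists can be merged; it is not confirmed as the method actually used.

```python
# Illustrative reciprocal rank fusion (RRF) of two ranked lists; the run
# descriptions above do not name their fusion method, so this is only one
# plausible choice, not the confirmed one.
def rrf(runs, k=60):
    """runs: list of dicts mapping doc_id -> 1-based rank. Returns fused scores."""
    fused = {}
    for run in runs:
        for doc_id, rank in run.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return dict(sorted(fused.items(), key=lambda x: -x[1]))

bm25_ranks = {"P1": 1, "P2": 2, "P3": 3}
tasb_ranks = {"P3": 1, "P1": 2, "P4": 3}
print(rrf([bm25_ranks, tasb_ranks]))   # documents ranked well by both rise to the top
```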
jbnu08 (trec_eval) (paper) jbnu
automatic
text-only
Fusion of jbnu02 and jbnu04 using the ranx library.
No other datasets were used.
1 (top)
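Several jbnu runs, such as jbnu08 above, fuse two earlier runs with the ranx library. A minimal sketch of that call is given below; the query/document IDs, scores, and the choice of RRF as the fusion method are illustrative assumptions.

```python
# Hedged sketch of run fusion with the ranx library, as described for jbnu08.
# Query IDs, document IDs, scores, and the RRF method are illustrative only.
from ranx import Run, fuse

run_a = Run({"q1": {"d1": 12.3, "d2": 9.1, "d3": 4.2}}, name="jbnu02")
run_b = Run({"q1": {"d3": 0.92, "d1": 0.88, "d4": 0.75}}, name="jbnu04")

# ranx supports several fusion methods; RRF is used here for illustration since
# the appendix does not state which method the jbnu runs selected.
fused = fuse(runs=[run_a, run_b], method="rrf")
print(fused.to_dict())
```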
jbnu04 (trec_eval) (paper) jbnu
automatic
text-only
Using the ColBERT model and overcoming the maximum token limit by utilizing document summaries generated by T5.
No other datasets were used.
3
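jbnu04 above shortens long product documents with T5-generated summaries so they fit within the retriever's token limit. A sketch of that summarization step is shown below; the t5-base checkpoint and the generation lengths are assumptions, since the run does not specify them.

```python
# Hedged sketch of shortening a long product document with T5 before indexing
# or encoding, as jbnu04 describes. The checkpoint and generation settings are
# assumptions for illustration only.
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-base")

long_product_text = ("Heavy-duty cordless drill with brushless motor, two 20V "
                     "batteries, fast charger, belt clip, and carrying case. ") * 50

summary = summarizer(long_product_text, max_length=64, min_length=16,
                     truncation=True)[0]["summary_text"]
print(summary)   # the summary, not the full document, is what gets retrieved over
```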
jbnu09 (trec_eval) (paper) jbnu
automatic
text-only
Modifying the SPLADE model to calculate negative scores using the GELU function, and overcoming the maximum token limit by summarizing documents using T5 for retrieval.
No other datasets were used.
4
jbnu01 (trec_eval) (paper) jbnu
automatic
text-only
Modifying the SPLADE model to calculate negative scores using the GELU activation function.
No other datasets were used.
5 (bottom)
jbnu07 (trec_eval) (paper) jbnu
automatic
text-only
Fusion of jbnu02 and jbnu03 using the ranx library.
No other datasets were used.
5 (bottom)
jbnu10 (trec_eval) (paper) jbnu
automatic
text-only
Translating queries and applying Pseudo Relevance Feedback (PRF) and ASIN-title conversion, followed by retrieval using BM25, and fusion of the results with jbnu04 using the ranx library.
No other datasets were used.
5 (bottom)
jbnu03 (trec_eval) (paper) jbnu
automatic
text-only
Using the TAS-B model with title and T5 (document) summary data.
No other datasets were used.
5 (bottom)
jbnu02 (trec_eval) (paper) jbnu
automatic
text-only
Fine-tuning the base SPLADE model, then overcoming the maximum token limit by summarizing documents using T5 for retrieval.
No other datasets were used.
1 (top)
jbnu11 (trec_eval) (paper) jbnu
automatic
text-only
Fusion of jbnu09 and jbnu03 using the ranx library.
No other datasets were used.
5 (bottom)
bm25-simple-collection (trec_eval) stktest
manual
text-only
BM25
none
none
4
jbnu12 (trec_eval) (paper) jbnu
automatic
text-only
Fusion of jbnu09 and jbnu04 using the ranx library.
No other datasets were used.
5 (bottom)
jbnu05 (trec_eval) (paper) jbnu
automatic
text-only
Fusion of jbnu01 and jbnu03 using the ranx library.
No other datasets were used.
5 (bottom)
jbnu06 (trec_eval) (paper) jbnu
automatic
text-only
Fusion of jbnu01 and jbnu04 using the ranx library.
No other datasets were used.
5 (bottom)
res_img_splade_bm25_rerank (trec_eval) wish
automatic
multi-modal
We trained a dual-tower model that maps queries to product texts and images, retrieving the K nearest neighbors of the query vector from product embeddings. We also incorporated additional candidates from SPLADE++ and BM25, then reranked all candidates using a cross-encoder model.
This run utilized heuristically labeled relevance judgments for (query, product) pairs from Wish.com’s search data.
2
res_splade_bm25_rerank (trec_eval) wish
automatic
text-only
We trained a dual-tower model that maps queries to product texts, retrieving the K nearest neighbors of the query vector from product embeddings. We also incorporated additional candidates from SPLADE++ and BM25, then reranked all candidates using a cross-encoder model.
This run utilized heuristically labeled relevance judgments for (query, product) pairs from Wish.com’s search data.
2
long_res_img_splade_bm25 (trec_eval) wish
automatic
multi-modal
We trained a dual-tower model with a longer sequence length that maps queries to product texts and images, retrieving the K nearest neighbors of the query vector from product embeddings. We also incorporated additional candidates from SPLADE++ and BM25, then reranked all candidates using a cross-encoder model.
This run utilized heuristically labeled relevance judgments for (query, product) pairs from Wish.com’s search data.
1 (top)
long_res_splade_bm25 (trec_eval) wish
automatic
text-only
We trained a dual-tower model with a longer sequence length that maps queries to product texts, retrieving the K nearest neighbors of the query vector from product embeddings. We also incorporated additional candidates from SPLADE++ and BM25, then reranked all candidates using a cross-encoder model.
This run utilized heuristically labeled relevance judgments for (query, product) pairs from Wish.com’s search data.
1 (top)
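The four wish runs above share one pipeline shape: dense candidates from a trained dual-tower model, merged with SPLADE++ and BM25 candidates, then reranked by a cross-encoder. The sketch below mirrors that shape with public MS MARCO checkpoints as stand-ins; the submitted towers and reranker were trained in-house on Wish.com data and are not reproduced here.

```python
# Schematic of the wish pipeline shape: dense K-nearest-neighbor candidates from
# a dual-tower (bi-encoder) model, merged with lexically retrieved candidates,
# then reranked by a cross-encoder. The public checkpoints below are stand-ins.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

products = {"P1": "wireless earbuds bluetooth 5.3",
            "P2": "phone case with card holder",
            "P3": "bluetooth over-ear headphones"}
query = "bluetooth earbuds"

# Dual-tower retrieval: embed the query and products, take the K nearest neighbors.
bi = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-tas-b")
ids = list(products)
hits = util.semantic_search(
    bi.encode(query, convert_to_tensor=True),
    bi.encode([products[i] for i in ids], convert_to_tensor=True),
    top_k=2)[0]
dense_candidates = {ids[h["corpus_id"]] for h in hits}

# Merge with candidates from other first-stage retrievers (a precomputed set
# stands in for the SPLADE++ and BM25 candidate pools here).
lexical_candidates = {"P1", "P2"}
candidates = sorted(dense_candidates | lexical_candidates)

# Cross-encoder reranking over the merged candidate pool.
ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = ce.predict([(query, products[c]) for c in candidates])
print(sorted(zip(candidates, scores.tolist()), key=lambda x: -x[1]))
```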
kd_bm25_100_listwise_20_10 (trec_eval) kd
automatic
text-only
This run uses BM25 to retrieve top-100 candidates and then applies a listwise sliding window strategy to rerank the top-100 candidates.
The item description, ratings, and all other content (except for images) were used.
GPT4o
3
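The kd runs rerank a BM25 candidate list with an LLM using a listwise sliding window. The sketch below shows the window-and-stride mechanics only; `llm_rank_window` is a hypothetical stand-in for the GPT4o call, whose prompt and output parsing are not described in this appendix.

```python
# Hedged sketch of listwise sliding-window reranking: an LLM reorders a window of
# candidates, the window slides from the tail of the list toward the head, and
# successive windows overlap by (window - stride) items. `llm_rank_window` is a
# hypothetical stand-in for the actual LLM call.
from typing import Callable, List

def sliding_window_rerank(candidates: List[str],
                          llm_rank_window: Callable[[List[str]], List[str]],
                          window: int = 20, stride: int = 10) -> List[str]:
    """Rerank `candidates` (best-first) by repeatedly asking the LLM to reorder
    overlapping windows, moving from the bottom of the list to the top."""
    ranked = list(candidates)
    start = max(len(ranked) - window, 0)
    while True:
        ranked[start:start + window] = llm_rank_window(ranked[start:start + window])
        if start == 0:
            break
        start = max(start - stride, 0)
    return ranked

# Toy stand-in "LLM" that sorts a window by a fake relevance score, for demo only.
fake_relevance = {f"P{i}": (i * 37) % 100 for i in range(100)}
reranked = sliding_window_rerank([f"P{i}" for i in range(100)],
                                 lambda w: sorted(w, key=lambda d: -fake_relevance[d]))
print(reranked[:10])
```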
kd_bm25_100_listwise_40_15 (trec_eval) kd
automatic
text-only
This run retrieves the top-100 items using BM25 and then applies a listwise sliding window strategy with window size 40 and stride 15 to rerank them.
All fields (ratings, item information, title, reviews, etc.) were used, except for image information.
GPT4o
3
kd_bm25_100_listwise_20_10_twice (trec_eval) kd
automatic
text-only
This run uses BM25 to retrieve top-100 candidates and then applies a listwise sliding window strategy twice to rerank the top-100 candidates. The ordered top-k from the first sliding window pass is reranked again in a second pass.
The item description, ratings, and all other content (except for images) were used.
GPT4o
1 (top)
kd_bm25_100_listwise_30_15 (trec_eval) kd
manual
text-only
This run uses BM25 to retrieve top-100 candidates and then applies a listwise sliding window strategy to rerank the top-100 candidates. Sliding window of width 30 and stride of 15.
The item description, ratings, and all other content (except for images) were used.
GPT4o
3
snowflake arctic medium model (trec_eval) stktest
manual
text-only
Snowflake Arctic medium model
none
Snowflake Arctic medium model
2
snowflake arctic large model (trec_eval) stktest
manual
text-only
Snowflake Arctic large model
none
Snowflake Arctic large model
1 (top)
GTE Large (trec_eval) stktest
manual
text-only
GTE Large general text embeddings (https://huggingface.co/thenlper/gte-large)
none
GTE Large general text embeddings (https://huggingface.co/thenlper/gte-large)
1 (top)
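The GTE Large run above scores products with general text embeddings from https://huggingface.co/thenlper/gte-large. A minimal sketch of that scoring via sentence-transformers is shown below; the toy product texts are illustrative only.

```python
# Minimal sketch of cosine-similarity scoring with the thenlper/gte-large
# embeddings named above; the products and query are placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("thenlper/gte-large")
products = ["stainless steel water bottle 32 oz",
            "insulated travel mug with lid",
            "usb-c fast charging cable"]
query = "insulated water bottle"

scores = util.cos_sim(model.encode(query, convert_to_tensor=True),
                      model.encode(products, convert_to_tensor=True))[0]
for text, score in sorted(zip(products, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {text}")
```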
kd_bm25_100_listwise_20_10_llama_spark (trec_eval) kd
automatic
text-only
This run uses BM25 to retrieve top-100 candidates and then applies a listwise sliding window strategy with the Llama Spark model to rerank the top-100 candidates.
The item description, ratings, and all other content (except for images) were used.
Llama Spark: Llama-Spark is built upon the Llama-3.1-8B base model, fine-tuned using the Tome Dataset, and merged with Llama-3.1-8B-Instruct.
2
kd_linear_combo_1_100 (trec_eval) kd
automatic
text-only
This run computes rankings through three sliding window ranking approaches, each of which uses a paraphrase of some base instructions; the resulting scores are then combined for the final ranking.
All fields (ratings, item information, title, reviews, etc.) were used, except for image information.
GPT4o
2
run_bm25_1000_listwise_50_20 (trec_eval) kd
automatic
text-only
This run uses BM25 to retrieve top-1000 candidates and then applies a listwise sliding window strategy with window size 50 and stride 20 to rerank the top-1000 candidates.
The item description, ratings, and all other content (except for images) were used.
GPT4o
3
run_bm25_1000_listwise_50_30 (trec_eval) kd
automatic
text-only
This run uses BM25 to retrieve top-1000 candidates and then applies a listwise sliding window strategy with window size 50 and stride 30 to rerank the top-1000 candidates.
The item description, ratings, and all other content (except for images) were used.
GPT4o
3