TREC 2025 Proceedings
phi-subgroup
Submission Details
- Organization
- ncsu-las
- Track
- Adhoc Video Search
- Task
- Video Search Task
- Date
- 2025-07-28
Run Description
- Is this run manual or automatic?
- automatic
- Describe the retrieval model used.
- We extract SigLIP2-base-patch16-naflex embeddings from keyframes sampled at 1 per second. Each user query is expanded into 100 variants with GPT-4.1-mini, and the variants' text embeddings are averaged into a single query vector. Initial retrieval uses SigLIP similarity to this vector directly and returns the top 2,500 candidates. Phi-3.5-Vision then evaluates each candidate shot 10 times, and the 10 scores are averaged. During re-ranking, an overlapping subgroup sort limits how far each result can move from its initial rank. The top 1,000 results are submitted.
- Describe any external resources used.
- We use embeddings from SigLIP2-base-patch16-naflex, GPT-4.1-mini for query expansion, and Phi-3.5-Vision for LLM judgement.
- Training type:
- D
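The retrieval model described above (averaged query embedding, similarity-based candidate retrieval, averaged judge scores, overlapping subgroup sort) can be sketched roughly as follows. This is a minimal illustration and not the submitted system: the function names, the use of cosine similarity, and the `window` and `stride` parameters are all assumptions, since the run description does not specify the similarity measure or how the subgroups are formed. In this particular reading, a single top-to-bottom pass sorts each overlapping window by judge score, so a result can rise at most `window - 1` positions above its initial rank, while a low-scoring result can sink further through the overlaps.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equal-length vectors (e.g. query-variant embeddings)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; assumed here, the run description only says 'SigLIP similarity'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, shot_vecs, k):
    """Return the ids of the k shots most similar to the query, best first."""
    ranked = sorted(
        shot_vecs,
        key=lambda shot_id: cosine(query_vec, shot_vecs[shot_id]),
        reverse=True,
    )
    return ranked[:k]


def mean_score(judgements):
    """Average repeated judge scores for one candidate shot."""
    return sum(judgements) / len(judgements)


def subgroup_rerank(initial, scores, window, stride):
    """Sort overlapping windows of the initial ranking by judge score.

    Hypothetical scheme: windows of length `window` start every `stride`
    positions (stride < window gives the overlap) and are sorted in place
    from the top of the list downward. Python's sort is stable, so ties
    keep their current relative order.
    """
    order = list(initial)
    for start in range(0, len(order), stride):
        chunk = order[start:start + window]
        order[start:start + window] = sorted(
            chunk, key=lambda shot_id: scores[shot_id], reverse=True
        )
    return order


# Toy example with made-up shot ids and scores.
initial_ranking = ["a", "b", "c", "d", "e", "f"]
judge_scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.8, "e": 0.2, "f": 1.0}
reranked = subgroup_rerank(initial_ranking, judge_scores, window=4, stride=2)
print(reranked)  # ['b', 'd', 'f', 'c', 'e', 'a']
```

In the toy example, shot `f` has the highest judge score but rises only from position 6 to position 3, because no window that reaches it starts higher in the list. In the run itself, the corresponding list sizes would be the 2,500 retrieved candidates going in and the top 1,000 results coming out.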
Evaluation Files
Paper