TREC 2025 Proceedings
fg-clip
Submission Details
- Organization
- ncsu-las
- Track
- Adhoc Video Search
- Task
- Video Search Task
- Date
- 2025-07-28
Run Description
- Is this run manual or automatic?
- automatic
- Describe the retrieval model used.
- This run uses FG-CLIP embeddings to retrieve the most relevant keyframes. FG-CLIP is a fine-tuned version of OpenAI's clip-vit-base-patch32, trained on V3C1 keyframes with captions generated by Phi-3-Vision. Fine-tuning used a modified loss function for fine-grained, token-level comparison.
- Describe any external resources used.
- The search uses embeddings from clip-vit-base-patch32 fine-tuned on V3C1 keyframes. The training captions for the V3C1 keyframes were generated by Phi-3-Vision.
- Training type:
- A
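The retrieval step described above (ranking keyframes by how well their embeddings match a query embedding) can be sketched roughly as follows. This is a minimal illustration, not the run's actual pipeline: it assumes cosine similarity as the scoring function, and the keyframe IDs, the three-dimensional vectors, and the helper names `cosine` and `rank_keyframes` are all hypothetical. In a real system the vectors would come from the fine-tuned text and image encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_keyframes(query_vec, keyframe_vecs, top_k=3):
    """Return the top_k (keyframe_id, score) pairs, highest similarity first."""
    scored = [(kf_id, cosine(query_vec, vec)) for kf_id, vec in keyframe_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for keyframe embeddings (hypothetical IDs and values).
toy_keyframes = {
    "shot_a": [1.0, 0.0, 0.0],
    "shot_b": [0.6, 0.8, 0.0],
    "shot_c": [0.0, 0.0, 1.0],
}

# Toy stand-in for a text-query embedding (hypothetical values).
toy_query = [1.0, 0.2, 0.0]

ranking = rank_keyframes(toy_query, toy_keyframes, top_k=2)
for keyframe_id, score in ranking:
    print(keyframe_id, round(score, 4))
```

A brute-force scan like this is only practical for small collections; for a collection the size of V3C1, an approximate nearest-neighbor index would likely be needed, though the run description does not say which approach was used.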
Evaluation Files
Paper