TREC 2025 Proceedings

Video Question Answering Answer Generation Task Appendix

Runtag                Org         METEOR  BERTScore  NDCG_BERTScore  NDCG_METEOR  STSscore
certh.vqa.ag.run_2    CERTH-ITI   0.2604  0.8982     0.9879          0.9879       0.3510
certh.vqa.ag.run_1    CERTH-ITI   0.2596  0.9033     0.9924          0.9152       0.3483
certh.vqa.ag.run_4    CERTH-ITI   0.2596  0.9034     0.9924          0.9152       0.3485
videollama-7B-prompt  WHU-NERCMS  0.2384  0.8649     1.0000          1.0000       0.2987
videollama-2B         WHU-NERCMS  0.2345  0.8657     1.0000          1.0000       0.3023
videollama-2B-prompt  WHU-NERCMS  0.2311  0.8667     1.0000          1.0000       0.3006
nut-kslab-2025        kslab       0.2265  0.8931     0.9927          0.8318       0.2837
MARS-GenR_1           HLTCOE      0.2191  0.8893     0.9867          0.8181       0.2817
gemini_flash_25       tcna        0.2128  0.8617     0.9870          0.8376       0.2567
certh.vqa.ag.run_3    CERTH-ITI   0.2031  0.8906     0.9906          0.7823       0.2852
gemini_flash_lite     tcna        0.1886  0.8560     0.9846          0.8278       0.2162
Aria8x3.5B_VidLLaMa   NII_UIT     0.1728  0.8866     0.9955          0.8254       0.2696
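The NDCG_BERTScore and NDCG_METEOR columns report a ranking-aware aggregation of the per-response quality scores. As a point of reference, here is a minimal sketch of the standard NDCG computation, assuming the per-response METEOR or BERTScore values serve directly as gain values; this is an illustrative sketch, not the official vqa_gen_eval implementation:

```python
import math

def dcg(scores):
    # Discounted cumulative gain: the score at rank i (0-based)
    # is discounted by log2(i + 2).
    return sum(s / math.log2(i + 2) for i, s in enumerate(scores))

def ndcg(scores):
    # Normalize by the DCG of the ideal (descending) ordering,
    # so a perfectly ordered list scores 1.0.
    ideal = dcg(sorted(scores, reverse=True))
    return dcg(scores) / ideal if ideal > 0 else 0.0

# Hypothetical per-response METEOR scores, in submitted order.
print(ndcg([0.3, 0.1, 0.2]))
```

Under this reading, an NDCG of 1.0000 (as in the WHU-NERCMS rows) would mean the system's response ordering already matches the ideal ordering of its per-response scores.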