TREC 2025 Proceedings
CUET-qwen14B-v3
Submission Details
- Organization
- CUET
- Track
- Detection, Retrieval, and Generation for Understanding News
- Task
- Question Generation Task
- Date
- 2025-08-08
Run Description
- Is this run manual or automatic?
- automatic
- Is this run based on the provided starter kit?
- no
- Briefly describe this run
- This run uses the unsloth/Qwen3-14B-unsloth-bnb-4bit model to generate 10 investigative and critical questions per topic from the TREC 2025 dataset. The questions are designed to help readers assess the credibility and bias of each article. The prompt includes two detailed few-shot examples modeled after PolitiFact and MBFC, guiding the model to focus on:
Evidence and factual integrity
Bias and one-sided reporting
Missing viewpoints or counterarguments
Language framing and sensationalism
Conflicts of interest or affiliations
LangChain’s LLMChain wraps a HuggingFace text-generation pipeline with sampling settings chosen for diverse outputs (temperature=0.6, top_p=0.9, do_sample=True, max_new_tokens=600). Each article’s body is truncated to its first 2000 characters to fit within the model’s 2048-token context window. A regular expression extracts properly formatted numbered questions of up to 300 characters. The model is retried up to 3 times per topic to obtain at least 10 valid questions; if fewer are produced, the list is padded with "N/A". The final output is saved as a tab-separated file named CUET_run8.tsv, with columns: topic ID, team ID, run ID, question rank, and cleaned question.
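The extraction, retry, and output steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual run code: the regex pattern, helper names, and the team/run ID literals are assumptions, and the LLM call is abstracted into a plain `generate` callable.

```python
import csv
import re

# Hypothetical pattern for "properly formatted numbered questions" of up to
# 300 characters, e.g. "1. What evidence supports this claim?"
QUESTION_RE = re.compile(r"^\s*\d+[.)]\s*(.{1,300}?)\s*$", re.MULTILINE)

def extract_questions(raw: str, limit: int = 10) -> list[str]:
    """Pull numbered questions out of raw model output, keeping at most `limit`."""
    return [m.group(1).strip() for m in QUESTION_RE.finditer(raw)][:limit]

def generate_for_topic(generate, article_body: str, retries: int = 3) -> list[str]:
    """Call the LLM chain up to `retries` times on the truncated article body,
    then pad with "N/A" so exactly 10 questions are returned."""
    questions: list[str] = []
    for _ in range(retries):
        # Truncate to the first 2000 characters to fit the context window.
        questions = extract_questions(generate(article_body[:2000]))
        if len(questions) >= 10:
            break
    return (questions + ["N/A"] * 10)[:10]

def write_run(rows, path="CUET_run8.tsv"):
    """Write (topic_id, questions) pairs as the tab-separated run file:
    topic ID, team ID, run ID, question rank, cleaned question."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for topic_id, questions in rows:
            for rank, question in enumerate(questions, start=1):
                writer.writerow([topic_id, "CUET", "CUET_run8", rank, question])
```

In a sketch like this, `generate` would be a thin wrapper around the LLMChain's run method, so the extraction and padding logic can be tested independently of the model.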
- What other datasets or services (e.g. Google/Bing web search, ChatGPT, Perplexity, etc.) were used in producing the run?
- No external datasets or services such as Google, Bing, ChatGPT, or Perplexity were used.
However, the prompt template contains two curated few-shot examples: one informed by PolitiFact’s approach to evidence-based fact-checking, the other inspired by MBFC-style media bias analysis. These examples simulate the reasoning process of expert fact-checkers and guide the model’s generation without any calls to external services.
- Briefly describe LLMs used for this run (optional)
- The model used is unsloth/Qwen3-14B-unsloth-bnb-4bit, a 4-bit quantized variant of Alibaba’s Qwen3-14B model prepared by Unsloth for efficient inference (using bitsandbytes).
It supports RoPE scaling for long input sequences, and dtype=None lets the loader auto-detect a suitable precision for the available hardware (Tesla T4, V100, Ampere, etc.).
The HuggingFace pipeline applies nucleus sampling (top_p=0.9) with a moderate temperature (0.6) for varied but non-repetitive question generation.
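A minimal configuration sketch of this setup, assuming the Unsloth FastLanguageModel loader and LangChain's HuggingFacePipeline wrapper (parameter values are taken from the run description; the exact import path for HuggingFacePipeline varies across LangChain versions):

```python
from unsloth import FastLanguageModel
from transformers import pipeline
from langchain_community.llms import HuggingFacePipeline

# dtype=None auto-detects a suitable precision for the GPU (T4, V100, Ampere, ...);
# load_in_4bit enables the bitsandbytes 4-bit quantization.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-14B-unsloth-bnb-4bit",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)

# Sampling settings from the run: nucleus sampling with moderate temperature.
hf_pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    temperature=0.6,
    top_p=0.9,
    do_sample=True,
    max_new_tokens=600,
)
llm = HuggingFacePipeline(pipeline=hf_pipe)
```

The `llm` object would then be passed to the LLMChain together with the few-shot prompt template described above.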
- Please give this run a priority for inclusion in manual assessments.
- 2
Evaluation Files
Paper