Reasoning model for complex problem-solving, based on Qwen2.5
QwQ is a reasoning-specialized large language model series from Alibaba Cloud's Qwen team, designed for complex problem-solving tasks. It aims to outperform traditional instruction-tuned models by leveraging advanced reasoning and critical thinking, making it suitable for researchers and developers tackling challenging NLP applications.
How It Works
QwQ is built upon the Qwen2.5 architecture, specifically optimized for reasoning. It generates an explicit reasoning trace before the final answer; its output should begin with "<think>\n" so the thinking section is not left empty. The model card recommends specific sampling parameters (Temperature=0.6, TopP=0.95, TopK=40) and advises against greedy decoding, which can cause repetition. For long contexts, it supports YaRN scaling, configurable via rope_scaling in config.json.
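As a minimal sketch, the rope_scaling entry in config.json follows the YaRN convention documented for Qwen2.5-series models; the factor and original window size below are illustrative values, not settings confirmed on this page:
{
  ...,
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}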
Quick Start & Requirements
Install via pip install transformers. Requires transformers>=4.37.0.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/QwQ-32B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat prompt and sample with the recommended settings (greedy decoding is discouraged)
messages = [{"role": "user", "content": "How many r's are in the word \"strawberry\"?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95, top_k=40)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
Run the official GGUF build with Ollama:
ollama run hf.co/Qwen/QwQ-32B-GGUF:Q4_K_M
Or with llama.cpp:
./llama-cli --model QwQ-32B-GGUF/qwq-32b-q4_k_m.gguf --threads 32 --ctx-size 32768 --temp 0.6 --top-p 0.95 --prompt "<|im_start|>user\nHow many r's are in the word \"strawberry\"<|im_end|>\n<|im_start|>assistant\n<think>\n"
Highlighted Details
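32B-parameter reasoning model (Qwen/QwQ-32B) with an official GGUF build (Qwen/QwQ-32B-GGUF); 32K-token context out of the box (ctx-size 32768 in the llama.cpp example above), extensible further via YaRN.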
Maintenance & Community
Last updated about 4 months ago; repository activity is currently marked inactive.
Licensing & Compatibility
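QwQ-32B is released under the Apache 2.0 license.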
Limitations & Caveats
Loading the model with transformers<4.37.0 fails with KeyError: 'qwen2', since support for the Qwen2 architecture was added in that release.
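A minimal fix, assuming a pip-managed environment:
pip install --upgrade "transformers>=4.37.0"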