Qwen3.5 by QwenLM

Powerful multimodal foundation models for AI development

Created 5 months ago
1,018 stars

Top 36.5% on SourcePulse

View on GitHub
Project Summary

Qwen3.5 is a series of large language models from Alibaba Cloud's Qwen team, focusing on enhanced multimodal learning, architectural efficiency, and global accessibility. It aims to provide developers and enterprises with advanced capabilities for reasoning, coding, agents, and visual understanding, offering significant performance gains and cost-effectiveness.

How It Works

The models leverage a Unified Vision-Language Foundation trained on trillions of multimodal tokens, achieving cross-generational parity and outperforming previous VL models. An Efficient Hybrid Architecture, combining Gated Delta Networks with sparse Mixture-of-Experts (MoE), enables high-throughput, low-latency inference. Scalable Reinforcement Learning across million-agent environments ensures robust real-world adaptability, while expanded support for 201 languages facilitates worldwide deployment.
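To make the sparse Mixture-of-Experts idea concrete, here is a toy sketch of top-k expert routing: a gate scores every expert per token, and only the k best experts are evaluated and mixed. This is an illustrative NumPy example only, not Qwen's actual implementation (which additionally combines MoE layers with Gated Delta Networks); all names and shapes here are assumptions for the sketch.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy sparse-MoE layer: route each token to its top-k experts.

    x:       (n_tokens, d) token activations
    gate_w:  (d, n_experts) gating weights
    experts: list of callables, each mapping a (d,) vector to a (d,) vector
    """
    logits = x @ gate_w                             # (n_tokens, n_experts) gate scores
    topk = np.argsort(logits, axis=-1)[:, -k:]      # indices of the k best experts per token
    sel = np.take_along_axis(logits, topk, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)              # softmax over the selected experts only
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(k):                          # only k experts run per token,
            e = topk[t, j]                          # so compute stays sparse even when
            out[t] += w[t, j] * experts[e](x[t])    # the total parameter count is large
    return out
```

The efficiency claim follows from the routing: with, say, 8 experts and k=2, each token pays the compute cost of 2 expert networks while the model retains the capacity of all 8.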

Quick Start & Requirements

Model weights are available on Hugging Face Hub (Qwen/Qwen3.5-397B-A17B) and ModelScope. Local inference can be initiated using Hugging Face Transformers (transformers serve), SGLang (python -m sglang.launch_server), or vLLM (vllm serve), all providing OpenAI-compatible APIs. llama.cpp (GGUF models) and MLX (Apple Silicon) are also supported. Official documentation is listed as "coming soon." Deployment typically requires substantial GPU resources, with examples showing tensor parallelism (tp-size 8) and support for very long contexts (up to 262,144 tokens).
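Since all three serving options expose OpenAI-compatible APIs, a client request looks the same regardless of backend. The sketch below builds a single-turn chat-completion request body; the localhost URL and port are illustrative assumptions, and the POST itself is commented out since it requires a running server.

```python
import json
import urllib.request

# Endpoint assumed for a locally launched server (e.g. `vllm serve`,
# `python -m sglang.launch_server`, or `transformers serve`); adjust to taste.
API_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, prompt, max_tokens=256):
    """Return the JSON body for a single-turn OpenAI-style chat completion."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request(
    "Qwen/Qwen3.5-397B-A17B",
    "Summarize Mixture-of-Experts in one sentence.",
)

# To actually send it against a running server:
# req = urllib.request.Request(
#     API_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# reply = json.load(urllib.request.urlopen(req))
# print(reply["choices"][0]["message"]["content"])
```

Because the wire format is shared, swapping vLLM for SGLang or `transformers serve` only changes the launch command, not the client code.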

Highlighted Details

  • Achieves cross-generational parity, outperforming previous Qwen VL models
  • Hybrid architecture pairs Gated Delta Networks with sparse Mixture-of-Experts for high-throughput, low-latency inference
  • Supports 201 languages and contexts up to 262,144 tokens
Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 7
  • Issues (30d): 7

Star History

1,096 stars in the last 30 days