xw-cli by TsingmaoAI

Large model deployment CLI for domestic hardware

Created 3 weeks ago

288 stars

Top 91.4% on SourcePulse

View on GitHub
Project Summary

TsingmaoAI/xw-cli (Xuanwu CLI) aims to democratize the deployment of large AI models on domestic Chinese hardware, positioning itself as a domestic alternative to Ollama. It simplifies the process of running various models like Qwen, GLM-4.7, and DeepSeek-OCR, optimizing performance for specific domestic compute platforms and hardware, such as Huawei Ascend chips. The tool targets engineers and researchers seeking an easy, command-line driven approach to leverage local, often specialized, AI hardware without complex environment setup.

How It Works

Xuanwu CLI abstracts away intricate environment configurations and operator adaptations required for diverse AI hardware. It employs an automatic hardware detection system to recommend and route inference requests to the most suitable underlying engine, supporting multiple inference backends. This multi-engine approach ensures broad model compatibility and optimized performance tailored for domestic hardware architectures, enabling a "zero-threshold" deployment experience.
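The README does not detail how engine selection works internally. As a rough illustration only, hardware probing and routing could be sketched in shell as follows; the device paths and engine names are assumptions for illustration, not xw-cli internals:

```shell
# Hypothetical sketch of automatic hardware detection and engine routing.
# The device paths and engine names are assumptions, not xw-cli internals.
detect_engine() {
  if ls /dev/davinci* >/dev/null 2>&1; then
    echo "ascend"   # Huawei Ascend NPUs typically expose /dev/davinci* nodes
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo "cuda"     # NVIDIA GPU driver present
  else
    echo "cpu"      # fall back to CPU inference
  fi
}

detect_engine
```

The real tool presumably performs a richer probe (driver versions, operator support, memory), but the shape of the decision, detect accelerators, then pick the best-matching backend, is what "automatic hardware detection" implies.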

Quick Start & Requirements

  • Installation: curl -o- http://xw.tsingmao.com/install.sh | bash
  • Prerequisites: Linux operating system, supported domestic GPU/accelerator cards (e.g., Huawei Ascend), and correctly installed drivers.
  • Resources: Setup is designed to be quick ("Express Install"). The README does not document a specific resource footprint (RAM, VRAM, disk) for running models; models are fetched and launched with xw pull and xw run.
  • Links:
    • Official Website: http://xw.tsingmao.com
    • Documentation: http://xw.tsingmao.com/doc
    • Model Repository: http://xw.tsingmao.com/models
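Based on the xw pull and xw run commands mentioned above, a first session might look like the following; the model tag is an assumption, so consult the model repository for exact names:

```shell
# Hypothetical first session after installation. The model tag "qwen" is an
# assumption; check http://xw.tsingmao.com/models for the exact tags.
if command -v xw >/dev/null 2>&1; then
  xw pull qwen    # download the model weights to the local cache
  xw run qwen     # start inference on the auto-detected hardware
else
  echo "xw is not installed; run the install script first"
fi
```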

Highlighted Details

  • Domestic Hardware Optimization: Deeply optimized for and supports domestic computing power platforms and hardware, including Huawei Ascend chips.
  • Ease of Use: One-click, out-of-the-box deployment with automatic hardware detection and engine selection.
  • Model Support: Supports popular models such as Qwen, GLM-4.7, Minimax-2.1, and DeepSeek-OCR.
  • Multi-Engine: Integrates and routes through multiple inference engines for performance and compatibility.

Maintenance & Community

The project is maintained by the Tsingmao Intelligence team. Community interaction is encouraged via GitHub Issues and multiple WeChat groups for technical discussions and user experience sharing.

Licensing & Compatibility

The license type is not specified in the provided README. Compatibility for commercial use or linking with closed-source projects is undetermined without a stated license.

Limitations & Caveats

The tool is explicitly designed for Linux systems and requires specific, supported domestic hardware, potentially limiting its use on standard consumer hardware or other operating systems. Driver installation is a prerequisite. The project was officially open-sourced on February 2, 2026, suggesting it may still be in early stages of development or community adoption.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 22
  • Star History: 291 stars in the last 23 days
