DakeQQ: Native LLM inference for Android devices
This project demonstrates running Large Language Models (LLMs) natively on Android devices, offering on-device AI capabilities without cloud dependency. It targets developers and power users who want to integrate LLMs into mobile applications, and provides optimized builds of a variety of popular models.
How It Works
Models are converted from HuggingFace or ModelScope and optimized for execution speed on mobile hardware. Conversion typically goes through ONNX export; dynamic axes and q4f32 quantization are recommended. Tokenizer files are sourced from the mnn-llm repository. The project supports several quantization methods and includes specific instructions for model parameter adjustments and for low-memory loading modes.
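To give a feel for what 4-bit weight quantization does, here is a minimal pure-Python sketch. It assumes that q4f32 means 4-bit integer weight codes paired with float32 scale factors and float32 compute; the summary does not define the term. The function names (quantize_q4, dequantize_q4), the block size of 32, and the min/scale layout are all illustrative assumptions and are not taken from the project, which performs quantization through its ONNX export tooling.

```python
import struct


def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def quantize_q4(weights, block_size=32):
    """Quantize weights to 4-bit codes (0..15), one float32 scale and minimum per block.

    Hypothetical helper for illustration; not the project's actual routine.
    """
    blocks = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        lo, hi = min(block), max(block)
        # 15 steps span the block's range; fall back to 1.0 for a constant block.
        scale = to_f32((hi - lo) / 15) or 1.0
        lo = to_f32(lo)
        codes = [min(15, max(0, round((w - lo) / scale))) for w in block]
        blocks.append((scale, lo, codes))
    return blocks


def dequantize_q4(blocks):
    """Reconstruct approximate weights from (scale, minimum, codes) blocks."""
    restored = []
    for scale, lo, codes in blocks:
        restored.extend(lo + scale * c for c in codes)
    return restored


if __name__ == "__main__":
    weights = [0.1 * i - 1.6 for i in range(64)]
    restored = dequantize_q4(quantize_q4(weights))
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"largest reconstruction error: {worst:.6f}")
```

Each weight is stored as one of 16 levels, so the reconstruction error per weight is bounded by roughly half the block's scale. This is one reason quantized models can behave slightly differently from the originals.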
Quick Start & Requirements
[...] assets folder.
[...] *.so files from the libs/arm64-v8a folder.
[...] GLRender.java and project.h.
[...] low_memory_mode = true in MainActivity.java.
[...] Export_ONNX folder and using onnxruntime.tools.convert_onnx_models_to_ort.
Highlighted Details
Maintenance & Community
The project shows recent activity with updates logged through early 2026, indicating ongoing development. No specific community links (e.g., Discord, Slack) or contributor details are provided in the README.
Licensing & Compatibility
License information is not specified in the provided README content.
Limitations & Caveats
Input and output behavior may differ slightly from the original HuggingFace or ModelScope models due to optimization and conversion processes. Specific parameter adjustments are required for certain model families (e.g., Qwen2VL/Qwen2.5VL).