bolt  by huawei-noah

Deep learning library for high-performance, heterogeneous deployment

Created 5 years ago
954 stars

Top 38.5% on SourcePulse

Project Summary

Bolt is a lightweight, high-performance deep learning inference library designed for deployment across a wide range of hardware and model formats. It targets developers and researchers who need to optimize neural network inference on edge devices and servers, and it aims to deliver faster inference with lower resource consumption.

How It Works

Bolt combines a graph optimization engine with thread affinity settings to maximize inference speed. It supports several numerical precisions (FP32, FP16, INT8, and BNN, i.e. binary neural networks) and accepts models from Caffe, ONNX, TFLite, and TensorFlow. The architecture is designed for heterogeneous deployment, so it can take advantage of hardware-specific acceleration features.
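
To make the reduced-precision options more concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in Python. It is a generic illustration of the technique, not Bolt's actual quantization scheme, which this summary does not describe; the function names quantize_int8 and dequantize_int8 are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127].

    Generic illustration only; not taken from Bolt's source code.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # An all-zero (or empty) tensor needs no scaling; use 1.0 to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)   # codes == [50, -127, 0, 100]
restored = dequantize_int8(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (scale / 2). That is the accuracy cost paid for storing one byte per weight instead of four.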

Quick Start & Requirements

Building and installing are handled by the install.sh script, which accepts options for the target platform and precision. For example, ./install.sh --target=android-aarch64 builds for Android on ARMv8. Detailed instructions for building with specific compilers and deploying models are available in the docs directory.
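
For scripted builds, the command line can be assembled programmatically. The sketch below uses only the two options mentioned on this page (--target and --shared) and assumes that --shared is passed to install.sh alongside --target. The helper build_install_command is hypothetical and not part of Bolt; it only constructs the argument list and does not run anything.

```python
from typing import List


def build_install_command(target: str, shared: bool = False) -> List[str]:
    """Assemble an argument list for Bolt's install.sh (hypothetical helper)."""
    if not target:
        raise ValueError("target must be a non-empty string")
    command = ["./install.sh", f"--target={target}"]
    if shared:
        # Assumption: --shared is an install.sh flag that replaces the
        # default static linking with shared library linking.
        command.append("--shared")
    return command


android_build = build_install_command("android-aarch64", shared=True)
print(" ".join(android_build))
# prints: ./install.sh --target=android-aarch64 --shared
```

The resulting list could be passed to subprocess.run(android_build, check=True) from the repository root. Consult the docs directory for the full set of supported targets and options.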

Highlighted Details

  • The project claims a performance improvement of 15% or more over existing open-source acceleration libraries.
  • Supports a wide array of platforms including ARM (v7, v8, v8.2+, v9), x86 (AVX2, AVX512), and various GPUs (Mali, Qualcomm, Intel, AMD).
  • Offers support for NLP tasks, including BERT and TTS, in addition to common CV applications.
  • Includes an on-device training module (beta) for select models like LeNet, MobileNet_v1, and ResNet18.

Maintenance & Community

The project is developed by Huawei Noah's Ark Lab. Community support is available via QQ group: 833345709.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The on-device training feature is in beta and supports only a limited set of models. Bolt links static libraries by default, which may cause issues on some platforms; the --shared option switches to shared library linking.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
