HuggingFaceModelDownloader by bodaay

CLI tool for downloading Hugging Face models/datasets

created 2 years ago
706 stars

Top 49.4% on sourcepulse

Project Summary

This Go utility is a fast, multithreaded alternative to git lfs for downloading models and datasets from Hugging Face. It targets developers and researchers who need efficient access to large AI models, and it offers SHA256 checksum verification, download resumption, and file filtering. The filtering is particularly useful for fetching specific quantized formats such as GGML.

How It Works

The tool uses Go's concurrency primitives to download large LFS files over multiple connections at once, which speeds up transfers compared with a single-connection download. After each download it verifies the file's SHA256 checksum to ensure data integrity. It resumes interrupted downloads by skipping files that have already been downloaded. Users can also filter LFS files by pattern, which helps when only specific quantized variants of a model are needed (e.g., GGML q4_0, q5_0).
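
The two mechanisms described above, splitting a large file across several connections and checking a SHA256 digest afterwards, can be illustrated with a minimal Go sketch. This is not the tool's actual code: the names byteRange, splitRanges, and sha256Hex are hypothetical, and the even split into one byte range per connection is an assumption about how such a downloader might divide the work.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// byteRange is an inclusive [Start, End] span, the form used by an
// HTTP "Range: bytes=Start-End" header. (Hypothetical type.)
type byteRange struct{ Start, End int64 }

// splitRanges divides a file of the given size into at most n contiguous
// ranges, one per connection. Assumed strategy, not the tool's own.
func splitRanges(size int64, n int) []byteRange {
	if size <= 0 || n <= 0 {
		return nil
	}
	if int64(n) > size {
		n = int(size) // never create empty ranges
	}
	chunk := size / int64(n)
	ranges := make([]byteRange, 0, n)
	for i := 0; i < n; i++ {
		start := int64(i) * chunk
		end := start + chunk - 1
		if i == n-1 {
			end = size - 1 // the last range absorbs the remainder
		}
		ranges = append(ranges, byteRange{Start: start, End: end})
	}
	return ranges
}

// sha256Hex streams r through SHA-256 and returns the lowercase hex digest,
// so a large file never has to be held in memory.
func sha256Hex(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	// A 1000-byte file split across 5 connections (5 is the documented default).
	for _, r := range splitRanges(1000, 5) {
		fmt.Printf("Range: bytes=%d-%d\n", r.Start, r.End)
	}

	// In a real downloader the reader would be the file on disk and the
	// expected digest would come from the repository's LFS metadata.
	sum, err := sha256Hex(strings.NewReader("abc"))
	if err != nil {
		panic(err)
	}
	fmt.Println(sum)
}
```

Each range could be fetched by its own goroutine and written at its offset in the target file. The digest check then runs once over the assembled file and is compared with the expected value.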

Quick Start & Requirements

  • Install: bash <(curl -sSL https://g.bodaay.io/hfd) (installs to current directory) or bash <(curl -sSL https://g.bodaay.io/hfd) -i (installs to OS bin folder).
  • Prerequisites: None explicitly mentioned beyond a compatible OS (Linux, macOS, or Windows via WSL2).
  • Usage: hfdownloader -m <model_name> or hfdownloader -d <dataset_name>.
  • Docs: https://g.bodaay.io/hfd

Highlighted Details

  • Multithreaded LFS downloads with configurable concurrent connections (default 5).
  • SHA256 checksum verification for downloaded files.
  • Filter LFS files by name, useful for downloading specific model quantizations (e.g., GGML variants).
  • Supports resuming interrupted downloads and skipping existing files.
  • Accepts Hugging Face access tokens via environment variable (HF_TOKEN), .env file, or command-line flag.
  • Configuration file support (~/.config/hfdownloader.json) for default settings.
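
To make the name-based filtering concrete, here is a minimal Go sketch of how a downloader might keep only the files whose names contain a requested quantization tag. The summary does not describe the tool's real matching rules, so the case-insensitive substring match is an assumption, and keepFile, selectFiles, and the file names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// keepFile reports whether name contains any of the filters, ignoring case.
// An empty filter list keeps every file. (Assumed matching rule.)
func keepFile(name string, filters []string) bool {
	if len(filters) == 0 {
		return true
	}
	lower := strings.ToLower(name)
	for _, f := range filters {
		if f != "" && strings.Contains(lower, strings.ToLower(f)) {
			return true
		}
	}
	return false
}

// selectFiles returns the names that pass keepFile, preserving input order.
func selectFiles(names, filters []string) []string {
	var out []string
	for _, n := range names {
		if keepFile(n, filters) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Hypothetical repository listing with three quantized variants.
	files := []string{
		"model.ggmlv3.q4_0.bin",
		"model.ggmlv3.q5_0.bin",
		"model.ggmlv3.q8_0.bin",
	}
	fmt.Println(selectFiles(files, []string{"q4_0", "q5_0"}))
}
```

Filtering before any transfer starts means the variants that were not requested are never downloaded, which matters when each one is several gigabytes.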

Maintenance & Community

The project is maintained by bodaay. No specific community channels (Discord/Slack) or roadmap are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. This requires clarification for commercial use or integration into closed-source projects.

Limitations & Caveats

Because no license is specified, commercial adoption may carry risk. The README focuses on Linux, macOS, and WSL2; native Windows support is not detailed.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 43 stars in the last 90 days
