edgeFlow.js by s-zx

Browser ML inference orchestration

Created 2 months ago
351 stars

Top 79.3% on SourcePulse

Project Summary

edgeFlow.js is a browser-based machine learning inference framework designed to simplify the deployment and management of ML models in web applications. It provides production-ready features such as intelligent task scheduling, smart caching, and robust memory management, targeting developers who need to run ML models efficiently on the client side. The framework acts as an orchestration layer, enhancing existing inference engines with capabilities crucial for real-world applications.

How It Works

edgeFlow.js leverages ONNX Runtime with WebGPU and WASM execution providers, offering automatic fallback mechanisms for broad browser compatibility. Its core design includes a priority queue task scheduler for managing concurrent inference requests, automatic memory tracking and cleanup using scopes, and smart model loading strategies like preloading, sharding, and resumable downloads. It also features IndexedDB-based offline caching for models and integrates directly with the Hugging Face Hub for easy model acquisition.
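The WebGPU-to-WASM fallback described above can be sketched as a pure provider-selection policy. The provider names ("webgpu", "wasm") follow ONNX Runtime Web's conventions, but the `BrowserCapabilities` shape and `selectProviders` function here are illustrative, not edgeFlow.js's actual API:

```typescript
// Illustrative sketch of a WebGPU → WASM fallback policy.
type ExecutionProvider = "webgpu" | "wasm";

interface BrowserCapabilities {
  webgpu: boolean; // e.g. derived from !!navigator.gpu in a real browser
}

// Return providers in preference order so the runtime can try each in turn.
function selectProviders(caps: BrowserCapabilities): ExecutionProvider[] {
  const providers: ExecutionProvider[] = [];
  if (caps.webgpu) providers.push("webgpu");
  providers.push("wasm"); // WASM is the universal fallback
  return providers;
}
```

Separating capability detection from provider ordering keeps the fallback logic testable outside a browser.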

Quick Start & Requirements

  • Installation: npm install edgeflowjs, yarn add edgeflowjs, or pnpm add edgeflowjs.
  • Prerequisites: ONNX Runtime is bundled; no additional setup is required.
  • Demo: Clone the repository, run npm install, then npm run demo to start a local server. Access the interactive demo at http://localhost:3000.
  • Documentation: see the project's Documentation, Examples, and API Reference pages.

Highlighted Details

  • Task Scheduling: Priority queue, concurrency control, and task cancellation.
  • Batch Processing: Built-in support for efficient batch inference.
  • Memory Management: Automatic tracking, cleanup, and garbage collection hints.
  • Smart Model Loading: Preloading, sharding, and resumable downloads.
  • Offline Caching: IndexedDB-based model caching for offline use.
  • Multi-Backend Support: ONNX Runtime with WebGPU/WASM execution providers.
  • HuggingFace Hub Integration: Direct model download and usage.
  • Web Worker Support: Enables running inference in background threads.
  • TypeScript First: Provides full type support for intuitive API usage.
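The first of these features, priority-queue scheduling with cancellation, can be sketched in a few lines. The `Task` and `Scheduler` names are hypothetical stand-ins, not edgeFlow.js's real API, and concurrency control is omitted for brevity:

```typescript
// Minimal sketch: a priority-ordered task queue with cancellation.
interface Task<T> {
  id: number;
  priority: number; // higher runs first
  run: () => T;
  cancelled: boolean;
}

class Scheduler {
  private queue: Task<unknown>[] = [];
  private nextId = 0;

  enqueue<T>(priority: number, run: () => T): number {
    const task: Task<T> = { id: this.nextId++, priority, run, cancelled: false };
    this.queue.push(task);
    // Keep highest priority first; a binary heap would be used at scale.
    this.queue.sort((a, b) => b.priority - a.priority);
    return task.id;
  }

  cancel(id: number): void {
    const task = this.queue.find((t) => t.id === id);
    if (task) task.cancelled = true;
  }

  // Drain the queue in priority order, skipping cancelled tasks.
  drain(): unknown[] {
    const results: unknown[] = [];
    for (const task of this.queue) {
      if (!task.cancelled) results.push(task.run());
    }
    this.queue = [];
    return results;
  }
}
```

Marking tasks as cancelled rather than splicing them out keeps `cancel` cheap and lets the drain loop do a single pass.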

Maintenance & Community

The project includes a contributing guide, but specific community channels (like Discord or Slack) are not detailed in the README.

Licensing & Compatibility

The project is released under the MIT License, which is permissive for commercial use and integration into closed-source applications.

Limitations & Caveats

Several ML tasks are marked as "Experimental" and may rely on heuristics or require users to provide their own ONNX models for production-level accuracy. For robust, production-ready inference leveraging Hugging Face's extensive model ecosystem, the README recommends using edgeFlow.js as an orchestration layer on top of the transformers.js library via its adapter backend.
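The "orchestration layer over an engine" arrangement can be sketched with an adapter interface. The `InferenceBackend` interface, `EchoBackend` stand-in, and `Orchestrator` below are hypothetical; in practice the backend role would be played by something like a transformers.js pipeline:

```typescript
// Hedged sketch of an orchestration layer wrapping an engine via an adapter.
interface InferenceBackend {
  infer(input: string): Promise<string>;
}

// Hypothetical stand-in for a real engine adapter (e.g. around transformers.js).
class EchoBackend implements InferenceBackend {
  async infer(input: string): Promise<string> {
    return `echo:${input}`;
  }
}

// The orchestration layer adds cross-cutting concerns (here, a result cache)
// without the backend knowing about them.
class Orchestrator {
  private cache = new Map<string, string>();
  constructor(private backend: InferenceBackend) {}

  async infer(input: string): Promise<string> {
    const hit = this.cache.get(input);
    if (hit !== undefined) return hit;
    const out = await this.backend.infer(input);
    this.cache.set(input, out);
    return out;
  }
}
```

Because the orchestrator only depends on the `InferenceBackend` interface, swapping engines requires no changes to the scheduling or caching code.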

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 327 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Luis Capelo (cofounder of Lightning AI), and 3 more.

LitServe by Lightning-AI
AI inference pipeline framework
0.1% · 4k stars
Created 2 years ago · Updated 2 days ago