brillm05: Brain-inspired LLM with a novel graph-based architecture
Top 86.4% on SourcePulse
BriLLM redefines generative language modeling by moving away from Transformer architectures towards a novel Signal Fully-connected Flowing (SiFu) mechanism. This approach offers full interpretability across all nodes and unbounded contextual capacity, targeting researchers and developers seeking alternatives to conventional LLM paradigms. The primary benefit lies in its brain-inspired design, mimicking cognitive patterns for potentially more efficient and understandable language generation.
How It Works
BriLLM utilizes the SiFu mechanism, a directed graph-based neural network where tokens are represented as graph nodes. Signal tensors propagate through the graph following a "least resistance" principle, with the next token emerging as the target of this signal flow. This design allows for full interpretability, as user-defined entities map directly to specific graph nodes. Unlike traditional models, SiFu propagates signals across nodes rather than attending over a fixed window, enabling arbitrarily long contexts without increasing model size. The architecture employs GeLU-activated neuron layers for nodes and fully connected matrices for edges, enabling bidirectional signaling and prediction via signal energy maximization.
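The idea can be illustrated with a minimal, self-contained sketch: each token is a graph node with its own GeLU-activated layer, each directed edge is a weight matrix, and the next token is the neighbor whose propagated signal has the largest energy. All class and parameter names here are illustrative assumptions, not BriLLM's actual code.

```python
import torch
import torch.nn as nn

class SiFuSketch(nn.Module):
    """Toy SiFu-style graph (names and sizes are hypothetical).

    Each token is a node with a small GeLU-activated neuron layer; each
    directed edge (src -> tgt) is a fully connected matrix. BriLLM itself
    sparsifies edges for low-frequency bigrams; this dense version is only
    meant to show the signal-flow prediction rule.
    """

    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.vocab_size = vocab_size
        # One neuron layer per node (token).
        self.node_layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU()) for _ in range(vocab_size)]
        )
        # One weight matrix per directed edge.
        self.edges = nn.Parameter(torch.randn(vocab_size, vocab_size, dim, dim) * 0.02)

    def next_token(self, current: int, signal: torch.Tensor) -> int:
        """Send the signal from `current` to every neighbor and pick the
        target node whose output signal energy (L2 norm) is maximal."""
        energies = []
        for tgt in range(self.vocab_size):
            out = self.node_layers[tgt](signal @ self.edges[current, tgt])
            energies.append(out.norm())
        return int(torch.stack(energies).argmax())

model = SiFuSketch(vocab_size=8, dim=16)
signal = torch.ones(16)
print(model.next_token(0, signal))  # index of the highest-energy neighbor
```

Note the cost trade-off this makes visible: a dense edge tensor grows quadratically with vocabulary size, which is why the paper's parameter-reduction tricks (sparse co-occurrence, fixed matrices for rare bigrams) matter.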
Quick Start & Requirements
Installation is minimal: `pip install torch`. However, running the model also requires downloading the model checkpoints (.bin files) and vocabulary files (.json) for the Chinese and English versions, the repository's `model` and `Vocab` modules, and the `tokenizers` library (for English). Inference code examples are provided, demonstrating usage with specific model files and vocabulary mappings. Training requires significant GPU resources (e.g., 8 NVIDIA A800 GPUs).
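The loading pattern looks roughly like the sketch below. The file names are placeholders (the real checkpoint and vocabulary files come from the repo's download links), so the sketch creates stand-in files to stay runnable.

```python
import json
import torch

# Hypothetical file names for illustration only; substitute the checkpoint
# (.bin) and vocabulary (.json) files distributed with BriLLM.
CKPT, VOCAB = "model.bin", "vocab.json"

# Create stand-in files so this sketch runs end to end.
torch.save({"edges": torch.zeros(2, 2)}, CKPT)
with open(VOCAB, "w") as f:
    json.dump({"hello": 0, "world": 1}, f)

# Load the checkpoint weights and the token -> node-id mapping.
state = torch.load(CKPT, map_location="cpu")
with open(VOCAB) as f:
    vocab = json.load(f)

print(sorted(vocab, key=vocab.get))  # tokens ordered by node id
```

The actual inference entry points live in the repo's `model` and `Vocab` modules; consult the README's inference examples for the exact calls.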
Maintenance & Community
The repository provides a GitHub link for community interaction. No specific details regarding active maintenance, contributors, sponsorships, or dedicated community channels (like Discord or Slack) are present in the provided README.
Licensing & Compatibility
No license information is specified in the provided README content. This absence may pose a barrier for users evaluating commercial or closed-source integration.
Limitations & Caveats
The initial model size is stated as approximately 16 billion parameters before optimization, indicating a substantial resource requirement. The parameter reduction relies on sparse token co-occurrence and fixed matrices for low-frequency bigrams, which could potentially limit the model's ability to capture nuanced or rare linguistic patterns. The provided installation instructions are minimal, and successful setup depends on acquiring and correctly configuring several external model and vocabulary files.