KnowLM by zjunlp

LLM framework for knowledge integration and utilization

created 2 years ago
1,332 stars

Top 30.8% on sourcepulse

Project Summary

KnowLM is a framework for building and deploying knowledge-enhanced Large Language Models (LLMs), aimed at researchers and developers who want to strengthen LLMs with external knowledge. It covers the full pipeline: data processing, pre-training, fine-tuning, augmentation, and inference. It also provides a model zoo of pre-trained models and datasets that can be used immediately.

How It Works

KnowLM integrates three core technical features: Knowledge Prompting, which makes use of structured data; Knowledge Editing, which corrects factual inaccuracies in a model; and Knowledge Interaction, which covers tool-based learning and multi-agent collaboration. These are implemented through modular components such as EasyInstruct, EasyDetect, and EasyEdit, which extend the models beyond standard text generation.
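
As a rough illustration of what prompting with structured data can look like, the sketch below serializes (subject, relation, object) triples into a text prompt. It is a generic example, not KnowLM's actual prompt format: the function name `build_knowledge_prompt`, the triple layout, and the template wording are all assumptions made for illustration.

```python
from typing import List, Tuple

# A knowledge triple: (subject, relation, object). Hypothetical layout.
Triple = Tuple[str, str, str]


def build_knowledge_prompt(question: str, triples: List[Triple]) -> str:
    """Serialize structured facts into a plain-text prompt.

    The template below is illustrative only; it is not taken from KnowLM.
    """
    facts = "\n".join(f"- {s} | {r} | {o}" for s, r, o in triples)
    return (
        "Use the facts below to answer the question.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_knowledge_prompt(
    "Where is Zhejiang University located?",
    [("Zhejiang University", "located in", "Hangzhou")],
)
print(prompt)
```

The resulting string would then be passed to whichever model is in use; how KnowLM's own components format such inputs should be checked against their documentation.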

Quick Start & Requirements

  • Installation: Clone the repository with git clone and run pip install -r requirements.txt, or pull the Docker image (docker pull zjunlp/knowlm:v.1).
  • Prerequisites: Python 3.9, PyTorch 1.13.1+cu116. GPU acceleration is highly recommended for training and inference.
  • Resources: Multi-GPU setup is supported for larger models.
  • Links: Getting Started, EasyInstruct, EasyEdit.
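
Before installing, it can help to confirm that the local environment matches the listed prerequisites. The sketch below compares the running Python and PyTorch versions against the ones named above. The helper names are hypothetical, and the exact-match policy is an assumption: the summary does not say whether other versions also work.

```python
import sys

# Versions taken from the Prerequisites bullet.
REQUIRED_PYTHON = (3, 9)
REQUIRED_TORCH = "1.13.1+cu116"


def check_environment(python_version, torch_version):
    """Return a list of mismatch messages (empty if everything matches).

    Assumption: versions must match exactly; relax this if other
    versions are known to work.
    """
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        found = f"{python_version[0]}.{python_version[1]}"
        problems.append(f"Python {found} found, 3.9 expected")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif str(torch_version) != REQUIRED_TORCH:
        problems.append(f"PyTorch {torch_version} found, {REQUIRED_TORCH} expected")
    return problems


def installed_torch_version():
    """Return the installed PyTorch version string, or None if absent."""
    try:
        import torch
    except ImportError:
        return None
    return str(torch.__version__)


print(check_environment(sys.version_info, installed_torch_version()))
```

An empty list means the environment matches; otherwise each entry names one mismatch.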

Highlighted Details

  • Offers pre-trained models like KnowLM-13B-Base, KnowLM-13B-ZhiXi, and OneKE based on LLaMA architectures.
  • Provides extensive datasets for instruction tuning and information extraction (e.g., IEPile, InstructIE).
  • Integrates vLLM for accelerated inference and API serving.
  • Supports quantization via llama.cpp for resource-constrained environments.
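
To see why quantization matters for resource-constrained environments, the following back-of-the-envelope sketch estimates how much memory the weights of a 13B-parameter model occupy at different precisions. It is a simplification under stated assumptions: it counts weights only (no activations or KV cache), uses the nominal 13e9 parameter count implied by the model names, and uses nominal bit widths, whereas real llama.cpp quantization formats store extra scaling data and so use somewhat more bits per weight.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 13e9  # nominal size implied by the "13B" model names

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take roughly 26 GB at fp16 and 6.5 GB at 4 bits, which is the difference between needing a data-center GPU and fitting on a single consumer card.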

Maintenance & Community

The project is actively developed, with regular updates to model weights and features. Key contributors are listed, and community support is available through GitHub issues.

Licensing & Compatibility

The project's licensing is not explicitly stated in the README, but it relies on LLaMA, which has its own usage restrictions. Compatibility for commercial use or closed-source linking would require careful review of LLaMA's license and any specific terms for KnowLM's derived models.

Limitations & Caveats

The project is still under development, so models and features may change as optimization continues. Instruction tuning currently uses LoRA rather than full fine-tuning, and multi-turn conversations are not yet supported. Despite efforts to ensure harmlessness, the models may still produce toxic output. Pre-training is not exhaustive.
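
The practical difference between LoRA and full tuning can be shown with a parameter count. LoRA freezes a weight matrix of shape d_out × d_in and trains two low-rank factors of shapes d_out × r and r × d_in, so it trains r · (d_in + d_out) values instead of d_out · d_in. The sketch below works this out for one square projection matrix. The hidden size 5120 is that of LLaMA-13B; the rank 8 is a hypothetical value chosen for illustration, since KnowLM's actual LoRA configuration is not stated here.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable values when the whole weight matrix is tuned."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable values for the two LoRA factors (d_out x r and r x d_in)."""
    return rank * (d_in + d_out)


HIDDEN = 5120  # LLaMA-13B hidden size
RANK = 8       # hypothetical LoRA rank, for illustration only

full = full_params(HIDDEN, HIDDEN)
lora = lora_params(HIDDEN, HIDDEN, RANK)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

For this single matrix, LoRA trains 81,920 values against 26,214,400 for full tuning, about 0.31%. This is why LoRA is cheaper to run, and also why it may not match the quality of full fine-tuning.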

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 28 stars in the last 90 days
