FILM by microsoft

LLM for enhanced context utilization

Created 1 year ago
254 stars

Top 99.1% on SourcePulse

Project Summary

This repository provides the official implementation of FILM-7B, a 32K-context Large Language Model designed to overcome the "lost-in-the-middle" problem, in which a model fails to use information located in the middle of a long context. It targets researchers and developers working with long-context LLMs and reports improved performance on both probing and real-world tasks without sacrificing short-context capabilities.

How It Works

FILM-7B is Mistral-7B-Instruct-v0.2 fine-tuned with Information-Intensive (In2) Training. This approach improves the model's ability to process and recall information from anywhere in an extended context. The model achieves near-perfect scores on probing tasks and state-of-the-art performance for its size class on real-world long-context benchmarks.
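
To make the "lost-in-the-middle" idea concrete, the sketch below builds probing contexts in which one key sentence is placed at different relative depths inside filler text. It is an illustration only, not the repository's VaLProbing-32K code; the names `build_probe` and `needle_position`, the filler sentences, and the needle text are all hypothetical.

```python
def build_probe(needle: str, filler: list[str], depth: float) -> str:
    """Insert `needle` among `filler` sentences at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be in [0, 1]")
    idx = round(depth * len(filler))
    sentences = filler[:idx] + [needle] + filler[idx:]
    return " ".join(sentences)


def needle_position(context: str, needle: str) -> float:
    """Relative character offset of the needle (0 = start, 1 = end)."""
    start = context.index(needle)
    span = len(context) - len(needle)
    return start / span if span else 0.0


# Hypothetical filler and needle, for illustration only.
filler = [f"Filler sentence number {i}." for i in range(200)]
needle = "The secret code is 7421."

# One probe per depth; a model would then be asked to recall the needle.
probes = {d: build_probe(needle, filler, d) for d in (0.0, 0.25, 0.5, 0.75, 1.0)}
for d, ctx in probes.items():
    print(f"depth={d:.2f} -> needle at {needle_position(ctx, needle):.2f}")
```

A model affected by the problem would tend to answer correctly for depths near 0 and 1 and fail more often near 0.5; comparing recall across depths is the kind of measurement a probing task makes.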

Quick Start & Requirements

  • Install: Clone the repo, create and activate a conda environment, and install dependencies:
    git clone https://github.com/microsoft/FILM.git
    cd FILM
    conda create -n FILM python=3.10.11
    conda activate FILM
    pip install torch==2.0.1 # cuda11.7 and cudnn8
    pip install -r requirements.txt
    
  • Prerequisites: Python 3.10.11, PyTorch 2.0.1 with CUDA 11.7 and cuDNN 8.
  • Resources: Requires a GPU with CUDA 11.7.
  • Docs: VaLProbing-32K, real_world_long, short_tasks.
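
After installation, a short script can confirm that the environment matches the versions listed above. This is a minimal sketch, not part of the repository; the helper names (`check_python`, `installed_torch_version`, `torch_matches`) are hypothetical, and the check assumes that only the Python major.minor version and the base PyTorch version need to match.

```python
import sys
from importlib import metadata

# Versions taken from the prerequisites listed above.
EXPECTED_PYTHON = (3, 10)
EXPECTED_TORCH = "2.0.1"


def check_python(version_info=sys.version_info) -> bool:
    """True if the interpreter's major.minor version is 3.10."""
    return tuple(version_info[:2]) == EXPECTED_PYTHON


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


def torch_matches(version) -> bool:
    """Compare against 2.0.1, ignoring local build tags such as '+cu117'."""
    return version is not None and version.split("+")[0] == EXPECTED_TORCH


print("Python 3.10:", check_python())
print("torch 2.0.1:", torch_matches(installed_torch_version()))

try:
    import torch
except ImportError:
    torch = None

if torch is not None:
    # torch.version.cuda is None for CPU-only builds.
    print("CUDA build:", torch.version.cuda, "| GPU visible:", torch.cuda.is_available())
```

Run it inside the activated FILM conda environment; any `False` value points to a mismatch with the pinned versions.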

Highlighted Details

  • Achieves near-perfect performance on probing tasks.
  • State-of-the-art results on real-world long-context tasks for ~7B LLMs.
  • Maintains short-context performance.
  • Trained using Information-Intensive (In2) Training.

Maintenance & Community

This project is from Microsoft and accepts contributions via pull requests; contributors must agree to a Contributor License Agreement (CLA). It follows the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

This repository is strictly for research purposes and is not an official Microsoft product or service.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
