lang-seg by isl-org

Semantic segmentation model using language

Created 4 years ago

821 stars

Top 43.2% on SourcePulse

Project Summary

LSeg (Language-driven Semantic Segmentation) segments images against class labels supplied as natural language descriptions. Because the labels are free-form text rather than a fixed set, the model can generalize zero-shot to unseen categories without retraining, which makes it useful to computer vision and NLP researchers and practitioners who need flexible, adaptable segmentation models.

How It Works

LSeg uses a transformer-based image encoder to produce dense, per-pixel embeddings and a text encoder to embed descriptive labels (e.g., "grass"). A contrastive objective aligns each pixel embedding with the text embedding of its semantic class. Because semantically similar labels map to nearby regions of the embedding space, the model can generalize to novel classes at test time by exploiting the semantic relationships captured in the text embeddings.
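The matching step described above can be illustrated with a minimal sketch. This is not LSeg's code: the 3-dimensional vectors are hypothetical stand-ins for the outputs of the image and text encoders, and the function names (normalize, segment) are made up for illustration. It only shows the idea of giving each pixel the label whose text embedding is most similar to the pixel's embedding.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def segment(pixel_embeddings, label_embeddings):
    """Give every pixel the label whose embedding is most similar to it.

    pixel_embeddings: H x W grid of vectors (stand-in for image-encoder output).
    label_embeddings: dict of label -> vector (stand-in for text-encoder output).
    """
    text = {name: normalize(vec) for name, vec in label_embeddings.items()}
    mask = []
    for row in pixel_embeddings:
        mask_row = []
        for pixel in row:
            p = normalize(pixel)
            scores = {
                name: sum(a * b for a, b in zip(p, vec))
                for name, vec in text.items()
            }
            mask_row.append(max(scores, key=scores.get))
        mask.append(mask_row)
    return mask


# Hypothetical toy embeddings; real ones would come from the trained encoders.
pixels = [
    [[0.1, 0.0, 0.9], [0.0, 0.1, 0.8]],
    [[0.9, 0.2, 0.1], [0.2, 0.9, 0.0]],
]
labels = {"grass": [1.0, 0.1, 0.0], "sky": [0.0, 0.2, 1.0]}
mask_two_labels = segment(pixels, labels)

# Adding a label at test time only needs a new text embedding, no retraining.
labels_extended = dict(labels, road=[0.0, 1.0, 0.0])
mask_three_labels = segment(pixels, labels_extended)

print(mask_two_labels)
print(mask_three_labels)
```

In this toy setup the bottom-right pixel falls back to the closest available label when only two labels are offered, and switches once a better-fitting label is added to the set. That is the sense in which the label set can change at test time.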

Quick Start & Requirements

Installation: Run pip install -r requirements.txt, then install the remaining packages individually: PyTorch, PyTorch-Encoding, PyTorch-Lightning, OpenCV, imageio, ftfy, regex, tqdm, CLIP, altair, streamlit, protobuf, timm, tensorboardX, matplotlib, test-tube, and wandb.
Data Preparation: Requires the ADE20K dataset, prepared with python prepare_ade20k.py.
Demo: Download the demo model (checkpoints/demo_e200.ckpt), then run streamlit run lseg_app.py or open lseg_demo.ipynb.
Dependencies: PyTorch (v1.7.1), CLIP, PyTorch-Lightning (v1.3.5), Streamlit.
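Since the project pins old versions of PyTorch and PyTorch-Lightning, it can help to confirm what is actually installed before running the demo. The following is a minimal sketch using only the Python standard library (3.8+). The distribution names "torch" and "pytorch-lightning" are assumptions about how the packages were installed, and check_pins is a hypothetical helper, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pins from the dependency list above. The distribution names are
# assumptions; adjust them if the packages were installed under other names.
PINNED = {"torch": "1.7.1", "pytorch-lightning": "1.3.5"}


def check_pins(pins):
    """Return a dict of package name -> 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        # Treat local build tags such as "1.7.1+cu110" as the base version.
        base = found.split("+")[0]
        report[name] = "ok" if base == wanted else f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINNED).items():
        print(f"{name}: {status}")
```

A "mismatch" result does not necessarily mean the code will fail, only that the environment differs from the versions listed here.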

Highlighted Details

Achieves competitive zero-shot performance on semantic segmentation tasks.
Generalizes to unseen categories without retraining or additional samples.
Matches the accuracy of traditional segmentation methods when a fixed label set is provided.
Provides interactive demo applications via Streamlit.

Maintenance & Community

This project is NOT UNDER ACTIVE MANAGEMENT by Intel. Intel has ceased development, maintenance, bug fixes, and contributions. Users are encouraged to fork the project for ongoing needs.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. However, it acknowledges codebases from DPT, PyTorch-Lightning, CLIP, PyTorch Encoding, Streamlit, and Wandb, which have various open-source licenses. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

The project is no longer maintained by Intel, meaning no future updates, bug fixes, or support are expected. Users requiring ongoing development or maintenance will need to fork the repository.

