jepa-intuitive-physics  by facebookresearch

Self-supervised pretraining for intuitive physics understanding

Created 1 year ago
256 stars

Top 98.6% on SourcePulse

Project Summary

This repository provides the code and data to reproduce the findings of the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos." It lets researchers and engineers evaluate, on intuitive physics tasks, video models that were pretrained with self-supervision on natural videos, supporting reproducibility and further research in video understanding and AI reasoning.

How It Works

The project builds on the JEPA (Joint Embedding Predictive Architecture) framework for self-supervised pretraining. It evaluates pretrained models by extracting "surprise" metrics from their outputs on evaluation videos (for example, the IntPhys benchmark), and uses those metrics to quantify intuitive physics understanding. The evaluation code supports V-JEPA and VideoMAEv2 models and processes the raw surprise outputs into performance metrics for analysis and figure generation.
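To make the step from raw surprise to a performance metric concrete, here is a minimal sketch, not the repository's actual code. It assumes each test item is a pair of videos (one physically possible, one impossible), that the model yields one scalar surprise value per video, and that a pair counts as correct when the impossible video is the more surprising one. The function name pairwise_accuracy and the sample values are hypothetical.

```python
from typing import Sequence


def pairwise_accuracy(possible: Sequence[float], impossible: Sequence[float]) -> float:
    """Fraction of pairs where the impossible video is more surprising.

    ASSUMPTION: possible[k] and impossible[k] are scalar surprise values
    for the two videos of the k-th matched pair. This pairing scheme is an
    illustration, not taken from the repository.
    """
    if len(possible) != len(impossible):
        raise ValueError("each possible video needs a matching impossible video")
    if len(possible) == 0:
        raise ValueError("at least one pair is required")
    # A pair is classified correctly when surprise is strictly higher
    # for the physically impossible video.
    correct = sum(1 for p, i in zip(possible, impossible) if i > p)
    return correct / len(possible)


if __name__ == "__main__":
    # Hypothetical surprise values for three video pairs.
    possible = [0.2, 0.5, 0.4]
    impossible = [0.6, 0.3, 0.9]
    print(f"pairwise accuracy: {pairwise_accuracy(possible, impossible):.3f}")
```

With these made-up values, two of the three pairs are ranked correctly, so the score is 2/3. A score near 0.5 would mean the model does no better than chance at telling possible from impossible events.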

Quick Start & Requirements

  • Installation: Decompress data_intphys.tar.gz. Install dependencies via requirements.txt.
  • Configuration: Adapt evaluation scripts (utils.py) to specify cluster environment and dataset paths.
  • Execution: Run local evaluations using python -m evals.main or distributed runs with submitit via python -m evals.main_distributed. Configuration files (.yaml) specify model checkpoints and evaluation tasks.
  • Prerequisites: Python environment, GPU(s) (multiple recommended for distributed runs), decompressed data.
  • Links: Base JEPA code: github.com/facebookresearch/jepa
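The setup steps above can be sketched in Python as follows. This is a hedged illustration rather than the project's documented workflow: the helper names extract_archive and build_eval_command are hypothetical, and the --fname flag for passing the .yaml configuration is an assumption carried over from the base JEPA code, so check the repository README before relying on it. The command is only built, not executed.

```python
import sys
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Decompress a .tar.gz archive into dest and return its member names.

    Only use this on archives you trust, such as the one shipped with the repo.
    """
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


def build_eval_command(config: Path, distributed: bool = False) -> list[str]:
    """Build (but do not run) the evaluation command line."""
    module = "evals.main_distributed" if distributed else "evals.main"
    # ASSUMPTION: --fname is how the config file is passed, as in the
    # base JEPA code. Verify against the repository README.
    return [sys.executable, "-m", module, "--fname", str(config)]


if __name__ == "__main__":
    archive = Path("data_intphys.tar.gz")
    if archive.exists():
        members = extract_archive(archive, Path("."))
        print(f"extracted {len(members)} entries")
    # "my_eval.yaml" is a placeholder config name, not a file from the repo.
    print(" ".join(build_eval_command(Path("my_eval.yaml"))))
```

To actually launch a run, pass the returned list to subprocess.run from the repository root after installing the dependencies listed in requirements.txt.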

Highlighted Details

  • Enables full reproduction of paper results, including all models and figures.
  • Provides raw surprise metrics and processed performance data (.pth, .csv).
  • Includes evaluation code compatible with V-JEPA and VideoMAEv2.
  • Facilitates submission to the official IntPhys leaderboard via intphys_test evaluation.

Maintenance & Community

Authored by researchers from Meta AI (formerly Facebook AI Research), including Yann LeCun. No specific community channels (Discord, Slack) or roadmap links are provided in the README.

Licensing & Compatibility

  • License: CC-BY-NC (Creative Commons Attribution-NonCommercial).
  • Compatibility: Only non-commercial use is permitted. Commercial applications, including integration into closed-source proprietary systems, are prohibited.

Limitations & Caveats

The setup requires manually adapting the cluster configuration and dataset paths in the evaluation scripts. The non-commercial license rules out use in commercial or for-profit projects. The repository focuses on reproducing specific research findings rather than providing a general-purpose library.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
