openfda by FDA

Open FDA data APIs and pipelines

Created 11 years ago

649 stars

Top 51.4% on SourcePulse

2 Experts Love This Project

Jeff Hammerbacher

Cofounder of Cloudera

Luis Capelo

Cofounder of Lightning AI

Project Summary

The openFDA project provides open APIs, data downloads, and a developer community for FDA public datasets covering drugs, foods, and medical devices. Its goal is to make this data accessible for research and development, so that users can build applications and analyze FDA-regulated products.

How It Works

The project uses a Python-based Luigi pipeline to convert raw FDA data into JSON documents suitable for Elasticsearch. An Elasticsearch cluster stores the processed data, and a Node.js API server, built with Express and Elasticsearch.js, serves it through a documented JSON interface (api.fda.gov). Together, these three components cover ingestion, storage, and querying of large public health datasets.
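To make the "raw data to Elasticsearch-ready JSON" step concrete, here is a minimal sketch using only the Python standard library. It is not the project's Luigi code: the record fields (report_number, product, outcome), the index name, and the helper names are all hypothetical. The one thing taken as given is the Elasticsearch bulk format, in which each document is preceded by an action line and the body is newline-delimited.

```python
import json


def to_bulk_lines(records, index_name, id_field):
    """Yield JSON lines in Elasticsearch bulk format (action line, then document)."""
    for record in records:
        # Drop empty values so the indexed document stays compact.
        doc = {k: v for k, v in record.items() if v not in ("", None)}
        action = {"index": {"_index": index_name, "_id": str(doc[id_field])}}
        yield json.dumps(action, sort_keys=True)
        yield json.dumps(doc, sort_keys=True)


def build_bulk_body(records, index_name, id_field):
    """Join the bulk lines; a non-empty bulk body must end with a newline."""
    lines = list(to_bulk_lines(records, index_name, id_field))
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical raw records, standing in for parsed FDA source files.
raw = [
    {"report_number": "1001", "product": "Example Tablet", "outcome": ""},
    {"report_number": "1002", "product": "Example Syrup", "outcome": "recovered"},
]
body = build_bulk_body(raw, "example_index", "report_number")
print(body, end="")
```

In a real pipeline a task framework such as Luigi would wrap steps like these so that each stage reruns only when its inputs change.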

Quick Start & Requirements

Installation: Run bootstrap.sh to set up the Python virtualenv and install the Node.js packages. The recommended route is Docker: run docker-compose up.
Prerequisites: Elasticsearch 7, Python 3.6+, Node.js 14+. Linux users may need to raise the vm.max_map_count kernel setting to 262144 for Elasticsearch. Windows users should clone with git clone ... --config core.autocrlf=input.
Resources: The Docker setup includes Elasticsearch, API, and Python containers. The API becomes available only after the pipelines complete; check http://localhost:8000/status.
Documentation: https://open.fda.gov
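Because the API only comes up after the pipelines finish, it can help to poll the status URL rather than check it by hand. The sketch below assumes only that the endpoint answers with HTTP 200 once ready; the response body is not inspected, since its format is not described here. The function names, retry count, and delay are illustrative choices, not project settings.

```python
import time
import urllib.error
import urllib.request

STATUS_URL = "http://localhost:8000/status"


def http_ok(url, timeout=5.0):
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: not ready yet.
        return False


def wait_for_api(url=STATUS_URL, attempts=30, delay=10.0,
                 probe=http_ok, sleep=time.sleep):
    """Poll url until probe succeeds; return the attempt number, or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

Calling wait_for_api() after docker-compose up returns the attempt on which the API first responded, or None if it never did within the retry budget. The probe and sleep parameters exist so the loop can be exercised without a running stack.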

Highlighted Details

Powers the official api.fda.gov endpoints.
Includes pipelines for drugs, foods, and medical devices.
Provides Elasticsearch schemas for data sets.
Supports querying via standard openFDA syntax.
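As an illustration of the query syntax, the sketch below builds request URLs from the search, count, and limit parameters described in the openFDA documentation. The helper name build_query_url is hypothetical, and drug/event with the field patient.reaction.reactionmeddrapt is used only as an example endpoint and field. No request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.fda.gov"


def build_query_url(endpoint, search=None, count=None, limit=None, base_url=BASE_URL):
    """Build an openFDA-style query URL for an endpoint such as 'drug/event'."""
    params = {}
    if search is not None:
        params["search"] = search  # field:term, combined with AND / OR
    if count is not None:
        params["count"] = count    # field to aggregate on
    if limit is not None:
        params["limit"] = limit    # maximum number of records returned
    url = f"{base_url}/{endpoint}.json"
    # Keep ':' readable; spaces become '+', which matches the +AND+ style.
    return f"{url}?{urlencode(params, safe=':')}" if params else url


print(build_query_url(
    "drug/event",
    search="patient.reaction.reactionmeddrapt:fatigue",
    limit=5,
))
print(build_query_url("drug/event", count="patient.reaction.reactionmeddrapt.exact"))
```

Pointing base_url at http://localhost:8000 should target a locally running API server instead, assuming it exposes the same paths.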

Maintenance & Community

The project is an FDA initiative. Community interest may drive the addition of more data pipelines.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project explicitly states: "Do not rely on openFDA to make decisions regarding medical care." The Docker setup includes only a subset of the pipelines (NSDE, CAERS, Substance Data, Device Clearance, Device PMA, Device Event); the rest are left out because of their complexity and network access requirements.

Health Check

Last Commit: 1 week ago
Responsiveness: Inactive

Star History

6 stars in the last 30 days