openfda  by FDA

Open FDA data APIs and pipelines

Created 11 years ago
632 stars

Top 52.4% on SourcePulse

Project Summary

The openFDA project provides open APIs, data downloads, and a developer community for FDA public datasets, including drugs, foods, and medical devices. It aims to make this data accessible for research and development, empowering users to build applications and gain insights into FDA-regulated products.

How It Works

The project uses a Python pipeline built on Luigi to convert raw FDA data into JSON documents suitable for Elasticsearch. An Elasticsearch cluster stores the processed documents, and a Node.js API server, built with Express and Elasticsearch.js, serves them through a documented JSON interface (api.fda.gov). Keeping ingestion, storage, and serving in separate components lets large public health datasets be loaded and queried efficiently.
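To make the "raw data to Elasticsearch-ready JSON" step concrete, here is a minimal sketch in plain Python. It is not the project's Luigi code: the record fields, the index name, and the helper `to_bulk_lines` are hypothetical and chosen only for illustration. It shows the newline-delimited layout the Elasticsearch bulk API expects, where each document is preceded by an action line.

```python
import json
from typing import Iterable, Iterator


def to_bulk_lines(records: Iterable[dict], index: str, id_field: str) -> Iterator[str]:
    """Yield Elasticsearch bulk-API lines: one action line, then the document itself."""
    for record in records:
        action = {"index": {"_index": index, "_id": str(record[id_field])}}
        yield json.dumps(action, sort_keys=True)
        yield json.dumps(record, sort_keys=True)


# Hypothetical rows standing in for parsed FDA source data (assumption, not real fields).
raw_rows = [
    {"report_id": 1, "product": "example product a"},
    {"report_id": 2, "product": "example product b"},
]

# A bulk request body is newline-delimited JSON and must end with a newline.
bulk_body = "\n".join(
    to_bulk_lines(raw_rows, index="example_index", id_field="report_id")
) + "\n"
print(bulk_body)
```

A body like this could be posted to a cluster's `_bulk` endpoint. The actual pipelines may batch, transform, and index documents differently.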

Quick Start & Requirements

  • Installation: Run bootstrap.sh to set up the Python virtualenv and install the Node.js packages. Docker is the recommended route: run docker-compose up.
  • Prerequisites: Elasticsearch 7, Python 3.6+, Node.js 14+. Linux users may need to raise the vm.max_map_count kernel setting to 262144 for Elasticsearch. Windows users should clone with git clone ... --config core.autocrlf=input.
  • Resources: The Docker setup includes Elasticsearch, API, and Python containers. The API becomes available only after the pipelines complete; check http://localhost:8000/status.
  • Documentation: https://open.fda.gov
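Because the local API only comes up once the pipelines finish, a small polling loop can save repeated manual checks. The sketch below assumes that an HTTP 200 from the status URL listed above means the API is reachable; the format of the status response is not described here, so the code does not parse it. The helper names `http_ok` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.request
from typing import Callable

STATUS_URL = "http://localhost:8000/status"  # from the Quick Start notes


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on connection or HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # urllib's URLError and HTTPError are OSError subclasses
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Demonstration with a fake probe that succeeds on the third call (no network needed).
answers = iter([False, False, True])
print(wait_until_ready(lambda: next(answers), attempts=5, delay=0.0))
```

Against a running Docker setup, the call would be `wait_until_ready(lambda: http_ok(STATUS_URL))`. The default of 30 attempts at 10-second intervals is an arbitrary choice and may be too short for long pipeline runs.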

Highlighted Details

  • Powers the official api.fda.gov endpoints.
  • Includes pipelines for drugs, foods, and medical devices.
  • Provides Elasticsearch schemas for data sets.
  • Supports querying via standard openFDA syntax.
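As a rough illustration of the query syntax, the sketch below builds request URLs using the `search`, `count`, `limit`, and `skip` parameters from the public openFDA documentation. The helper `build_query_url` is hypothetical, and the endpoint paths and field names in the examples are assumptions drawn from that documentation rather than from this summary. The code only constructs URLs and does not call the API.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.fda.gov"


def build_query_url(
    endpoint: str,
    search: str = "",
    count: str = "",
    limit: int = 0,
    skip: int = 0,
    root: str = API_ROOT,
) -> str:
    """Build an openFDA-style URL such as <root>/drug/event.json?search=...&limit=..."""
    params = {}
    if search:
        params["search"] = search  # field:term expression
    if count:
        params["count"] = count    # field whose values should be tallied
    if limit:
        params["limit"] = limit    # maximum number of records to return
    if skip:
        params["skip"] = skip      # number of records to skip (paging)
    query = urlencode(params)
    return f"{root}/{endpoint}.json" + (f"?{query}" if query else "")


# Example field name is an assumption based on the public drug adverse event docs.
print(build_query_url("drug/event", search="patient.reaction.reactionmeddrapt:fatigue", limit=5))
# Same builder pointed at a local Docker setup (the port comes from the Quick Start notes).
print(build_query_url("device/event", count="date_received", root="http://localhost:8000"))
```

Passing `root` explicitly makes it easy to switch between the hosted API and a local instance, assuming the local server exposes the same paths.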

Maintenance & Community

The project is an FDA initiative. Community interest may drive the addition of more data pipelines.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project explicitly states: "Do not rely on openFDA to make decisions regarding medical care." Only a subset of the pipelines (NSDE, CAERS, Substance Data, Device Clearance, Device PMA, Device Event) is included in the Docker setup; the others are left out because of their complexity and network access requirements.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 30 days
