FedProx by litian96

Research paper code for federated learning in heterogeneous networks

created 6 years ago
684 stars

Top 50.6% on sourcepulse

View on GitHub
Project Summary

This repository provides the implementation of FedProx, a federated learning optimization framework designed to address systems and statistical heterogeneity in distributed networks. It targets federated learning researchers and practitioners, offering more robust convergence than FedAvg in statistically heterogeneous environments, with the paper reporting an average improvement of 22% in absolute test accuracy on highly heterogeneous data.

How It Works

FedProx introduces a proximal term into each client's local objective. This term penalizes deviation from the global model, regularizing local training and mitigating the divergence caused by non-identically distributed data across clients. The same term also makes it safe to incorporate variable amounts of local work from straggling devices, improving convergence stability and accuracy in heterogeneous settings.
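
Concretely, at round t each selected client k approximately minimizes h_k(w; w^t) = F_k(w) + (mu/2) * ||w - w^t||^2 rather than its raw local loss F_k(w), where w^t is the current global model and mu controls how strongly local training is anchored to it. The repository implements this in TensorFlow; the snippet below is only a minimal NumPy sketch of the resulting local solver, with local_update, grad_Fk, and the toy quadratic loss all invented for illustration.

    import numpy as np

    def local_update(w_global, grad_Fk, mu=1.0, lr=0.01, num_steps=20):
        """Gradient descent on the FedProx local objective
        F_k(w) + (mu / 2) * ||w - w_global||^2. The proximal term
        pulls the local iterate back toward the global model,
        limiting client drift under non-IID data."""
        w = w_global.copy()
        for _ in range(num_steps):
            # Gradient of the proximal objective: grad F_k(w) + mu * (w - w_global)
            w = w - lr * (grad_Fk(w) + mu * (w - w_global))
        return w

    # Toy usage: quadratic local loss F_k(w) = 0.5 * ||w - c||^2
    c = np.array([1.0, -2.0])
    w_new = local_update(np.zeros(2), lambda w: w - c, mu=1.0)

Setting mu = 0 recovers FedAvg's local solver, which is why the two methods can share most of a training pipeline.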

Quick Start & Requirements

  • Install dependencies: pip3 install -r requirements.txt
  • Run synthetic data experiments (CPU-compatible; a sketch of the round loop these scripts drive follows this list):
    • export CUDA_VISIBLE_DEVICES=
    • bash run_fedprox.sh synthetic_iid 0 1 | tee log_synthetic/synthetic_iid_client10_epoch20_mu1
  • For real datasets, specify GPU (export CUDA_VISIBLE_DEVICES=available_gpu_id) and modify run_fedavg.sh/run_fedprox.sh with dataset-specific models and hyperparameters.
  • Official documentation and paper: Federated Optimization in Heterogeneous Networks (MLSys '20)
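
What these scripts orchestrate is a standard synchronous federated loop: sample a subset of clients, run the proximal local solver on each, and average the returned models. Below is a toy NumPy sketch of one round; fedprox_round is a hypothetical name, and the uniform averaging is a simplification (weighting clients by sample count is a common alternative, and the repository's actual trainer may differ).

    import numpy as np

    def fedprox_round(w_global, client_grads, mu=1.0, lr=0.01,
                      num_selected=10, local_steps=20):
        """One synchronous FedProx round: sample clients, run the
        proximal local solver on each, then average the returned
        models uniformly."""
        rng = np.random.default_rng()
        picked = rng.choice(len(client_grads), size=num_selected, replace=False)
        updates = []
        for k in picked:
            w = w_global.copy()
            for _ in range(local_steps):
                # Proximal gradient step, as in the local_update sketch above
                w -= lr * (client_grads[k](w) + mu * (w - w_global))
            updates.append(w)
        return np.mean(updates, axis=0)

    # Toy usage: two clients pulling toward different optima c_k
    centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    client_grads = [(lambda w, c=c: w - c) for c in centers]
    w = np.zeros(2)
    for _ in range(5):  # communication rounds
        w = fedprox_round(w, client_grads, num_selected=2)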

Highlighted Details

  • Implements FedProx, a proximal term added to local client objectives for improved convergence.
  • Provides code and experiments for empirical evaluation on synthetic and real-world federated datasets (MNIST, FEMNIST, Shakespeare, Sent140).
  • Includes scripts to reproduce paper results and generate figures for loss, accuracy, and dissimilarity.
  • Offers guidance on hyperparameter tuning, particularly the mu parameter, which must be adjusted per dataset and degree of heterogeneity (see the sweep sketch after this list).
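
As an illustration of that tuning, the toy below sweeps mu by reusing fedprox_round from the previous sketch; the candidate grid mirrors the {0.001, 0.01, 0.1, 1} range explored in the paper. In this homogeneous toy, larger mu mainly slows convergence, whereas on genuinely heterogeneous data a well-chosen positive mu is what stabilizes training.

    import numpy as np

    # Sweep mu, reusing fedprox_round from the sketch above.
    centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    client_grads = [(lambda w, c=c: w - c) for c in centers]
    w_star = np.mean(centers, axis=0)  # global optimum of this toy problem

    for mu in [0.0, 0.001, 0.01, 0.1, 1.0]:  # mu = 0 recovers FedAvg's local solver
        w = np.zeros(2)
        for _ in range(20):  # communication rounds
            w = fedprox_round(w, client_grads, mu=mu, num_selected=2)
        print(f"mu={mu:<6} distance to optimum: {np.linalg.norm(w - w_star):.4f}")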

Maintenance & Community

The project accompanies the MLSys '20 paper. The README does not detail ongoing maintenance status or community engagement.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify compatibility for commercial or closed-source use.

Limitations & Caveats

Running experiments on real-world federated datasets can be time-consuming due to dataset size and model complexity. Hyperparameter tuning, especially for the mu parameter, is critical and dataset-dependent.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 11 stars in the last 90 days
