analytics-handbook  by devinpleuler

Soccer analytics handbook for getting started in the field

created 5 years ago
1,605 stars

Top 26.7% on sourcepulse

Project Summary

This repository provides a comprehensive handbook for individuals looking to enter the field of soccer analytics. It offers guidance on essential technical skills, historical context, and practical tutorials using Python and open-source soccer data, targeting aspiring analysts and data scientists.

How It Works

The handbook delivers its tutorials as Jupyter notebooks covering data science techniques relevant to soccer analytics. Python is the primary language, supported by the SciPy stack (NumPy, pandas, Matplotlib, scikit-learn) and by soccer-specific libraries such as mplsoccer (pitch plotting) and kloppy (loading and standardizing event and tracking data). The handbook favors readily available, pip-installable packages and open data from providers like StatsBomb and Metrica.
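
As a rough illustration of the kind of event-data manipulation these tutorials build toward, the sketch below summarizes shots per team from a handful of records shaped like StatsBomb open event data. The sample records, team names, and the `shot_summary` helper are invented for this example and do not come from the handbook; only the standard library is used so the snippet runs without any downloads.

```python
import json

# Hypothetical records in the shape of StatsBomb open event data.
# All values (teams, locations, xG) are made up for illustration.
SAMPLE_EVENTS = json.loads("""
[
  {"type": {"name": "Pass"}, "team": {"name": "Home FC"}, "location": [60.0, 40.0]},
  {"type": {"name": "Shot"}, "team": {"name": "Home FC"}, "location": [108.0, 38.0],
   "shot": {"statsbomb_xg": 0.31, "outcome": {"name": "Goal"}}},
  {"type": {"name": "Shot"}, "team": {"name": "Away United"}, "location": [100.0, 50.0],
   "shot": {"statsbomb_xg": 0.07, "outcome": {"name": "Saved"}}},
  {"type": {"name": "Shot"}, "team": {"name": "Home FC"}, "location": [95.0, 30.0],
   "shot": {"statsbomb_xg": 0.04, "outcome": {"name": "Off T"}}}
]
""")


def shot_summary(events):
    """Return {team: {"shots", "goals", "xg"}} for all shot events."""
    summary = {}
    for event in events:
        # Skip everything that is not a shot (passes, carries, ...).
        if event["type"]["name"] != "Shot":
            continue
        team = event["team"]["name"]
        row = summary.setdefault(team, {"shots": 0, "goals": 0, "xg": 0.0})
        row["shots"] += 1
        if event["shot"]["outcome"]["name"] == "Goal":
            row["goals"] += 1
        row["xg"] += event["shot"]["statsbomb_xg"]
    return summary


if __name__ == "__main__":
    for team, row in shot_summary(SAMPLE_EVENTS).items():
        print(team, row)
```

With real data, the same loop would run over the parsed contents of an events JSON file; in practice pandas or kloppy would likely replace the hand-written aggregation.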

Quick Start & Requirements

  • Install: Primarily uses pip-installable Python packages. Code samples are available in Jupyter notebooks, often hosted on Google Colab.
  • Prerequisites: Python 3.7+, Git, and familiarity with command-line basics are recommended.
  • Data: Utilizes StatsBomb Open Data and Metrica open data.
  • Resources: Links to Google Colab for notebooks, official Python documentation, and other relevant resources are provided within the handbook.
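
Before working through the notebooks locally, it can help to confirm that the interpreter and packages listed above are in place. The following is a minimal sketch of such a check, not something the handbook ships; the `check_environment` helper and the `PACKAGES` list are assumptions made for this example. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys

# Import names of the packages this summary mentions (assumed list).
# scikit-learn is installed as "scikit-learn" but imported as "sklearn".
PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn", "mplsoccer", "kloppy"]


def check_environment(min_version=(3, 7), packages=PACKAGES):
    """Return (python_ok, missing) for the running interpreter."""
    # Tuple comparison: (3, 11, ...) >= (3, 7) is True.
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing packages:", missing or "none")
```

Anything reported as missing can be installed with pip, for example `pip install mplsoccer kloppy`. On Google Colab most of the general-purpose packages are typically preinstalled.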

Highlighted Details

  • Focuses on practical application with real-world soccer data.
  • Covers both event data and tracking data analysis.
  • Includes sections on essential complementary skills like SQL, Git, and data visualization libraries (D3.js, Altair, Seaborn).
  • Curates a list of key research papers, blog posts, and talks in soccer analytics.
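
To make the difference between event data and tracking data concrete: tracking data records positions many times per second, so quantities such as player speed come from differencing consecutive frames. The sketch below is one simple way to do that and is not taken from the handbook. It assumes coordinates normalized to the 0 to 1 range and sampled at 25 Hz (as in Metrica's sample data, to the best of my knowledge) and a 105 m by 68 m pitch; the `speeds_m_per_s` helper and the example track are hypothetical.

```python
import math

# Assumptions for this sketch (check them against the data you load):
FRAME_RATE_HZ = 25.0     # frames per second
PITCH_LENGTH_M = 105.0   # pitch length used to rescale normalized x
PITCH_WIDTH_M = 68.0     # pitch width used to rescale normalized y


def speeds_m_per_s(positions, frame_rate=FRAME_RATE_HZ):
    """Speed between consecutive frames for one player.

    positions: list of (x, y) pairs with both coordinates in [0, 1].
    Returns a list with one speed (m/s) per pair of consecutive frames.
    """
    speeds = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        dx = (x1 - x0) * PITCH_LENGTH_M   # metres moved along the pitch
        dy = (y1 - y0) * PITCH_WIDTH_M    # metres moved across the pitch
        # distance per frame times frames per second gives metres per second
        speeds.append(math.hypot(dx, dy) * frame_rate)
    return speeds


# Invented three-frame track for a single player.
track = [(0.500, 0.500), (0.502, 0.500), (0.504, 0.501)]

if __name__ == "__main__":
    print([round(v, 2) for v in speeds_m_per_s(track)])
```

Raw frame-to-frame speeds are noisy in real tracking data, so a smoothing step (for example a moving average) is usually applied before interpreting them.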

Maintenance & Community

The most recent major update, in February 2023, significantly reworked the code samples. The author encourages contributions through pull requests and issues, and the handbook links to external resources such as Twitter threads with project ideas.

Licensing & Compatibility

The repository's license is not stated here; a permissive license seems likely given that it is a handbook and encourages community contributions. Each software library it uses carries its own license, and suitability for commercial use depends on the licenses of the data sources and libraries involved.

Limitations & Caveats

While the handbook aims to be comprehensive, it notes that success in soccer analytics also depends heavily on factors beyond technical skills, such as "good fortune and timing." The project relies on the continued availability of open-source data from providers like StatsBomb and Metrica.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 28 stars in the last 90 days
