Soccer analytics handbook for getting started in the field
This repository provides a comprehensive handbook for individuals looking to enter the field of soccer analytics. It offers guidance on essential technical skills, historical context, and practical tutorials using Python and open-source soccer data, targeting aspiring analysts and data scientists.
How It Works
The handbook leverages Jupyter notebooks to deliver tutorials on data science techniques relevant to soccer analytics. It emphasizes Python as the primary programming language, utilizing the SciPy stack (NumPy, Pandas, Matplotlib, scikit-learn) and specialized libraries like mplsoccer and kloppy for data manipulation and visualization. The approach prioritizes using readily available, pip-installable packages and open-source data from providers like StatsBomb and Metrica.
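As a rough illustration of the kind of event-data manipulation such tutorials cover, the sketch below filters shots out of a small event table and summarizes them per player using NumPy and pandas. It is not taken from the handbook: the rows, the column names (`player`, `type`, `x`, `y`, `is_goal`), and the player labels are all made up for the example, and the coordinates assume StatsBomb's 120 x 80 pitch with the attacked goal centred at (120, 40).

```python
import numpy as np
import pandas as pd

# Hypothetical, hand-written event records (NOT real match data).
# Coordinates assume a 120 x 80 pitch, StatsBomb-style.
events = pd.DataFrame(
    {
        "player": ["A", "B", "A", "C", "B"],
        "type": ["Pass", "Shot", "Shot", "Pass", "Shot"],
        "x": [60.0, 108.0, 114.0, 75.0, 96.0],
        "y": [40.0, 40.0, 32.0, 20.0, 50.0],
        "is_goal": [False, True, False, False, False],
    }
)

# Centre of the attacked goal under the assumed coordinate system.
GOAL_X, GOAL_Y = 120.0, 40.0

# Keep only shots and add the straight-line distance to the goal centre.
shots = events[events["type"] == "Shot"].copy()
shots["distance_to_goal"] = np.hypot(GOAL_X - shots["x"], GOAL_Y - shots["y"])

# Per-player summary: shot count, goals scored, mean shot distance.
summary = (
    shots.groupby("player")
    .agg(
        shots=("type", "size"),
        goals=("is_goal", "sum"),
        mean_distance=("distance_to_goal", "mean"),
    )
    .sort_values("shots", ascending=False)
)

print(summary)
```

With real data, the hand-written frame would be replaced by events loaded from an open dataset, and the `x`/`y` columns could then be plotted on a pitch drawn with mplsoccer.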
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The project is actively maintained, with a February 2023 update significantly reworking code samples. The author encourages community engagement through pull requests and issues. Links to external resources like Twitter threads for project ideas are included.
Licensing & Compatibility
The repository's license is not confirmed here; it is likely permissive, given the project's nature as a handbook and its encouragement of community contributions. The software libraries it uses carry their own licenses, so suitability for commercial use depends on the terms of those libraries and of the data sources.
Limitations & Caveats
While the handbook aims to be comprehensive, it notes that success in soccer analytics also depends heavily on factors beyond technical skills, such as "good fortune and timing." The project relies on the continued availability of open-source data from providers like StatsBomb and Metrica.