Edd Webster Football Analytics
A space for football analytics projects by Edd Webster, including a curated list of publicly available resources published by the football analytics community
👋 About This Repository and Author
The README of this repository is a resources guide of learning materials, data sources, libraries, papers, blogs, etc., created by all those who have contributed to the open source football analytics community. This GitHub repository and resources list is always a work in progress, with new resources added semi-regularly. If you feel there are any resources that I have missed, please feel free to create a pull request or send me a message on the links above and I'll get back to you as quickly as I can!
If you like the repo, please feel free to give it a ⭐ (top right). Cheers!
For more information about this repository and the author, see the following:
📍 Table of Contents
- 👋 About This Repository and Author
- 📍 Table of Contents
- 🚀 Getting Started
- 🌵 Repository Structure
- 📚 Source Code and Notebooks
- 📈 Data Visualisation and Tableau
- 📑 Resources
- 🔖 Other Resources Guides
- 🏃 Getting Started with Football Analytics
- 💾 Data
- Data Sources
- Event data
- Tracking data
- Broadcast Tracking data
- Aggregated Player/Team Performance data
- Team Rating data
- Physical data
- Results and Matchsheet data
- Financial, Valuation, and Transfer data
- Odds, Betting, and Predictions data
- Plotting tools
- Reference data
- Miscellaneous data
- Documentation
- Data Companies and Types
- 🧑🎓 Tutorials
- 🏛️ Libraries
- 📁 GitHub Repositories
- 📱 Apps
- 📊 Data Visualisation Resources and Tools
- ✒️ Written Pieces
- 📼 Video
- YouTube Playlists
- YouTube Channels
- Video Analysis
- Webinars and Lectures
- Ted Talks
- Documentaries
- Match Highlights
- Other
- 🔊 Podcasts
- 👨💻 Notable Figures and Twitter Accounts
- 🗓️ Events and Conferences
- 🏆 Competitions
- 🧑🏫 Courses
- 💼 Jobs
- 💬 Discord / Slack Groups
- 🔑 Key Concepts
- History of Football Analytics
- Expected Goals (xG) Modeling
- Web Scraping Football Data
- Tracking Data
- Pitch Control Modeling
- Passing Networks
- Possession Value (PV) Frameworks
- General
- Expected Threat (xT)
- Valuing Actions by Estimating Probabilities (VAEP)
- Goals Added (g+)
- On-Ball Value (OBV)
- Dixon Coles Modeling
- Player Similarity and Style Analysis
- Team Playing Style Analysis
- Player Rating
- Reinforcement Learning for Football Simulation
- Set Pieces
- Radars
- Recruitment Analysis
- Player Valuation Modeling
- Quantifying Relative Club and League Strength
- Tactics
- Game Win Probability Modelling
- Goalkeeper Analysis
- 🗣️ Citations
- 🤝 Contributing
- ⭐ Star Tracker
- 👏 Acknowledgements
🚀 Getting Started
✅ Dependencies
The code in this repository is written in a mix of Python and R. Before you begin, ensure that you have the following prerequisites installed:
- Python (ideally 3.6.1+)
- R (ideally 4.0.4+)
- The following Python and R libraries...
🐍 Python
General Python data science libraries:
- NumPy for multidimensional array computing;
- pandas for data analysis and manipulation;
- matplotlib and Seaborn for data visualisation; and
- scikit-learn and SciPy for Machine Learning.
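As a quick sanity check before running any notebooks, the following is a minimal sketch (not part of the repository's own code) that reports whether the Python version meets the suggested 3.6.1+ and which of the general data science libraries listed above cannot be found. The function name `check_environment` and the module list are illustrative assumptions; note that scikit-learn is imported under the name `sklearn`.

```python
import sys
from importlib.util import find_spec

# Import names for the general data science libraries listed above.
# Assumption: scikit-learn is checked under its import name "sklearn".
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn", "scipy"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 6, 1)):
    """Return (python_ok, missing).

    python_ok is True when the running interpreter is at least min_python.
    missing lists the module names that cannot be located on this system.
    """
    python_ok = sys.version_info[:3] >= min_python
    # find_spec returns None when a top-level module is not installed.
    missing = [name for name in modules if find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any names reported as missing can then be installed with your usual package manager (for example `pip` or `conda`) before opening the notebooks.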
Football analytics Python libraries:
- kloppy - a package for standardising tracking and event data by Koen Vossen and Jan Van Haaren. See the YouTube tutorial [link]
- floodlight by floodlight-sports - a package for streamlined analysis of sports data. It is designed with a clear focus on scientific computing and built upon popular libraries such as numpy and pandas. See the following documentation [link]
- matplotsoccer - a Python library for visualising soccer event data by Tom Decroos
- mplsoccer - a Python library for plotting football pitches in matplotlib by Andrew Rowlinson
- PySport, including PySport Soccer - a collection of open-source sport packages including many of those mentioned in this section, by Koen Vossen
- ScraperFC by Owen Seymour - a Python package to scrape data from FiveThirtyEight, FBref, Understat, Club Elo, Capology and TransferMarkt. Previously scraped Opta event data through the WhoScored? match center (functionality now removed but see old versions