Sports Analytics Computer Vision System

Overview

My master's thesis at Vrije Universiteit Amsterdam, developed at the Sports Intelligence Lab. Body orientation at the moment a player receives the ball, and scanning checks performed before receiving, are two off-the-ball statistics that coaching staff care about. Both are still annotated manually by professional analysts. This thesis proposes a single computer vision pipeline that automates them from broadcast match footage.

The pipeline was validated on a 2024/25 Women's Super League match (Brighton vs Aston Villa), with predictions compared against Vantage analyst annotations. Body orientation classification reached 75% accuracy on the detected passes. Scan counting did not work well at broadcast resolution and is reported as a partial result; details below.

Pipeline Architecture

Pipeline

The pipeline chains six components, a mix of learned models and rule-based logic:

Player and ball detection. Two separate YOLOv8 models, one for players and one for the ball. Combining them into a single multi-class model was tried first and produced a model heavily biased toward the player class because of the annotation imbalance: 11,368 player annotations versus 886 ball annotations in the Roboflow Universe data.
Multi-object tracking. Norfair (Kalman-filter-based) for identity-stable player tracks across frames, with re-identification logic to recover after occlusions.
Automatic team selection. HSV color sampling at the centre of each player's bounding box, followed by K-means clustering into two teams. Documented failure mode: players whose kit has a white number on the centre stripe (e.g. ID 26 in evaluation) are misclassified because the sampled pixel is white, not the team color.
Pass detection. Rule-based: a pass event is registered when ball possession transfers between two players on the same team, using a foot-to-ball cosine distance and a 1 m possession threshold. 8 of 14 ground-truth passes were detected on the evaluation footage. Almost all of the misses occurred when the passer's body occluded the ball, so the ball detector lost it at the moment of release.
Body orientation. GluonCV ResNet-152 pose estimation extracts shoulder keypoints. The perpendicular to the shoulder vector gives the direction the player is facing, and the angular difference between that direction and a user-set forward vector pointing at the opponent goal classifies orientation as Open, Half-Open, or Closed. 75% accuracy against analyst annotations on detected passes.
Scanning. Euler angles from facial keypoints, with a 45 deg/sec angular velocity threshold to count rapid head movements as scans.
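
The pass detection rule in the list above can be sketched roughly as follows. This is a minimal illustration, not the thesis code: the `Player` record, the function names, and the use of pitch coordinates in metres are all assumptions, and plain Euclidean distance stands in for the foot-to-ball distance measure. Only the 1 m threshold and the same-team transfer rule come from the description.

```python
from dataclasses import dataclass
from math import hypot
from typing import Iterable, List, Optional, Sequence, Tuple

POSSESSION_RADIUS_M = 1.0  # 1 m possession threshold from the description

Point = Tuple[float, float]


@dataclass
class Player:
    """Hypothetical per-frame record for one tracked player."""
    track_id: int
    team: int
    foot_xy: Point  # assumed to be pitch coordinates in metres


def possessor(players: Sequence[Player], ball_xy: Optional[Point]) -> Optional[Player]:
    """Return the player nearest the ball if within the threshold, else None."""
    if ball_xy is None or not players:
        return None  # ball not detected in this frame, or nobody tracked

    def dist(p: Player) -> float:
        # Euclidean distance is an assumption made for this sketch.
        return hypot(p.foot_xy[0] - ball_xy[0], p.foot_xy[1] - ball_xy[1])

    nearest = min(players, key=dist)
    return nearest if dist(nearest) <= POSSESSION_RADIUS_M else None


def detect_passes(
    frames: Iterable[Tuple[Sequence[Player], Optional[Point]]]
) -> List[Tuple[int, int, int]]:
    """Return (frame_index, from_track_id, to_track_id) for same-team transfers."""
    passes = []
    last = None  # (track_id, team) of the most recent possessor
    for i, (players, ball_xy) in enumerate(frames):
        current = possessor(players, ball_xy)
        if current is None:
            continue  # ball in flight or lost; remember the last possessor
        if last is not None and current.track_id != last[0] and current.team == last[1]:
            passes.append((i, last[0], current.track_id))
        last = (current.track_id, current.team)
    return passes
```

Keeping the last possessor across frames where no one is within the threshold lets a pass span the frames in which the ball is in flight. It does not help when the ball is already missing at the moment of release, which is the failure described above.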

Player Detection and Tracking Output

Keypoints from GluonCV (a) and orientation vectors: shoulder (red), perpendicular (blue), forward (green), angular difference (purple) (b)

Body Orientation Categories: Open, Closed, Half-Open
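
As a rough illustration of the geometry behind these categories, the sketch below takes two shoulder keypoints and a forward vector, rotates the shoulder vector by 90 degrees to get a facing direction, and bins the angular difference. The threshold values, the mapping of small angles to Open, the y-up coordinate convention, and all function names are assumptions made for this example; the thesis's actual category boundaries are not stated here.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]

# Hypothetical category boundaries in degrees (not taken from the thesis).
OPEN_MAX_DEG = 60.0
HALF_OPEN_MAX_DEG = 120.0


def facing_direction(left_shoulder: Vec, right_shoulder: Vec) -> Vec:
    """Perpendicular to the left-to-right shoulder vector.

    Assumes x to the right and y up; with image coordinates (y down)
    the sign of the rotation would need to be flipped.
    """
    sx = right_shoulder[0] - left_shoulder[0]
    sy = right_shoulder[1] - left_shoulder[1]
    if sx == 0 and sy == 0:
        raise ValueError("shoulder keypoints coincide; orientation undefined")
    return (-sy, sx)  # 90 degree counter-clockwise rotation


def angle_between_deg(a: Vec, b: Vec) -> float:
    """Unsigned angle between two 2D vectors, in the range [0, 180]."""
    dot = a[0] * b[0] + a[1] * b[1]
    cross = a[0] * b[1] - a[1] * b[0]
    return abs(math.degrees(math.atan2(cross, dot)))


def classify_orientation(left_shoulder: Vec, right_shoulder: Vec, forward: Vec) -> str:
    """Bin the angular difference into Open, Half-Open, or Closed."""
    angle = angle_between_deg(facing_direction(left_shoulder, right_shoulder), forward)
    if angle <= OPEN_MAX_DEG:
        return "Open"
    if angle <= HALF_OPEN_MAX_DEG:
        return "Half-Open"
    return "Closed"
```

Using `atan2` on the cross and dot products avoids the numerical trouble `acos` has near 0 and 180 degrees and needs no vector normalisation.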

Results

The honest finding on scan counting: at the resolution of broadcast football footage, facial keypoints are not reliable enough for the Euler-angle approach to work. The thesis treats this as a negative result and proposes higher-resolution capture or hybrid head-pose / optical-flow approaches as the next step.
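
For reference, the counting rule itself is simple once a head-angle signal exists. The sketch below is a minimal version under stated assumptions: it uses only a per-frame yaw angle in degrees (the pipeline works with Euler angles more generally), treats a run of consecutive fast frames as one movement, and uses hypothetical function names. Only the 45 deg/sec threshold comes from the description above.

```python
from typing import Sequence

SCAN_THRESHOLD_DEG_PER_S = 45.0  # angular velocity threshold from the description


def wrapped_diff_deg(a: float, b: float) -> float:
    """Signed difference b - a in degrees, wrapped into [-180, 180)."""
    return (b - a + 180.0) % 360.0 - 180.0


def count_scans(
    yaw_deg: Sequence[float],
    fps: float,
    threshold: float = SCAN_THRESHOLD_DEG_PER_S,
) -> int:
    """Count rapid head movements in a per-frame yaw series.

    A movement starts when the frame-to-frame angular velocity rises above
    the threshold and ends when it drops back, so a run of fast frames
    counts once (this grouping is an assumption of the sketch).
    """
    scans = 0
    in_scan = False
    for prev, cur in zip(yaw_deg, yaw_deg[1:]):
        velocity = abs(wrapped_diff_deg(prev, cur)) * fps  # degrees per second
        fast = velocity > threshold
        if fast and not in_scan:
            scans += 1
        in_scan = fast
    return scans
```

At 25 frames per second the threshold corresponds to only 1.8 degrees per frame, so keypoint jitter of a few degrees is enough to trigger false counts. That is consistent with the approach breaking down when facial keypoints are noisy.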

Publication

The body orientation portion of this work was published at ACM MMSports '25 in Dublin (October 2025) as "Semi-Automatic Estimation of Body Orientation in Football", with Mauricio Verano Merino and Elixabete Sarasola Nieto.

Stack

PyTorch, OpenCV, YOLOv8, GluonCV (MXNet, ResNet-152), Norfair, Roboflow Universe data, Python.