Ball detection and tracking in sports has been gaining momentum recently. No matter what the sport is, once you can reliably detect the ball, predicting results with computer vision systems becomes much easier. Today, ball tracking software enables both prediction of game results and detailed analysis of ball movement. The latter helps players who want to master the ball.
Let’s start with a very popular sport: golf. The COVID-19 pandemic gave an unexpected jolt to golf, which had been declining in popularity. As a safe outdoor activity during lockdown, the game saw renewed enthusiasm. And it looks like that enthusiasm is not going anywhere once the lockdown measures are eased.
If you’re in the AI Sports niche and planning to empower your game with the technology, or just want to know what’s coming down the pipeline, read on.
Golf Ball Detection: Benefits
Once you know how to detect the ball, you can quickly pick out the key events of the game and keep your finger on its pulse.
If you own a golf club, developing full-blown Sports AI ball tracking software might be the key to helping your customers become better golfers. A computer vision solution can detect the ball in camera footage, trace its trajectory, and show whether the ball ended up in the hole. It can also recover from ball-out-of-frame and ball-lost situations.
These kinds of metrics give players insight into their weak spots and help them readjust their game strategy. With an AI golf ball detector, a player can work toward a better swing and a better score sooner.
Ball Tracking Technology: Problem Formulation
Now we’ll shed some light on the basic approach to ball detection and tracking no matter what the sport is.
In many types of sports such as soccer, golf, tennis and football, the ball is a crucial component of the game and attracts a lot of attention. So, there is no doubt that the creation of an automated ball tracking system is an essential step towards the development of a robust Sports AI.
That is why in this blog post we decided to investigate how such ball tracking systems may be developed using recent advances in computer vision and deep learning. We can formulate the problem of automated ball tracking in the following way: for each frame of a video, we need to decide whether there is a ball in the image and, if so, determine its location. In the computer vision community, this problem is commonly referred to as object detection.
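The formulation above can be written down as a small interface. This is only a sketch: `BallDetection`, `Detector` and `track_ball` are hypothetical names introduced for illustration, and the detector itself is left abstract (any model that maps a frame to "ball here" or "no ball" would fit).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class BallDetection:
    """Location of the ball in one frame (hypothetical structure)."""
    x: float           # box center, in pixels
    y: float           # box center, in pixels
    width: float       # box width, in pixels
    height: float      # box height, in pixels
    confidence: float  # detector score in [0, 1]


# Assumed detector signature: one frame in, one detection (or None) out.
Detector = Callable[[object], Optional[BallDetection]]


def track_ball(
    frames: Iterable[object], detect: Detector
) -> Iterator[Tuple[int, Optional[BallDetection]]]:
    """Run the detector on every frame and yield (frame index, result).

    The result is None for frames where no ball was found.
    """
    for index, frame in enumerate(frames):
        yield index, detect(frame)
```

Keeping "no ball in this frame" as an explicit `None` makes missed or occluded frames visible to whatever consumes the track later.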
Knowing how to solve the ball detection problem will also give you insight into how to solve other detection problems, e.g. detection of players on a playing field. Today, ball detection is one of the top challenges to address with artificial intelligence in sports, because the ball moves fast and is frequently occluded in the footage.
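One common way to cope with short occlusions is to fill small gaps in the track after detection. The sketch below uses plain linear interpolation between the last and the next known ball centers. This is our illustrative assumption, not a method taken from the experiments in this post, and `fill_short_gaps` and `max_gap` are hypothetical names.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) ball center in pixels


def fill_short_gaps(
    track: List[Optional[Point]], max_gap: int = 5
) -> List[Optional[Point]]:
    """Linearly interpolate runs of missing detections.

    Only gaps of at most `max_gap` frames that lie between two known
    positions are filled. Longer gaps, and gaps at the start or end of
    the track, stay None.
    """
    filled = list(track)
    last_seen: Optional[int] = None  # index of the previous known position
    for i, point in enumerate(track):
        if point is None:
            continue
        if last_seen is not None:
            gap = i - last_seen - 1
            if 0 < gap <= max_gap:
                x0, y0 = track[last_seen]
                x1, y1 = point
                for k in range(1, gap + 1):
                    t = k / (gap + 1)  # fraction of the way across the gap
                    filled[last_seen + k] = (
                        x0 + t * (x1 - x0),
                        y0 + t * (y1 - y0),
                    )
        last_seen = i
    return filled
```

A straight line is a rough model for a ball in flight, so it only makes sense for gaps of a few frames; longer gaps are better left empty or handed to a proper motion model.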
Ways to Approach the Problem
There are multiple approaches and models we could investigate to solve the problem of detecting and tracking a fast-moving ball. However, if we want a model that can be applied to real-world cases, we should set the following requirements for it:
- The model should reach acceptable accuracy in the domain of interest, e.g. on actual sports videos.
- In many scenarios, the model should also be able to work in real time, as you may want to process a video stream without significant delays.
With these requirements in mind, we can narrow the space of possible models. For instance, simple OpenCV blob detection algorithms are unlikely to reach the required accuracy, since real sports footage is too difficult: the ball is fast and frequently occluded. On the other hand, heavier deep learning models such as Faster R-CNN (available, for example, in Detectron2) can possibly achieve the required accuracy but generally struggle to run in real time.
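The real-time requirement can be made concrete as a per-frame time budget: at a given frame rate, the model has at most 1000 / fps milliseconds per frame. Below is a minimal sketch of such a check. The function names are hypothetical, and `process` stands for whatever per-frame inference call you want to measure.

```python
import time
from typing import Callable, Iterable


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def mean_latency_ms(
    process: Callable[[object], object],
    frames: Iterable[object],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Average wall-clock time of `process` per frame, in milliseconds."""
    total_seconds = 0.0
    count = 0
    for frame in frames:
        start = clock()
        process(frame)
        total_seconds += clock() - start
        count += 1
    if count == 0:
        raise ValueError("no frames to measure")
    return 1000.0 * total_seconds / count


def is_real_time(latency_ms: float, fps: float) -> bool:
    """True if the measured latency fits into the per-frame budget."""
    return latency_ms <= frame_budget_ms(fps)
```

For a 25 fps stream the budget is 40 ms per frame, so a model that averages 50 ms per frame would fall behind the stream.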
All in all, we believe that, given these requirements, models from the YOLO family offer the best trade-off between computational burden and accuracy. Now let’s take a closer look at the YOLO family and in particular at YOLOv5, the latest model in the family at the time of writing.
YOLOv5
YOLO is an abbreviation for “You Only Look Once” and is one of the most commonly used models for real-time object detection. The success of YOLO may be mainly attributed to its single-pass design: to obtain all detection boxes and their classes, only one forward pass of the neural network is needed. In particular, an input image is first propagated through a single neural network. The image is divided into regions, and for each region the network predicts several bounding boxes together with their confidences and class probabilities. For every bounding box, the network predicts five main parameters: the x and y coordinates of the box’s center, the width and height of the box, and a confidence score reflecting how likely the box is to contain an object. At test time, after the non-maximum suppression stage, the algorithm outputs the recognized objects and their locations.
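To illustrate the last two steps, here is a minimal sketch of greedy non-maximum suppression over boxes given by the five parameters described above. It is a simplified, single-class, pure-Python version written for this post, not the implementation used inside YOLO, and the threshold values are illustrative assumptions.

```python
from typing import List, Tuple

# (x_center, y_center, width, height, confidence)
Box = Tuple[float, float, float, float, float]


def to_corners(box: Box) -> Tuple[float, float, float, float]:
    """Convert a center-format box to (x1, y1, x2, y2) corners."""
    x, y, w, h, _ = box
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    ax1, ay1, ax2, ay2 = to_corners(a)
    bx1, by1, bx2, by2 = to_corners(b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: List[Box], conf_threshold: float = 0.25, iou_threshold: float = 0.45
) -> List[Box]:
    """Keep the most confident box among each group of overlapping boxes."""
    candidates = sorted(
        (b for b in boxes if b[4] >= conf_threshold),
        key=lambda b: b[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in candidates:
        # A box survives only if it does not overlap too much
        # with any already accepted, more confident box.
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept
```

For ball tracking this matters because the network often fires several slightly shifted boxes around the same ball; suppression collapses them into one detection.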
Source: arXiv.org
In 2020, YOLOv5, the latest model in the YOLO family at the time, was released on GitHub. It was presented as the state of the art of the YOLO object detection series; a more detailed comparison can be found in the GitHub repo. For the results in this blog post we chose YOLOv5 because of its performance and its convenient tooling for both training and inference.
Results of Using Out-of-the-Box Model
As an initial step, we tested the performance of an out-of-the-box model trained on the COCO dataset. We took the mid-sized YOLOv5m model for our experiments. Here are our results on ball tracking in different sports:
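For readers who want to try a similar experiment, here is a sketch of how a COCO-pretrained YOLOv5m model can be queried for balls. It is not the exact code behind our results. It assumes the `torch.hub` loading pattern documented in the YOLOv5 repository, an installed `torch` package and network access on first use. COCO has no per-sport ball classes, only a generic "sports ball" class, which has index 32 in the 80-class label list. `best_ball` and `detect_ball_with_yolov5` are hypothetical helper names.

```python
from typing import List, Optional, Sequence

# Index of "sports ball" in the 80-class COCO label list.
SPORTS_BALL_CLASS_ID = 32


def best_ball(
    rows: Sequence[Sequence[float]], min_conf: float = 0.25
) -> Optional[Sequence[float]]:
    """Pick the most confident "sports ball" detection, if any.

    Each row is assumed to be [x1, y1, x2, y2, confidence, class_id],
    the layout of one image's detections in YOLOv5's `results.xyxy`.
    """
    balls = [
        row for row in rows
        if int(row[5]) == SPORTS_BALL_CLASS_ID and row[4] >= min_conf
    ]
    return max(balls, key=lambda row: row[4]) if balls else None


def detect_ball_with_yolov5(image_path: str) -> Optional[List[float]]:
    """Run a pretrained YOLOv5m on one image and return the best ball box.

    Requires torch and downloads the model on first call, so it is
    imported lazily and not executed when this module is loaded.
    """
    import torch

    model = torch.hub.load("ultralytics/yolov5", "yolov5m", pretrained=True)
    results = model(image_path)
    return best_ball(results.xyxy[0].tolist())
```

Taking only the single most confident ball per frame is a simplification that suits sports with one ball in play; it would need revisiting for footage with several balls in view.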
Source: YouTube
As we can see, the model detects the ball in several frames, but it does not perform well enough to be used in real-world applications. So how can one improve the model’s performance? By adjusting the out-of-the-box model to one’s own data.
A Way to Improve: Fine-Tuning the Model
Assume you want to train a detector for a particular use case, e.g. a tennis ball or a baseball. As we have already shown, in most cases simply using the out-of-the-box YOLOv5 model won’t work, and we need to improve its performance somehow.
But you may wonder whether we need to train the model from scratch, which would mean collecting tons of labelled data. Fortunately, the answer is no. Instead of training from scratch, you can adjust the parameters of the out-of-the-box model and get much better performance as a result. This method is based on the idea that most features learned by the out-of-the-box model are general and relevant for other use cases. Hence, we just need to adapt the weights a little to perform better on the new task. For this process you don’t need to collect millions of labelled samples: several thousand (or, in some simple situations, a few hundred) labelled images are usually enough to considerably improve the model’s performance.
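Most of the practical work in fine-tuning is preparing those labelled images. YOLOv5 expects one text file per image with one line per object: a class index followed by the box center, width and height, all normalized to the image size. The helper below is a sketch of that conversion from pixel corner coordinates; the function name and the single-class setup (class 0 for the ball) are our assumptions for illustration.

```python
def to_yolo_label(
    class_id: int,
    x1: float,
    y1: float,
    x2: float,
    y2: float,
    image_width: int,
    image_height: int,
) -> str:
    """Convert a pixel box (x1, y1, x2, y2) into one YOLO label line.

    Output format: "<class> <x_center> <y_center> <width> <height>",
    with the four box values normalized to the range [0, 1].
    """
    if not (0 <= x1 < x2 <= image_width and 0 <= y1 < y2 <= image_height):
        raise ValueError("box must be non-empty and lie inside the image")
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

With labels in this format and a small dataset description file, the `train.py` script in the YOLOv5 repository can start from pretrained weights (via its `--weights` argument) rather than from random initialization.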
We tested the fine-tuning capabilities of the model by continuing to optimize the weights of the out-of-the-box model with the same training losses. For this we used a small labelled dataset consisting of frames from football games. Here is what we got:
Source: YouTube
The Bottom Line
To conclude, we can say that YOLOv5 is indeed a powerful tool that can be used for ball detection in real-world cases. It offers a good balance between detection speed and accuracy. However, for sports use cases you cannot really rely on the out-of-the-box model, and you will most likely need to fine-tune it for your particular use case.
Develop AI Ball Tracking Software with InData Labs
Have an idea for a ball tracking system based on deep learning? Schedule a call with our AI consultants to see how we can help.