Continuing the machine learning theme of the last few posts, where I looked at how a high level scouting report can be created for teams, I now want to get into some nitty-gritty detail and see what insights machine learning can uncover about specific 5 man line-ups.
The goal here is to find out what individual 5 man line-ups do well and not so well, and whether we can find insights into how certain players impact different parts of the game.
We should also be able to learn how different combinations of players work together and what coaches can put in place to get the most out of those line-ups. If a line-up is a negative, meaning it gets outscored by the opposition's 5 man line-up, we should now be able to pinpoint how to improve it and turn it into a break-even or positive line-up. Or perhaps there is no hope for that line-up and coaches should stop using it.
Right now +/- is really the only data available publicly for 5 man line-ups. Basically, +/- tells you how many points a line-up either won by (+) or lost by (-) in the minutes it was on court. However, this doesn’t tell us why a line-up is doing well or not so well, and it doesn’t tell us how that line-up can improve.
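To make the +/- definition concrete, here is a minimal sketch of how it could be totalled per line-up. The stint format, the function name and the player labels (P1 to P6) are assumptions for illustration only, and the numbers are made up rather than taken from real games.

```python
from collections import defaultdict

def plus_minus_by_lineup(stints):
    """Sum (points for - points against) for each 5 man line-up.

    `stints` is an iterable of (lineup, points_for, points_against) tuples,
    a hypothetical format chosen for this sketch.
    """
    totals = defaultdict(int)
    for lineup, points_for, points_against in stints:
        key = frozenset(lineup)  # the order players are listed in should not matter
        totals[key] += points_for - points_against
    return dict(totals)

# Made-up stints, purely for illustration (not real game data).
example_stints = [
    (("P1", "P2", "P3", "P4", "P5"), 12, 8),   # +4
    (("P5", "P4", "P3", "P2", "P1"), 5, 9),    # same five players, -4
    (("P1", "P2", "P3", "P4", "P6"), 7, 10),   # -3
]
plus_minus = plus_minus_by_lineup(example_stints)
```

Notice that the first line-up nets out to zero even though its two stints looked very different, which is exactly the kind of detail +/- on its own hides.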
I have been looking at Miami and Dallas in my previous machine learning posts, so I have continued that theme here and will kick things off with Dallas and follow up with Miami in a few days.
There isn’t much data publicly available for each 5 man line-up that takes the court, so I have taken the raw play by play logs (thanks to @nbastuffer https://www.nbastuffer.com/ ) and I have pulled out the 50 statistics that I have used to feed my previous algorithm, applying those stats to every Dallas and Miami line-up that has taken the court this season.
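For anyone wondering how line-ups can be pulled out of a play by play log, the sketch below shows one possible way to split a period into stints by tracking substitutions. The event format (seconds elapsed, player out, player in) is an assumption for illustration and not the actual layout of the logs I used.

```python
def lineup_stints(starters, substitutions, period_end):
    """Split a period into stints, one per unchanged 5 man line-up.

    `substitutions` is a list of (seconds_elapsed, player_out, player_in)
    tuples sorted by time (a hypothetical format); `period_end` is the
    period length in seconds.
    Returns a list of (lineup, start_second, end_second) tuples.
    """
    on_court = set(starters)
    stints = []
    start = 0
    for when, player_out, player_in in substitutions:
        # Close the current stint only if some time has actually passed,
        # so several substitutions at the same moment count as one change.
        if when > start:
            stints.append((frozenset(on_court), start, when))
        on_court.remove(player_out)
        on_court.add(player_in)
        start = when
    if period_end > start:
        stints.append((frozenset(on_court), start, period_end))
    return stints

# Made-up period: a double substitution at 300s and a single one at 500s.
stints = lineup_stints(
    starters=["P1", "P2", "P3", "P4", "P5"],
    substitutions=[(300, "P5", "P6"), (300, "P4", "P7"), (500, "P6", "P5")],
    period_end=720,
)
```

Once every stint is tagged with its line-up, each play by play row in that time window can be credited to the five players on the floor.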
To give you an idea of how much data we are talking about, both teams have a combined 20,000+ rows of data from the play by play logs and over 1000 line-up changes have been made so far.
If a 5 man line-up combines for less than 1 minute of court time in a game, I have kicked this data out as it’s far too “noisy”. Noisy means there is a lot of randomness in under a minute of play, so you don’t get a true reflection of what a line-up is doing.
The average line-up for these teams plays 4.5 minutes together in a game, so to ensure we can compare apples with apples, I have converted the stats to be “per 4.5 minutes” of court time. This means that if a line-up played 3 minutes together on the court, I have converted its stats to what they would look like over 4.5 minutes.
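As a rough sketch of that filter, assuming each stint is a dict with the hypothetical keys "game_id", "lineup" and "seconds", the 1 minute cut-off could be applied like this:

```python
from collections import defaultdict

MIN_SECONDS_PER_GAME = 60  # the 1 minute cut-off described above

def drop_short_lineups(stints, min_seconds=MIN_SECONDS_PER_GAME):
    """Keep only stints whose line-up totalled at least `min_seconds` in that game."""
    seconds_per_game = defaultdict(float)
    for stint in stints:
        seconds_per_game[(stint["game_id"], stint["lineup"])] += stint["seconds"]
    return [
        stint for stint in stints
        if seconds_per_game[(stint["game_id"], stint["lineup"])] >= min_seconds
    ]

# Made-up stints, purely for illustration.
lineup_a = frozenset({"P1", "P2", "P3", "P4", "P5"})
lineup_b = frozenset({"P1", "P2", "P3", "P4", "P6"})
raw_stints = [
    {"game_id": 1, "lineup": lineup_a, "seconds": 40},
    {"game_id": 1, "lineup": lineup_a, "seconds": 35},  # 75s combined, kept
    {"game_id": 1, "lineup": lineup_b, "seconds": 45},  # under a minute, dropped
    {"game_id": 2, "lineup": lineup_b, "seconds": 200}, # kept
]
kept_stints = drop_short_lineups(raw_stints)
```

The time is summed per line-up per game before filtering, so two short stints by the same five players in one game still count if together they reach a minute.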
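The conversion itself is just a scaling factor of 4.5 divided by the minutes actually played. A minimal sketch, with hypothetical stat names and made-up numbers:

```python
TARGET_MINUTES = 4.5  # average time a line-up plays together in a game

def per_target_minutes(stats, minutes_played, target=TARGET_MINUTES):
    """Scale raw counting stats to what they would be over `target` minutes."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    scale = target / minutes_played
    return {name: value * scale for name, value in stats.items()}

# A made-up line-up that played 3 minutes together.
raw = {"points": 6, "rebounds": 2}
scaled = per_target_minutes(raw, minutes_played=3)
# Here the scale factor is 4.5 / 3 = 1.5, so 6 points becomes 9 per 4.5 minutes.
```

This only makes sense for counting stats (points, rebounds, attempts and so on). Percentages are already rates and should not be scaled this way.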
Algorithms need plenty of data, so I have also kicked out any line-ups that have played sparingly this season.
We are now in a position where I have captured the 50 vital stats for every line-up and converted them to per 4.5 minutes, and I will now put the 5 most-used line-ups into the algorithm to see what we can learn about them.
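Picking the most-used line-ups can be sketched as below. I am assuming here that "most used" means most total minutes across the season, and the input format is hypothetical.

```python
from collections import defaultdict

def top_lineups_by_minutes(stints, n=5):
    """Return the `n` line-ups with the most total minutes, largest first.

    `stints` is an iterable of (lineup, minutes) pairs, a hypothetical format.
    """
    totals = defaultdict(float)
    for lineup, minutes in stints:
        totals[frozenset(lineup)] += minutes
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

# Made-up season totals, purely for illustration.
season_stints = [
    (("P1", "P2", "P3", "P4", "P5"), 6.0),
    (("P1", "P2", "P3", "P4", "P6"), 4.0),
    (("P5", "P4", "P3", "P2", "P1"), 5.5),
    (("P1", "P2", "P3", "P7", "P8"), 2.0),
]
top_two = top_lineups_by_minutes(season_stints, n=2)
```

Ranking by number of appearances instead of minutes would be a reasonable alternative, and it could give a different top 5.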
Results for Dallas
It’s getting harder and harder to play Dirk. Last year the Mavs realized they couldn’t play him at the 4 spot, particularly with Bogut. Now they find themselves in a position where playing Dirk at the 5 puts a lot of pressure on him to rebound and protect the paint on defense, both of which are needed at a really high level. We see these struggles in line-ups 4 and 5. The eye test shows Dirk is still working hard; he just needs the right line-up around him and the right match-ups.
In saying all of the above, line-up 1, where Dirk and Powell played together (2 traditional bigs), saw really nice results. Powell is an athlete and can really cover for Dirk on the glass; the Mavs just need to find the right match-ups to be able to play these two together. Powell matched up on the stretch 4s, perhaps?
A trend out of all this is that the Mavs generally play small ball, with one of Dirk, Powell or Mejri as the 5 and Barnes playing a stretch 4. Barnes is doing a great job on the boards with career-high rebounding numbers (7.3 per 36 minutes). However, they need more rebounding from the perimeter. Matthews, for example, is really poor at 3.3 rebounds per 36 minutes. He is often playing the 3 spot, and 3.3 rebounds just isn’t going to cut it there. The emergence of Kleber, and more minutes for him, will help here, as he’s a better rebounder and can push Matthews to the traditional 2 spot where he won’t hurt line-ups as much.
Line-up 2, with Salah Mejri at the 5, has tremendous upside if the group were even just average offensively. Mejri, even though he is a huge man, does have some skills. Could Dallas utilize him in more hand-offs where he gets his big body in the way? What about high post passing, similar to how the Spurs utilize Pau Gasol, and running some offense through the elbows with Mejri? Right now a lot of what Mejri does on offense is just classic on-ball screens and hitting the offensive glass. I think he has a little more to his game than the basics.
Line-up 4, like the above, has plenty of upside. They are taking too many shot attempts from inside the arc though, as Matthews is really the only volume shooter from beyond it. Can this line-up look to get Dirk more 3s? Kleber as well.
I hope this was of some value and shows what can be done with machine learning to gather line-up specific insights which coaches can then act on. Obviously the algorithm can only spit out so much in its results. You then need to apply basketball knowledge to those findings and present something that coaches can act on. Hopefully I have demonstrated that here with the Mavs, and I will follow up with the Heat in a few days’ time.