Harnessing the power of tracking data to drive analytics and scouting for collegiate basketball 🏀📊 | Powered by Streamlit

sejaldua/synergy-scouting-app

synergy-scouting-app

Deployed Application: http://bit.ly/synergy-basketball-scouting
Codebase: https://github.com/sejaldua/synergy-scouting-app

About the App

Synergy is a data tracking platform used primarily for scouting purposes in collegiate basketball. Its current bottlenecks include poor organization, poor usability, and a lack of practicality in the context of DIII scouting. Most notably, user-facing data tends to be heavily skewed towards landslide games against out-of-conference opponents, rendering insights unactionable. We created this app to enable coaches to scout opponent teams with a filterable subset of games. Coaches can choose a team to scout, select similar-caliber opponents that the scouted team has played, and then view a comprehensive breakdown of that team’s strengths and weaknesses by play type. The data is displayed in a standard tabular format with complementary data visualizations to drive insights.

Terminology

Statistics Glossary
| Statistic | Description |
| --- | --- |
| Plays/Game | The number of specified play type occurrences per game. NOTE: we decided not to focus on possession count because one possession can consist of many consecutive play types (e.g. a spot-up followed by an offensive rebound, followed by an isolation, which evolves into a post-up play and ultimately culminates in a made basket). For simplicity, we divided each possession into spliced sequences consisting of one unique play type each. |
| Points | Points per game. |
| PPP | Points per possession. |
| FGM | Field goals made. This statistic includes both 2PT and 3PT baskets. |
| FGA | Field goals attempted. This statistic includes 2PT and 3PT makes and misses. |
| FG% | Field goal percentage. Formula: $\text{FG\%} = 100 \times \text{FGM} / \text{FGA}$ |
| aFG% | Adjusted field goal percentage measures shooting efficiency by taking into account the total points a player produces through their field goal attempts. Its goal is to show what field goal percentage a two-point shooter would have to shoot at to match the output of a player who also shoots three-pointers. Formula: $\text{aFG\%} = 100 \times \dfrac{(\text{PTS} - \text{FTM}) / 2}{\text{FGA}}$, where PTS is points scored and FTM is free throws made. References: Wikipedia: Effective Field Goal Percentage & Basketball Reference Glossary |
| TO% | Turnover rate. The percentage of plays that resulted in the other team gaining possession before a shot was attempted. |
| FT% | Free throw percentage. |
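
To make the glossary concrete, here is a minimal sketch of how these statistics could be derived from raw counts. The function name `summarize` and its parameters are hypothetical and are not taken from the app's code. It assumes PPP is computed as points divided by plays, and aFG% as (PTS − FTM) / (2 × FGA); both assumptions reproduce the Spot-Up sample values shown in the Data Wrangling section.

```python
import math


def ratio(numerator, denominator):
    """Divide safely; an undefined rate (no attempts) becomes nan."""
    return numerator / denominator if denominator else math.nan


def summarize(plays, points, fgm, fga, ftm, fta, turnovers, games=1):
    """Derive the glossary statistics from raw counts (hypothetical helper)."""
    return {
        "Plays/Game": plays / games,
        "Points": points / games,
        "PPP": ratio(points, plays),
        "FG%": 100 * ratio(fgm, fga),
        # Assumed form: points from field goals only, halved, per attempt.
        "aFG%": 100 * ratio((points - ftm) / 2, fga),
        "TO%": 100 * ratio(turnovers, plays),
        "FT%": 100 * ratio(ftm, fta),
    }


# Counts chosen to match the Spot-Up sample row. The free throw and
# turnover counts are inferred from its percentages, not given directly.
spot_up = summarize(plays=20, points=23, fgm=7, fga=17, ftm=1, fta=3, turnovers=2)
print(spot_up)
```

Returning `nan` for an undefined rate mirrors the sample dictionaries, where FT% is `nan` for entries with no free throw attempts.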
Play Type Glossary
| Play Type | Description |
| --- | --- |
| Pick-and-roll ball-handler | These are possessions finished by the ball-handler in the pick-and-roll. This includes pull-ups, floaters, and shots at the rim by that player. It also includes possessions where the ball-handler shoots before even dribbling off of the screen, as well as when he denies the ball screen and dribbles away from the pick. |
| Pick-and-roll roll man | These are the slips, rolls, and pops from screeners in the pick-and-roll. This is a tricky top-line stat to make judgements based off of due to the variation that exists within it. When analyzing players, I make an effort to look more at the efficiencies at each of those three specific actions. If a player has mediocre roll man numbers but elite popping data, there’s more value than initially meets the eye. |
| Transition | Transition possessions are about the defense not being set, and don’t have anything to do with the time left on the shot clock. That means there’s no time cutoff that makes a possession a halfcourt possession rather than a transition possession. On a more granular level we can look deeper into the role a player had within a transition possession. That can be as a leak-out man, the ball-handler, left/right wing, or a trailer. |
| Off-screen | These possessions are generated by a player running off of a screen, whether it be a pin-down, flare screen, elevator screen, or any other of the plethora of screen variations, before they receive the ball. That player catches the ball coming off of a screen and either shoots immediately, dribbles into a pull-up, dribbles into a floater, or dribbles and takes a shot at the rim. Occasions where a player curls off of a screen toward the basket are also counted. However, UCLA screens and flex screens do not fall into this category. Those would be logged as cuts. |
| Spot-up | Spot-up possessions are similar to off-screen possessions, but there’s no screen being used before the player catches the ball. Players spotting up don’t need to be stationary, but they can’t be running off of screens before catching the ball. Players just standing in the corner before catching-and-shooting, or guys relocating to the 3-point line or fading to the corner and getting the ball on a kick-out are all spotting up. These possessions aren’t just catch-and-shoot. They can be catching-and-shooting, but attacking a close-out by dribbling into a pull-up, dribbling into a floater, or driving to the rim are also included. |
| Isolation | I don’t think I need to explain isolation, but I will say one thing: if one of these other actions occurs and is broken, it may end up being logged as an isolation possession. For example, if a pick-and-roll ball-handler dribbles off of the screen and needs to retreat dribble twice, then attacks after that substantial delay, it’s an iso possession. A spot-up possession where the player catches then does several jabs or tries to size up his defender is now an isolation possession. |
| Hand-offs | Handoffs are the dribble handoffs or flip/pitch plays we’ll see. They may come from the passer being stationary or the passer dribbling at the receiver and then handing the ball off. This is an action that isn’t used much by most teams. The Celtics are one of a few teams that run a lot of dribble handoffs. |
| Cuts | This category includes backdoor cuts and dump-offs as “basket cuts”. UCLA cuts and flex cuts also fall into this category as “screen cuts”. “Flash cuts” are the third subgroup within the cut category. These include times a player, without a screen, cuts out or toward the ball to receive it (like for a V cut). |
| Putbacks | Putbacks are the tip-ins and quick shots after offensive rebounds. Very rarely this will also include long rebounds that result in a quick shot. Due to most shots being right at the rim, these are generally very high PPP opportunities. |
| Post-up | These are all of the traditional post-ups we’re accustomed to. This category counts back-to-the-basket and face-up post possessions. |
| Miscellaneous | This is a potpourri of possessions that don’t fit into any of the other categories. |

Reference: Nylon Calculus: How to understand Synergy play type categories

Data Wrangling

Using an HTML parser, this app scrapes collegiate basketball play-by-play data from Synergy. The scraped data looks like the following:

{
'team': 'MID', 
'corrected': False, 
'period': '1', 
'time': '18:40', 
'raw_plays': ['5 Max Bosco > P&R Ball Handler > Right P&R > Side > Go Away from Pick > Defense Commits', 'Ball Delivered', '10 Matt Folger > Spot-Up > Drives Right > To Basket > Make 2 Pts', '> Shot > Matt Folger > Any Type > 2 Point Attempt > Make 2 Pts'], 
'score_1': 2, 
'score_2': 2, 
'plays': ['5 Max Bosco', 'P&R Ball Handler', 'Right P&R', 'Side', 'Go Away from Pick', 'Defense Commits', 'Ball Delivered', '10 Matt Folger', 'Spot-Up', 'Drives Right', 'To Basket', 'Make 2 Pts'], 
'points': 2, 
'duration': datetime.timedelta(seconds=17)
}

In the above data snippet, which represents one possession from the first period of a game between Amherst and Middlebury, we see many attributes of interest. Since this project focuses on play types, we chose to home in on the "plays" field. In this possession, a pick-and-roll (ball handler) play led to a spot-up play. Because we wanted to analyze the efficacy of various play types in order to scout an opponent and tailor game preparation to their strengths and weaknesses, we needed to wrangle this possession data into sequences of events that can be queried by play type and/or player. The following dictionaries form the basis of our project:

Play type dictionary (sequences of events)

{ 'play type': [play sequence 1, play sequence 2, ...]}

See example
{
'Spot-Up': [['25 Fru Che', 'Spot-Up', 'Drives Right', 'To Basket', 'Turnover'], ['3 Devonn Allen', 'Spot-Up', 'No Dribble Jumper', 'Guarded', 'Long/3pt', 'Make 3 Pts'], ... ], 
'Transition': [['33 Eric Sellew', 'Transition', 'Ballhandler', 'Dribble Jumper', "Short to < 17'", 'Miss 2 Pts'], ... ], 
'Post-Up': [['33 Eric Sellew', 'Post-Up', 'Left Block', 'Left Shoulder', 'Dribble Move', 'To Basket', 'Make 2 Pts'], ... ], 
'P&R Ball Handler': [['3 Devonn Allen', 'P&R Ball Handler', 'High P&R', 'Dribble Off Pick', 'Defense Commits', 'Turnover'], ... ]
...
}
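
The splitting and grouping described above can be sketched roughly as follows. This is not the app's actual implementation: the function names and the `PLAYER_TOKEN` pattern are hypothetical. It assumes that a new sequence starts at every token shaped like a jersey number followed by a name (e.g. "10 Matt Folger"), and that the token right after the player is the play type.

```python
import re
from collections import defaultdict

# Assumption: player tokens look like "<jersey number> <name>".
PLAYER_TOKEN = re.compile(r"^\d+\s+\S")


def split_possession(plays):
    """Split one possession's flat 'plays' list into per-player sequences."""
    sequences = []
    for token in plays:
        if PLAYER_TOKEN.match(token) or not sequences:
            sequences.append([token])     # a player token opens a new sequence
        else:
            sequences[-1].append(token)   # everything else extends the current one
    return sequences


def group_by_play_type(sequences):
    """Build {'play type': [sequence, ...]} keyed on each sequence's second token."""
    by_type = defaultdict(list)
    for seq in sequences:
        if len(seq) > 1:
            by_type[seq[1]].append(seq)
    return dict(by_type)


# The 'plays' field from the sample possession shown earlier.
possession_plays = [
    '5 Max Bosco', 'P&R Ball Handler', 'Right P&R', 'Side',
    'Go Away from Pick', 'Defense Commits', 'Ball Delivered',
    '10 Matt Folger', 'Spot-Up', 'Drives Right', 'To Basket', 'Make 2 Pts',
]
sequences = split_possession(possession_plays)
play_types = group_by_play_type(sequences)
print(sorted(play_types))
```

On the sample possession this yields one "P&R Ball Handler" sequence ending in "Ball Delivered" and one "Spot-Up" sequence ending in "Make 2 Pts".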

Play type dictionary (statistics)

{ 'play type': {'stat 1': ##.##, 'stat 2': ##.##, 'stat 3': ##.##, ...}}

See example
{
'Spot-Up': {'Plays/Game': 20.0, 'Points': 23.0, 'PPP': 1.15, 'FGM': 7.0, 'FGA': 17.0, 'FG%': 41.17647058823529, 'aFG%': 64.70588235294117, 'TO%': 10.0, 'FT%': 33.33333333333333}, 
'Transition': {'Plays/Game': 12.0, 'Points': 9.0, 'PPP': 0.75, 'FGM': 3.0, 'FGA': 11.0, 'FG%': 27.27272727272727, 'aFG%': 36.36363636363637, 'TO%': 0.0, 'FT%': 100.0}, 
'Post-Up': {'Plays/Game': 13.0, 'Points': 6.0, 'PPP': 0.46153846153846156, 'FGM': 3.0, 'FGA': 9.0, 'FG%': 33.33333333333333, 'aFG%': 33.33333333333333, 'TO%': 7.6923076923076925, 'FT%': nan}, 
'P&R Ball Handler': {'Plays/Game': 22.0, 'Points': 13.0, 'PPP': 0.5909090909090909, 'FGM': 2.0, 'FGA': 2.0, 'FG%': 100.0, 'aFG%': 250.0, 'TO%': 18.181818181818183, 'FT%': 60.0}
...
}
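
As an illustration of how a statistics entry might be tallied from a list of sequences, the sketch below counts outcomes by looking at the last token of each sequence. The `tally` function is hypothetical. It only recognizes the outcome labels that appear in the samples in this README ("Make 2 Pts", "Make 3 Pts", "Miss 2 Pts", "Miss 3 Pts", "Turnover"); free throws and fouls are left out because their labels are not shown here.

```python
import math

# Outcome labels taken from the sample sequences in this README.
MAKES = {"Make 2 Pts": 2, "Make 3 Pts": 3}
MISSES = {"Miss 2 Pts", "Miss 3 Pts"}


def tally(sequences):
    """Count plays, field goals, points and turnovers for one play type."""
    fgm = fga = points = turnovers = 0
    for seq in sequences:
        outcome = seq[-1] if seq else None
        if outcome in MAKES:
            fgm += 1
            fga += 1
            points += MAKES[outcome]
        elif outcome in MISSES:
            fga += 1
        elif outcome == "Turnover":
            turnovers += 1
    plays = len(sequences)
    return {
        "Plays": plays,
        "Points": points,
        "PPP": points / plays if plays else math.nan,
        "FGM": fgm,
        "FGA": fga,
        "FG%": 100 * fgm / fga if fga else math.nan,
        "TO%": 100 * turnovers / plays if plays else math.nan,
    }


# The two Spot-Up sequences from the play type example above.
spot_up_sequences = [
    ['25 Fru Che', 'Spot-Up', 'Drives Right', 'To Basket', 'Turnover'],
    ['3 Devonn Allen', 'Spot-Up', 'No Dribble Jumper', 'Guarded', 'Long/3pt', 'Make 3 Pts'],
]
print(tally(spot_up_sequences))
```

Sequences that end in a pass (for example "Ball Delivered") count as plays but contribute no shot attempt or turnover in this sketch.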

Player dictionary (sequences of events)

{ 'player': [play sequence 1, play sequence 2, ...]}

See example
{
'11 Grant Robinson': [['11 Grant Robinson', 'Spot-Up', 'No Dribble Jumper', 'Guarded', 'Long/3pt', 'Miss 3 Pts'], ['11 Grant Robinson', 'Spot-Up', 'Drives Right', 'To Basket', 'Make 2 Pts'], ... ],
'25 Fru Che': [['25 Fru Che', 'Spot-Up', 'No Dribble Jumper', 'Guarded', 'Long/3pt', 'Miss 3 Pts'], ['25 Fru Che', 'Spot-Up', 'No Dribble Jumper', 'Guarded', 'Long/3pt', 'Make 3 Pts'], ... ],
'20 Josh Chery': [['20 Josh Chery', 'Spot-Up', 'Drives Left', 'To Basket', 'Miss 2 Pts'], ['20 Josh Chery', 'Transition', 'Right Wing', 'No Dribble Jumper', 'Open', 'Long/3pt', 'Miss 3 Pts'], ['20 Josh Chery', 'Post-Up', 'Left Block', 'Left Shoulder', 'Dribble Move', 'Defense Commits', 'Ball Delivered'], ... ]
...
}
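
A player dictionary of this shape makes it straightforward to query by player and, optionally, by play type. The sketch below shows one way this could look. The helper names `group_by_player` and `query` are hypothetical, and it assumes the first token of each sequence is the player and the second is the play type, as in the examples above.

```python
from collections import defaultdict


def group_by_player(sequences):
    """Build {'player': [sequence, ...]} keyed on each sequence's first token."""
    by_player = defaultdict(list)
    for seq in sequences:
        if seq:
            by_player[seq[0]].append(seq)
    return dict(by_player)


def query(by_player, player, play_type=None):
    """Return a player's sequences, optionally restricted to one play type."""
    sequences = by_player.get(player, [])
    if play_type is None:
        return list(sequences)
    return [seq for seq in sequences if len(seq) > 1 and seq[1] == play_type]


# Sequences copied from the player example above.
all_sequences = [
    ['20 Josh Chery', 'Spot-Up', 'Drives Left', 'To Basket', 'Miss 2 Pts'],
    ['20 Josh Chery', 'Transition', 'Right Wing', 'No Dribble Jumper', 'Open', 'Long/3pt', 'Miss 3 Pts'],
    ['20 Josh Chery', 'Post-Up', 'Left Block', 'Left Shoulder', 'Dribble Move', 'Defense Commits', 'Ball Delivered'],
    ['11 Grant Robinson', 'Spot-Up', 'Drives Right', 'To Basket', 'Make 2 Pts'],
]
players = group_by_player(all_sequences)
chery_spot_ups = query(players, '20 Josh Chery', play_type='Spot-Up')
print(len(players['20 Josh Chery']), len(chery_spot_ups))
```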

Player dictionary (statistics)

{ 'player': {'stat 1': ##.##, 'stat 2': ##.##, 'stat 3': ##.##, ...}}

See example
{
'11 Grant Robinson': {'Plays/Game': 19.0, 'Points': 17.0, 'PPP': 0.8947368421052632, 'FGM': 7.0, 'FGA': 13.0, 'FG%': 53.84615384615385, 'aFG%': 65.38461538461539, 'TO%': 5.263157894736842, 'FT%': nan},
'25 Fru Che': {'Plays/Game': 10.0, 'Points': 20.0, 'PPP': 2.0, 'FGM': 2.0, 'FGA': 6.0, 'FG%': 33.33333333333333, 'aFG%': 108.33333333333333, 'TO%': 0.0, 'FT%': 87.5}, 
'15 Tim Mccarthy': {'Plays/Game': 7.0, 'Points': 10.0, 'PPP': 1.4285714285714286, 'FGM': 2.0, 'FGA': 3.0, 'FG%': 66.66666666666666, 'aFG%': 133.33333333333331, 'TO%': 0.0, 'FT%': 100.0}, 
'1 Garrett Day': {'Plays/Game': 15.0, 'Points': 10.0, 'PPP': 0.6666666666666666, 'FGM': 3.0, 'FGA': 9.0, 'FG%': 33.33333333333333, 'aFG%': 55.55555555555556, 'TO%': 6.666666666666667, 'FT%': 0.0}, 
'20 Josh Chery': {'Plays/Game': 13.0, 'Points': 4.0, 'PPP': 0.3076923076923077, 'FGM': 2.0, 'FGA': 6.0, 'FG%': 33.33333333333333, 'aFG%': 33.33333333333333, 'TO%': 0.0, 'FT%': nan},
...
}
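
Because some entries contain `nan` (for example FT% when a player attempted no free throws) and unrounded floats, a display layer needs to handle both before showing a table. The following is a small sketch of one possible approach, not the app's actual code. The helpers `format_stat` and `rank_by` are hypothetical names.

```python
import math


def format_stat(value, digits=1):
    """Round a statistic for display; show undefined values as 'n/a'."""
    if isinstance(value, float) and math.isnan(value):
        return "n/a"
    return f"{value:.{digits}f}"


def rank_by(stats, key):
    """Return names sorted by a statistic, highest first, with nan last."""
    def sort_key(name):
        value = stats[name][key]
        missing = math.isnan(value)
        return (missing, 0.0 if missing else -value)
    return sorted(stats, key=sort_key)


# A subset of the player statistics example above.
player_stats = {
    '11 Grant Robinson': {'PPP': 0.8947368421052632, 'FT%': math.nan},
    '25 Fru Che': {'PPP': 2.0, 'FT%': 87.5},
    '20 Josh Chery': {'PPP': 0.3076923076923077, 'FT%': math.nan},
}
for name in rank_by(player_stats, 'PPP'):
    row = player_stats[name]
    print(name, format_stat(row['PPP'], 2), format_stat(row['FT%']))
```

Sorting on a `(missing, -value)` pair keeps undefined values at the bottom of the ranking instead of letting `nan` comparisons produce an arbitrary order.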

Usage

Wrangling Testing: python synergy_parse.py amherst sequence_dump
Staging Testing: streamlit run app.py
