Machine Learning in the AFL

Discussion in 'Aussie Rules Football Discussion' started by Blake, Apr 2, 2016.

  1. Mousey AJ Son

    Does this include results from previous seasons? If so it looks pretty bloody good.
     
  2. Blake BE Quilty

    No variables apart from team vs team results. This uses an ELO rating system for each team, similar to the one used in chess to rate players. A team gains more points for beating a team rated above it, and loses more points for losing to a team rated below it. The good thing about this is that it can be run historically to get an idea of which team was most dominant over its peers. I might try to collect some more data and see which teams come out on top.

    Yep! Started from 2000 (although because teams' ratings reset to a degree each season, it would have virtually no impact). However, ratings from 2014/13/12 are still taken into account in this model.
     
  3. Blake BE Quilty

    I've spent today adding 1960-1999 into the database. Looking from 1960 forwards, we can now do a historical comparison between teams of different eras:

    [​IMG]
     
  4. Blake BE Quilty

    BEBot 1.0 tips this weekend's round of footy:

    North Melbourne v Western Bulldogs
    Melbourne v St Kilda
    Adelaide Crows v Fremantle
    GWS Giants v Hawthorn
    Richmond v Port Adelaide
    Geelong Cats v Gold Coast Suns
    Brisbane Lions v Sydney Swans
    Carlton v Essendon
    West Coast Eagles v Collingwood
     
  5. Blake BE Quilty

    BEBot 1.1 in progress - now factors in game margins when adjusting teams' ratings after a win or loss. This probably won't make much sense immediately, but this is the process I am using to optimise the parameters of my ELO system. One of the main factors is the K value, or recency bias as I've dubbed it.

    How the K value affects ELO:

    New_Rating = Old_Rating + K * (Result - Expected)

    So, if a team with a rating of 1500 is given a 50% chance of victory based on the ELO ratings, and wins the game (Result = 1), the formula would look like:

    New_Rating = 1500 + K * (1 - 0.5)
    New_Rating = 1500 + 0.5K
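
    For anyone following along, the update above can be sketched in Python roughly like this. The expected_score curve (logistic, 400-point scale) is the standard chess form and is an assumption here, since the post doesn't say which curve BEBot uses. Scoring a draw as 0.5 and using K = 32 are placeholders too.

```python
def expected_score(rating, opponent_rating, scale=400.0):
    """Chess-style Elo win probability (assumed; not confirmed as BEBot's curve)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / scale))


def update_rating(old_rating, result, expected, k):
    """New_Rating = Old_Rating + K * (Result - Expected).

    result is 1 for a win, 0 for a loss, and (by assumption) 0.5 for a draw.
    """
    return old_rating + k * (result - expected)


# Two evenly rated sides, and the 1500-rated team wins.
expected = expected_score(1500, 1500)                  # 0.5
new_rating = update_rating(1500, 1, expected, k=32)    # 1500 + 0.5 * 32
print(expected, new_rating)
```

    With K = 32 the winner moves up by 16 points. Doubling K doubles that jump, which is why K behaves like a recency bias.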

    With BEBot 1.1 in progress, I am again modifying this ratings process:

    New_Rating = Old_Rating + K * (Result - Expected) * Margin_Adjustment

    This Margin_Adjustment is a new feature. Ideally, with a margin of 0, the adjustment would be equal to 1, meaning that the K value stays the same. The trick is now finding the best function for the margin adjustment, such that a victory of 100 points increases the rating by an additional... well, how many really should it?

    Here's an example function I've used for BEBot 1.1:

    Margin_Adjustment = (our_score/their_score)^alpha

    For a winning side (score ratio above 1), as alpha increases the margin adjustment increases as well, giving the margin more impact on the ratings change.
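
    A rough Python sketch of that margin adjustment, using the formulas exactly as posted. The function and parameter names are made up for illustration, and the K and alpha values are placeholders rather than BEBot's tuned ones.

```python
def margin_adjustment(our_score, their_score, alpha):
    """(our_score / their_score) ^ alpha. Assumes their_score is above zero."""
    return (our_score / their_score) ** alpha


def update_rating_with_margin(old_rating, result, expected, k,
                              our_score, their_score, alpha):
    """New_Rating = Old_Rating + K * (Result - Expected) * Margin_Adjustment."""
    adjustment = margin_adjustment(our_score, their_score, alpha)
    return old_rating + k * (result - expected) * adjustment


# Level scores give an adjustment of exactly 1, so K is left unchanged.
print(margin_adjustment(80, 80, alpha=0.7))

# Doubling the opponent's score with alpha = 1 doubles the rating change.
print(update_rating_with_margin(1500, 1, 0.5, 32, 100, 50, alpha=1))
```

    Note that for the losing side the ratio is below 1, so the adjustment shrinks as the loss gets heavier. Whether BEBot applies the winner's ratio to both teams is not stated in the post.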

    Plotting different values of K and alpha using this function gives us the following heatmap, with darker values representing an improved prediction accuracy:

    [​IMG]

    It can be seen that as K gets too high, the ELOs go a little crazy and adjust way too much to the most recent results, lowering our prediction accuracy. Where does the true balance lie?
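
    One way to produce the numbers behind a heatmap like that is a plain grid search: replay the results for every (K, alpha) pair and record how often the higher-rated side won. This is only a sketch under assumptions. The expected-score curve is the standard chess one, the margin adjustment uses the winner/loser score ratio for both teams, and the handful of games are made up just to exercise the code.

```python
from itertools import product


def expected_score(rating, opponent_rating, scale=400.0):
    # Standard chess-style curve; assumed, the post doesn't specify one.
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / scale))


def tipping_accuracy(games, k, alpha, start=1500.0):
    """Replay games in order, tip the higher-rated side, return the hit rate."""
    ratings = {}
    correct = tipped = 0
    for home, away, home_score, away_score in games:
        r_home = ratings.get(home, start)
        r_away = ratings.get(away, start)
        exp_home = expected_score(r_home, r_away)
        # Only count games with a clear favourite and a clear winner.
        if exp_home != 0.5 and home_score != away_score:
            tipped += 1
            correct += (exp_home > 0.5) == (home_score > away_score)
        if home_score > away_score:
            result_home = 1.0
        elif home_score < away_score:
            result_home = 0.0
        else:
            result_home = 0.5
        # Assumption: winner/loser ratio, so a bigger win moves both sides further.
        # Both scores are assumed to be above zero.
        ratio = max(home_score, away_score) / min(home_score, away_score)
        delta = k * (result_home - exp_home) * ratio ** alpha
        ratings[home] = r_home + delta
        ratings[away] = r_away - delta
    return correct / tipped if tipped else 0.0


# Made-up results: (home, away, home_score, away_score).
games = [
    ("A", "B", 100, 60),
    ("B", "C", 90, 85),
    ("C", "A", 70, 110),
    ("A", "B", 95, 80),
    ("C", "B", 88, 75),
]

grid = {
    (k, alpha): tipping_accuracy(games, k, alpha)
    for k, alpha in product([10, 20, 40], [0.0, 0.5, 1.0])
}
best_k, best_alpha = max(grid, key=grid.get)
print(best_k, best_alpha, grid[(best_k, best_alpha)])
```

    On real data the grid would be much finer and the accuracy would be measured over many seasons. Five toy games say nothing about where the true balance lies.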
     
  6. Blake BE Quilty

    Some slightly disappointing results - I was hoping for something interesting here. This graph shows how far each team finished from its season average this week, grouped by its result from last week. I was hoping that teams with big losses would actually improve on their season average through something like a 'rebound' effect, but the results seem fairly random. Teams winning by 36-60 points get complacent the week after, but teams winning by 61+ are just in good form and continue to play well? Statistical variation? Who knows :p

    [​IMG]
     
    Last edited: Apr 27, 2016
  7. Blake BE Quilty

    BEBot 2.0 complete. Instead of performing rating adjustments based off a win, loss or draw, a new function is used which calculates an expected margin given the ELO difference of the two sides. If a team performs better than this expected margin, their ELO rating goes up - otherwise, it goes down.
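
    The post doesn't give the expected-margin function, so the following is only a sketch of the general idea with a made-up linear mapping. The names rating_per_point and k, and their values, are hypothetical and not BEBot's.

```python
def expected_margin(rating, opponent_rating, rating_per_point=10.0):
    """Hypothetical linear mapping: every 10 rating points is worth 1 point of margin."""
    return (rating - opponent_rating) / rating_per_point


def update_on_margin(rating, opponent_rating, actual_margin,
                     k=0.5, rating_per_point=10.0):
    """Move the rating by how far the actual margin beat (or missed) the expectation."""
    surprise = actual_margin - expected_margin(rating, opponent_rating, rating_per_point)
    return rating + k * surprise


# A 1600-rated side is expected to beat a 1500-rated side by 10 points here.
print(expected_margin(1600, 1500))
# Winning by only 4 is under expectation, so the rating drops despite the win.
print(update_on_margin(1600, 1500, actual_margin=4))
# Winning by 30 beats expectation, so the rating rises.
print(update_on_margin(1600, 1500, actual_margin=30))
```

    The notable difference from a win/loss update is that a favourite can win narrowly and still lose rating points.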

    It's given me about a 0.3-0.5% increase in accuracy when comparing results from 1995-2016. Running it on the 2015 season now results in a performance of 140/197 (71.066%), one tip ahead of the best Herald Sun human tipster for the year. However, it only averages 66.9% when considering all games from 1995 onwards, and 67.7% when considering games from 2005 onwards.
     
    Last edited: May 2, 2016
  8. Blake BE Quilty

    Adjusted ELO ratings, post 2016 Round 5:

    [​IMG]
     
  9. Blake BE Quilty

    6/9 for the round, with Melbourne, Hawthorn and Essendon letting down BEBot 1.0.

    The tips that BEBot 2.0 would have made differently:
    Western Bulldogs instead of North Melbourne - WRONG
    Carlton instead of Essendon - CORRECT
     
  10. Blake BE Quilty

    Predicted margins and ratings change after Round 6:

    [​IMG]
     
  11. Blake BE Quilty

    BEBot 2.0 2016 Round 7 Tips:

    Richmond v Hawthorn
    Collingwood v Carlton
    Geelong v West Coast
    Sydney v Essendon
    Gold Coast v Melbourne
    Western Bulldogs v Adelaide
    Fremantle v GWS
    St Kilda v North Melbourne
    Port Adelaide v Brisbane Lions
     
  12. Mousey AJ Son

    Does it take into account home field advantage yet in tips?
     
  13. Blake BE Quilty

    Nope. It's still fairly basic, and doesn't take home advantage/rest/other factors into consideration yet. Despite that, I think it's still performing pretty well so hopefully these factors only continue to improve it.
     
    Last edited: May 3, 2016
  14. Blake BE Quilty

    As quoted from MatterOfStats,

    "These days, I reckon I know what a good margin forecaster looks like. Any person or algorithm - and I'm still at the point where I think there's a meaningful distinction to be made there - who (that?) can consistently predict margins within 30 points of the actual result is unarguably competent. That benchmark is based on the empirical performances I've seen from others and measured for my own forecasting models across the last decade of analysing football."

    I've decided to give my (slightly tweaked) algorithm a run at predicting margins and measured its average error. It looks like I'm still a little bit off the elusive <= 30 point zone, but it's still performing very reasonably. Here are the results over some different eras:

    [​IMG]
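
    The benchmark in that quote is usually read as a mean absolute error on the margin, which is simple to compute. A minimal sketch, assuming that reading, with made-up margins from the home side's point of view:

```python
def mean_absolute_error(predicted_margins, actual_margins):
    """Average size of the miss, ignoring whether it was over or under."""
    errors = [abs(p - a) for p, a in zip(predicted_margins, actual_margins)]
    return sum(errors) / len(errors)


# Made-up example: four games, margins in points (negative = away side by that much).
predicted = [10, -5, 30, 0]
actual = [40, 10, 25, -10]
print(mean_absolute_error(predicted, actual))  # misses of 30, 15, 5 and 10
```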
     
    Last edited: May 3, 2016
  15. Blake BE Quilty

    Now to answer the question: which team was worse upon introduction to their competition - Gold Coast in 2011, or GWS Giants in 2012?

    [VIDEO] https://www.youtube.com/embed/wv2hr42Nxvo

    [VIDEO] https://www.youtube.com/embed/uzfXdtH-XCU

    By testing the model's performance using a variety of different ELO ratings for GWS/GC upon introduction to the competition, the results are as follows:

    [​IMG]

    GWS were narrowly better than Gold Coast with an optimal introductory rating of 1381.5, ahead of Gold Coast's 1374.5.

    GWS did, however, finish last for their first 2 seasons...
     
    Last edited: May 3, 2016
  16. Blake BE Quilty

    BEBot v2.1 is joining its robot companions and leaving the humans behind...

    [​IMG]
     
  17. Blake BE Quilty

    I'm now working on a home advantage algorithm. Seems to be very efficient. I've decided to scrap the typical distance travelled/familiarity approach and instead go with one that measures whether a team has over- or under-performed the model's prediction at each ground.

    I've made a bar graph of every team's performance at every ground used in the last 10 years, but I'm not going to post 18 images - if you're interested then request a team and I'll post it up. Here's a table of the top 25 ground advantages, and the bottom 25 ground performers...

    Difference is the change in expected margin at that ground when playing a team rated as average (1500).

    TOP 25
    [​IMG]

    BOTTOM 25
    [​IMG]

    The biggest ground discrepancy is between West Coast and Western Bulldogs at Subiaco. If both teams were equally rated, West Coast would expect a 17.8 point advantage on average - a 3 goal start, effectively.
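
    The over/under-performance idea can be sketched as averaging the gap between actual and predicted margin for each (team, ground) pair. This is a guess at the general shape, not the actual BEBot code, and the records below use placeholder teams and made-up margins.

```python
from collections import defaultdict


def ground_advantages(records):
    """records: iterable of (team, ground, predicted_margin, actual_margin).

    Returns the mean of (actual - predicted) per (team, ground); a positive
    value means the team tends to beat the model's expectation at that ground.
    """
    residuals = defaultdict(list)
    for team, ground, predicted, actual in records:
        residuals[(team, ground)].append(actual - predicted)
    return {key: sum(values) / len(values) for key, values in residuals.items()}


# Made-up records with placeholder names.
records = [
    ("Team A", "Ground X", 10, 30),
    ("Team A", "Ground X", -5, 5),
    ("Team B", "Ground X", 0, -8),
]
for (team, ground), advantage in sorted(ground_advantages(records).items()):
    print(team, ground, advantage)
```

    With only a few games at a ground the average will be noisy, so in practice some minimum sample size per pair would be sensible.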
     
    Last edited: May 6, 2016
  18. lpd New Member

    The website seems inaccessible now... are there any copies of the dataset floating around?
     
  19. Blake BE Quilty

    After optimising the crap out of BEBot I feel like it has hit its limits, averaging a shade under 30 points of error per game when predicting the margin of the match. I've started work on a new model and have had some unbelievable results... to the point where I feel like I might have screwed up in implementing the testing. Let's just say this will be blowing 70% out of the park if it is actually legitimate. Edit: Turns out I did mess it up, as I was expecting. Importantly, though, it is still delivering an improvement over BEBot!
     
    Last edited: May 16, 2016
  20. Blake BE Quilty

    Retrospectively fitting tips from the old model (BEBot v5.0). This will most likely be the last we see from it.

    Round 7
    Richmond vs. Hawthorn, M.C.G., Prediction: Hawthorn by 20 CORRECT
    Collingwood vs. Carlton, M.C.G., Prediction: Collingwood by 12 WRONG
    Geelong vs. West Coast, Kardinia Park, Prediction: Geelong by 2 CORRECT
    Sydney vs. Essendon, S.C.G., Prediction: Sydney by 33 CORRECT
    Gold Coast vs. Melbourne, Carrara, Prediction: Melbourne by 4 CORRECT
    Western Bulldogs vs. Adelaide, Docklands, Prediction: Western Bulldogs by 4 CORRECT
    Fremantle vs. Greater Western Sydney, Subiaco, Prediction: Greater Western Sydney by 10 CORRECT
    St Kilda vs. North Melbourne, Docklands, Prediction: North Melbourne by 24 CORRECT
    Port Adelaide vs. Brisbane Lions, Adelaide Oval, Prediction: Port Adelaide by 15 CORRECT

    8/9 in an admittedly somewhat easy round for tipping.

    Round 8
    Adelaide vs. Geelong, Adelaide Oval, Prediction: Geelong by 8 CORRECT
    Essendon vs. North Melbourne, Docklands, Prediction: North Melbourne by 32 CORRECT
    Hawthorn vs. Fremantle, York Park, Prediction: Hawthorn by 22 CORRECT
    Greater Western Sydney vs. Gold Coast, Sydney Showground, Prediction: Greater Western Sydney by 18 CORRECT
    Richmond vs. Sydney, M.C.G., Prediction: Sydney by 24 WRONG
    Brisbane Lions vs. Collingwood, Gabba, Prediction: Collingwood by 14 CORRECT
    Carlton vs. Port Adelaide, Docklands, Prediction: Port Adelaide by 19 WRONG
    Melbourne vs. Western Bulldogs, M.C.G., Prediction: Western Bulldogs by 15 CORRECT
    West Coast vs. St Kilda, Subiaco, Prediction: West Coast by 28 CORRECT

    7/9

    Definitely some room for improvement in the margin tipping. However, it's a fairly impressive display from a model tipping autonomously. I don't think I would have made many different tips if I were tipping myself.

    Next, I'm going to be posting up the equivalent tips from my new model.
     
    Last edited: May 16, 2016
