Football Events

174.44M
Arts and Entertainment,Games,Football Classification


    README.md

Context
-------------

Most publicly available football (soccer) statistics are limited to aggregated data such as Goals, Shots, Fouls, Cards. When assessing performance or building predictive models, this simple aggregation, without any context, can be misleading. For example, a team that produced 10 shots on target from long range has a lower chance of scoring than a club that produced the same number of shots from inside the box. However, metrics derived from this simple count of shots will assess the two teams similarly.

A football game generates many more events, and it is both important and interesting to take into account the context in which those events were generated. This dataset should keep sports analytics enthusiasts awake for long hours, as the number of questions that can be asked is huge.

Content
-------

This dataset is the result of a very tiresome effort of web scraping and integrating different data sources. The central element is the text commentary. All the events were derived by reverse engineering the text commentary, using regex. Using this, I was able to derive 11 types of events, as well as the main player and secondary player involved in those events and many other statistics. In case I've missed extracting some useful information, you are gladly invited to do so and share your findings.

The dataset provides a granular view of 9,074 games, totaling 941,009 events from the biggest 5 European football (soccer) leagues: England, Spain, Germany, Italy, France, from the 2011/2012 season to the 2016/2017 season as of 25.01.2017. There are games played during these seasons for which I could not collect detailed data. Overall, over 90% of the games played during these seasons have event data.

The dataset is organized in 3 files:

- **events.csv** contains event data about each game. Text commentary was scraped from: bbc.com, espn.com and onefootball.com
- **ginf.csv** contains metadata and market odds about each game. Odds were collected from oddsportal.com
- **dictionary.txt** contains a dictionary with the textual description of each categorical variable coded with integers

Past Research
-------------

I have used this data to:

- create predictive models for football games in order to bet on football outcomes
- make visualizations about upcoming games
- build expected goals models and compare players

Inspiration
-----------

There are tons of interesting questions a sports enthusiast can answer with this dataset. For example:

- What is the value of a shot? Or what is the probability of a shot being a goal given its location, shooter, league, assist method, gamestate, number of players on the pitch, and time? These are known as expected goals (xG) models.
- When are teams more likely to score?
- Which teams are the best or sloppiest at holding the lead?
- Which teams or players make the best use of set pieces?
- In which leagues is the referee more likely to give a card?
- How do players compare when they shoot with their weak foot versus strong foot? Or which players are ambidextrous?
- Identify different styles of play (shooting from long range vs shooting from the box, crossing the ball vs passing the ball, use of headers)
- Which teams have a bias for attacking on a particular flank?

And many many more...
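
As a rough illustration of the first question (the value of a shot), the sketch below computes an empirical conversion rate per shot location, which is about the simplest possible baseline for an expected goals model. It is not the author's method. The column names (`game_id`, `is_shot`, `location`, `is_goal`), the location labels and the sample rows are all assumptions made up for this example; the real header of events.csv and the integer codes in dictionary.txt should be checked before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the spirit of events.csv. The column names and
# the location labels are assumptions for illustration only; the real file
# may name and encode these fields differently (see dictionary.txt).
SAMPLE = """game_id,is_shot,location,is_goal
g1,1,box,1
g1,1,box,0
g1,1,long_range,0
g1,0,,0
g2,1,long_range,0
g2,1,box,1
g2,1,long_range,1
g2,1,long_range,0
"""


def conversion_by_location(rows):
    """Return {location: (shots, goals, goals / shots)} for shot events."""
    shots = defaultdict(int)
    goals = defaultdict(int)
    for row in rows:
        if row["is_shot"] != "1":
            continue  # skip non-shot events such as fouls or cards
        shots[row["location"]] += 1
        goals[row["location"]] += int(row["is_goal"])
    return {loc: (n, goals[loc], goals[loc] / n) for loc, n in shots.items()}


rates = conversion_by_location(csv.DictReader(io.StringIO(SAMPLE)))
for loc, (n, g, rate) in sorted(rates.items()):
    print(f"{loc}: {g}/{n} shots scored ({rate:.2f})")
```

To try the same idea on the real data, the in-memory sample could be swapped for `csv.DictReader(open("events.csv", newline=""))`, with the assumed column names replaced by the actual ones. A count-based rate like this ignores the shooter, assist method and game state mentioned above, so it is only a starting point.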