2019 Australian Open Final

News,Sports,Tennis Classification


    README.md

### Context

Nowadays, in most sports, either tracking or event data is available for sports data scientists to analyse leagues, teams, games or players. For example, in soccer, event-based data is available for all major leagues from professional data providers like [Opta](https://www.optasports.com/), [Statsbomb](https://statsbomb.com/) or [Wyscout](https://wyscout.com/).

For tennis this is different. Even though camera-based tracking with Hawk-Eye is possible, this data is not available to the public, and only the largest courts are equipped with the system. When I think about the latest breakthroughs in machine learning in image classification, detection, NLP ([deepl.com](https://www.deepl.com/translator)) and audio recognition ([Siri](https://www.apple.com/siri/), [Alexa](https://en.wikipedia.org/wiki/Amazon_Alexa)), it is evident that all of these areas have a huge amount of *easily accessible* data. Personally, I expect that there would be far more research in tennis if a large amount of match data were freely available.

Statistics for all matches played on the ATP Tour are available from several sources. For example, Jeff Sackmann's [github repository](https://github.com/JeffSackmann) is a great place to start. He also runs a [match charting project](http://www.tennisabstract.com/blog/2015/09/23/the-match-charting-project-quick-start-guide/) in which point-by-point data is collected.

But when I think about tennis, it is about the movement of the players, their tactics, etc. It is the ball movement, the actual rallies and shots that I want to be able to see and analyse. Event data captures positional, temporal and stroke information. As a proof of concept, and as a tribute to Novak Djokovic and Rafael Nadal, two of the greatest tennis players of all time, I manually annotated each rally and stroke of their [Australian Open final 2019](https://www.atptour.com/en/scores/2019/580/MS001/match-stats?isLive=False). Fortunately for me, it only lasted three sets.

### Content

The data consists of all points played in the match. It is built hierarchically from **events**, to **rallies**, to actual **points**.

- **Points**: A list of all points played in the final, with information about the server, receiver, point type, number of strokes, time of rally and new score of the game.
- **Rallies**: A list of all rallies with server, returner, etc.
- **Events**: Each time a player hit the ball, the stroke type, the position of the player and the position of the opponent were recorded.
- **Serves**: For each successful serve (one that was not a fault), the position of the serve in the service box was recorded whenever possible.

I have already done the hard part of data cleaning, and the dataset is hopefully easy to understand and ready to use.

### Positions

The x, y positions are given with respect to the court coordinate system shown in Figure 1. They were calculated from the pixel coordinates through a [direct linear transformation][1] determined at the beginning of the match. (As the camera angle changed a bit during the match, some of the positions are off.)

![The court coordinate system. The horizontal axis refers to x and the vertical axis to the y-direction.][2]

### Inspiration

Look into the data and see what you can find. Is there information in the positional, temporal and stroke data that can tell you more about the players and the match than simple match sheet statistics like the number of break points or first serves in?

You can use the dataset however you want, but here are some things you could start with.

- It is a great way to practice pandas: generate general statistics like points played, serve percentages, games won, break points, etc., and compare them with the statistics from other websites.
- You can visualize the spatial positioning of the players on the court. For example, is there a difference between the return positions of Nadal and Djokovic?
- You can calculate movement statistics like distance covered.
- You can calculate the percentage of forehands and backhands, or of shot types like slice and topspin, for each player.
- You can find out where the players are serving to. (Do not forget that Nadal is a lefty.)

To get you started, I have created a sample kernel. Find it [here](https://www.kaggle.com/robseidl/australian-open-mens-final-2019-data-exploration).

[1]: https://en.wikipedia.org/wiki/Direct_linear_transformation
[2]: https://www.dropbox.com/s/gakg677f0uvhmb2/Screenshot%202019-03-02%2021.44.11.png?raw=1
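The positions in this dataset are described as coming from a direct linear transformation of pixel coordinates. The exact procedure used for the annotation is not documented here, so the following is only a minimal sketch of the general idea: fit a planar homography from four pixel/court point pairs and use it to map any pixel to court coordinates. The function names (`solve_linear`, `fit_homography`, `pixel_to_court`), the pixel coordinates of the court corners, and the choice of origin (one corner of the doubles court, in meters) are all assumptions made for illustration. Only the court size (23.77 m by 10.97 m for a doubles court) is a standard value.

```python
# Sketch only: the pixel corners below are made up, not taken from the broadcast,
# and the court origin/orientation is an assumption, not the dataset's convention.

def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(pixels, court):
    """Estimate a 3x3 homography (fixing h33 = 1) from four point pairs."""
    a, b = [], []
    for (u, v), (x, y) in zip(pixels, court):
        # x * (h31*u + h32*v + 1) = h11*u + h12*v + h13, and likewise for y
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def pixel_to_court(h, u, v):
    """Map one pixel (u, v) to court coordinates (x, y) using homography h."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    x = (h[0][0] * u + h[0][1] * v + h[0][2]) / w
    y = (h[1][0] * u + h[1][1] * v + h[1][2]) / w
    return x, y


# Hypothetical pixel positions of the four doubles-court corners in one frame:
# far-left, far-right, near-right, near-left.
pixels = [(420, 210), (860, 210), (1080, 640), (200, 640)]
# Matching court coordinates in meters (assumed origin: near-left corner).
court = [(0.0, 23.77), (10.97, 23.77), (10.97, 0.0), (0.0, 0.0)]

H = fit_homography(pixels, court)
print(pixel_to_court(H, 640, 640))  # a pixel on the near baseline
```

Because the homography is fitted once from a single frame, any later change in camera angle shifts the mapped positions, which is consistent with the note above that some positions are off. With more than four reference points, a least-squares fit would be the usual way to reduce that sensitivity.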