Dataset size: 471.84 GB
README.md
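The dataset totals about 472 GB and contains 119,197 videos. Each video is 10 seconds long, but frame rates range from 15 to 30 fps and resolutions from 320x240 to 3840x2160. Of the training videos, 19,197 are real footage of roughly 430 actors; the remaining 100,000 are fake-face videos generated from the real ones. The fakes were produced with DeepFakes, GAN-based methods, and some non-learned methods, so that the dataset covers as wide a range of fake-face videos as possible. The videos include audio, which most current datasets lack, but the audio is not annotated. According to the official site and the Kaggle competition leaderboard, the current state-of-the-art loss is about 0.42, which leaves considerable room for improvement. The compute requirements are high, however: based on our survey, some participants used more than eight V100 GPUs, and as a result few papers use this dataset.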
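To put the leaderboard loss of about 0.42 in context, the following minimal sketch computes a mean binary log loss. It assumes the competition score is standard binary cross-entropy over the predicted probability that a video is FAKE; the function name `binary_log_loss` and the toy inputs are illustrative and do not come from the competition code.

```python
import math

def binary_log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy. labels: 1 for FAKE, 0 for REAL (assumed encoding)."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip probabilities so that log(0) can never occur.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# A model that always answers 0.5 scores ln(2), about 0.693, on any label mix.
baseline = binary_log_loss([1, 0, 1, 1], [0.5, 0.5, 0.5, 0.5])
print(round(baseline, 3))
```

Under that assumption, a score of 0.42 is better than the uninformed 0.5 baseline, but still far from the 0 that a perfect classifier would reach.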
This competition is closed for submissions. Participants' selected code submissions were re-run by the host on a privately-held test set and the private leaderboard results have been finalized. Late submissions will not be opened, due to an inability to replicate the unique design of this competition.
Training Set
This code competition's training set is not available directly on Kaggle, as its size is prohibitively large to train in Kaggle. Instead, it's strongly recommended that you train offline and load the externally trained model as an external dataset into Kaggle Notebooks to perform inference on the Test Set. Review Getting Started for more detailed information.
The full training set is just over 470 GB. We've made it available as one giant file, as well as 50 smaller files, each ~10 GB in size. You must accept the competition's rules to gain access to any of the links below.
Files
- train_sample_videos.zip - a ZIP file containing a sample set of training videos and a metadata.json with labels. The full set of training videos is available through the links provided above.
- sample_submission.csv - a sample submission file in the correct format.
- test_videos.zip - a ZIP file containing a small set of videos to be used as a public validation set.
To understand the datasets available for this competition, review the Getting Started information.
Columns
- filename - the filename of the video
- label - whether the video is REAL or FAKE
- original - in the case that a train set video is FAKE, the original video is listed here
- split - this is always equal to "train".
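The metadata fields described above can be explored with a short script. This is a sketch only: it assumes metadata.json is a JSON object keyed by filename whose values hold `label`, `original`, and `split`, and it uses a hypothetical three-entry excerpt with made-up filenames in place of the real file.

```python
import json
from collections import Counter

# Hypothetical excerpt; the real metadata.json layout is assumed, not confirmed here.
sample_metadata = json.loads("""
{
  "aaa.mp4": {"label": "FAKE", "split": "train", "original": "ccc.mp4"},
  "bbb.mp4": {"label": "FAKE", "split": "train", "original": "ccc.mp4"},
  "ccc.mp4": {"label": "REAL", "split": "train", "original": null}
}
""")

def summarize(metadata):
    """Count labels and group FAKE videos under the real video they derive from."""
    label_counts = Counter(entry["label"] for entry in metadata.values())
    fakes_by_original = {}
    for filename, entry in metadata.items():
        if entry["label"] == "FAKE" and entry.get("original"):
            fakes_by_original.setdefault(entry["original"], []).append(filename)
    return label_counts, fakes_by_original

label_counts, fakes_by_original = summarize(sample_metadata)
print(label_counts)
print(fakes_by_original)
```

Grouping fakes by `original` is useful when building a validation split, since keeping a real video and its derived fakes in the same fold avoids leaking identities across folds.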