GitHub issue titles and descriptions for NLP analysis: over 8 million GitHub issue titles and descriptions
Over 8 million GitHub issue titles and descriptions from 2017. Prepared from instructions at How To Create Data Products... NLP, Software Classification
2.85G
790
David Shinn
Kaggle 2018 gender pay gap survey, published as a JMP dataset
Business, Computer Science Classification
116.76M
782
SAS&JMP fun&fan
Kickstarter video games released on Steam
Finance, Investing, Games, Video Games, Crowdfunding Classification
1.88M
293
Anton Savchenko
India news dataset containing about 3.6 million events published by The Times of India
This news dataset is a persistent historical archive of notable events in the Indian subcontinent from start-2001 to q1... NLP, Arts and Entertainment, News, Cities and Urban Areas Classification
226.84M
329
Rohit Kulkarni
Retail transactions [published July 17, 2020]
Online Communities, Retail and Shopping, NLP, Data Visualization, Tabular Data, Data Cleaning Classification
1.3M
278
Jahnic Beck-Joseph
300-dimensional pretrained FastText English word vectors released by Facebook
300-dimensional pretrained FastText English word vectors released by Facebook. The first line of the file contains the nu... NLP, Arts and Entertainment, Games Classification
4.52G
383
Vladimir Demidov
2 million 300-dimensional pretrained word vectors released by Facebook, trained on Common Crawl
300-dimensional pretrained FastText English word vectors released by Facebook. The first line of the file contains the nu... NLP, Arts and Entertainment Classification
650M
448
Manish Maharjan
Brazilian technology news: the dataset contains about 2,720 news items published by Brazilian media
This dataset was provided by Miti Inteligência, a company specialized in monitoring news and social media established i... NLP, News, Text Data, Text Mining, Science and Technology Classification
44.47M
320
Miti Inteligência
Reddit Self-Post Classification Task, with more than 1,000 hand-picked categories
Welcome to the Reddit Self-Post Classification Task (RSPCT)! The aim of this dataset was to create an interesting, large... NLP, Classification, Computer Science, Multiclass Classification Classification
839.37M
341
Mike Swarbrick Jones