Public Datasets

GloVe Reddit Comments: Global Vectors for Word Representation based on Reddit comments... NLP Classification
19.1G 338
2 million 300-dimensional pretrained word vectors released by Facebook, trained on Common Crawl. 300-dimensional pretrained FastText English word vectors released by Facebook. The first line of the file contains the nu... NLP,Arts and Entertainment Classification
650M 340
Text mining and analysis on Cancer UK. Natural language processing on Cancer UK... NLP,Biology,Text Data,Health Conditions Classification
4.33M 252
6.32G 254
Trump Tweet.csv NLP,Text Data Classification
0.07M 206
BERT small first-order (伯特小一阶) Arts and Entertainment,NLP Classification
837.78M 450
Taptap reviews Games,Video Games,NLP,Deep Learning Classification
3.6M 213
Text datasets for NLP. This is a bundle of three text data sets to be used for NLP research. Dialog system technology challenge 7 (DSTC7) UbuntuA... NLP,Earth and Nature,Education Classification
6.49G 576
English and French NLP,Text Data,Languages Classification
18.33M 281
SMS Spam Ham Prediction Business,Earth and Nature,Internet,Economics,NLP Classification
0.48M 224
Warframe Steam user review data. The data is crawled from STEAM, up until April 22nd, 2019... NLP,Video Games Classification
20.22M 263
The Stanford GloVe 200d dataset, converted to word2vec format... NLP,Computer Science Classification
661.31M 550
Definite Pronoun Resolution Dataset. Context: Here's the CSV dataset for the Definite Pronoun Resolution Dataset contributed by Rahman and Ng (2012) http://ww... NLP Classification
143K 272
Reddit comment score prediction: using NLP to predict comment scores. The idea behind this dataset is to try to predict whether a particular comment would be highly up-voted or down-voted gi... NLP,Computer Science,Social Science Classification
1.8G 249
Wikipedia sentences: 7.8 million sentences collected from an English Wikipedia dump. The Wikipedia dump is a giant XML file and contains loads of not-so-useful content. I needed some English text for some... NLP,Text Mining Classification
891.28M 306
The Simpsons dialogue lines Arts and Entertainment,NLP,Text Data,Text Mining,Comics and Animation Classification
8.94M 404
Annotated GMB corpus: an annotated subset of the GMB text corpus. Named Entity Recognition for annotated corpus using GMB (Groningen Meaning Bank) corpus for entity classification with en... NLP,Exploratory Data Analysis,Classification,Random Forest Classification
1.52M 274
ConceptNet Numberbatch vectors: word vectors from ConceptNet. These are the word vectors released by the ConceptNet project. At its core, ConceptNet consists of triples:... NLP Classification
899.91M 273
Fake news classification News,NLP Classification
142.92M 477
AllenNLP package Computer Science,NLP Classification
715.44M 293