Medium Articles: posts tagged AI, machine learning, data science, or artificial intelligence, along with user information
Medium taps into the brains of the world’s most insightful writers, thinkers, and storytellers to bring you the smartes...
NLP, Text Data, Literature
Classification
1.8G
311
AiswaryaRamachandran
FakeNewsNet: data collection for fake news research (fake news, disinformation, data mining)
This is a repository for an ongoing data collection project for fake news research at ASU. We describe and compare FakeN...
NLP, News, Social Science, Social Networks
Classification
72.61M
994
Deepak Mahudeswaran
Medical Transcriptions: medical transcription data obtained from mtsamples
Medical data is extremely hard to find due to HIPAA privacy regulations. This dataset offers a solution by providing med...
NLP, Health, Medicine
Classification
16.22M
264
Tara Boyle
Reddit Self-Post Classification Task: more than 1,000 hand-picked categories
Welcome to the Reddit Self-Post Classification Task (RSPCT)! The aim of this dataset was to create an interesting, large...
NLP, Classification, Computer Science, Multiclass Classification
Classification
839.37M
281
Mike Swarbrick Jones
NLTK Reuters news files: all files in NLTK's Reuters corpus
This dataset contains the ID, categories, and raw text from each file in NLTK's Reuters corpus. Content: Each file (row...
NLP, Computer Science, News, Text Data, Text Mining
Classification
3.3M
610
Brad Boldenow
Arabic ULMFiT model: an Arabic language model based on the Arabic Wikipedia corpus
Arabic is a major world language, yet it is underrepresented on the Internet and there is a lack of resources for Arabic...
NLP, Transfer Learning, Languages
Classification
160.13M
590
Abed Khooli
Kaggle Jobs
Computer Science, Education, NLP, Recommender Systems, Search Engines
Classification
0.27M
271
AbdullahAli
ConceptNet Numberbatch vectors: word vectors from ConceptNet
These are the word vectors released by the ConceptNet project. ConceptNet is essentially a triple:...
NLP
Classification
899.91M
307
Nohman
Annotated GMB corpus: an annotated subset of the GMB text corpus
Named Entity Recognition for annotated corpus using GMB (Groningen Meaning Bank) corpus for entity classification with en...
NLP, Exploratory Data Analysis, Classification, Random Forest
Classification
1.52M
305
Shoumik
Dialogue lines from The Simpsons
Arts and Entertainment, NLP, Text Data, Text Mining, Comics and Animation
Classification
8.94M
479
Pierre Megret
Wikipedia sentences: 7.8 million sentences collected from an English Wikipedia dump
The Wikipedia dump is a giant XML file and contains loads of not-so-useful content. I needed some English text for some...
NLP, Text Mining
Classification
891.28M
335
Mike Ortman
Reddit comment score prediction: predicting comment scores with NLP
The idea behind this dataset is to try to predict whether a particular comment would be highly up-voted or down-voted gi...
NLP, Computer Science, Social Science
Classification
1.8G
274
Evan Hallmark
Definite Pronoun Resolution Dataset
Context: Here's the CSV dataset for the Definite Pronoun Resolution Dataset contributed by Rahman and Ng (2012) http://ww...
NLP
Classification
143K
296
Ariba Siddiqui