Kaggle jobs
Computer Science,Education,NLP,Recommender Systems,Search Engines Classification
0.27M
272
AbdullahAli
Arabic ULMFiT model: an Arabic language model based on the Ar Wikipedia corpus
Arabic is a major world language, yet it is underrepresented on the Internet and there is a lack of resources for Arabic...
NLP,Transfer Learning,Languages Classification
160.13M
595
Abed Khooli
NLTK Reuters news files: all files in the NLTK Reuters corpus
This dataset contains the ID, categories, and raw text from each file in NLTK's Reuters corpus. Content: Each file (row...
NLP,Computer Science,News,Text Data,Text Mining Classification
3.3M
616
Brad Boldenow
Reddit self-post classification task: over 1,000 hand-picked categories
Welcome to the Reddit Self-Post Classification Task (RSPCT)! The aim of this dataset was to create an interesting, large...
NLP,Classification,Computer Science,Multiclass Classification Classification
839.37M
283
Mike Swarbrick Jones
Medical transcriptions: medical transcription data obtained from mtsamples
Medical data is extremely hard to find due to HIPAA privacy regulations. This dataset offers a solution by providing med...
NLP,Health,Medicine Classification
16.22M
266
Tara Boyle
FakeNewsNet, a data collection for fake news research: fake news, misinformation, data mining
This is a repository for an ongoing data collection project for fake news research at ASU. We describe and compare FakeN...
NLP,News,Social Science,Social Networks Classification
72.61M
1001
Deepak Mahudeswaran
Medium Articles: posts tagged AI, machine learning, data science, or artificial intelligence, along with user information
Medium taps into the brains of the world’s most insightful writers, thinkers, and storytellers to bring you the smartes...
NLP,Text Data,Literature Classification
1.8G
314
AiswaryaRamachandran
Entity extraction from Pitchfork reviews
Business,Arts and Entertainment,Music,Retail and Shopping,NLP,Popular Culture Classification
14.49M
601
Justin K
Christmas recipes
Religion and Belief Systems,NLP,Cooking and Recipes,Holidays and Cultural Events Classification
2.51M
531
Gary Broughton
Thousands of questions about love: a dataset of love-category questions and answers from a QA service
Context: RUSSIAN LANGUAGE. This dataset was collected from real answers to questions on the mail.ru service: https://otvet.mail....
NLP,Education,Text Data,Languages Classification
176.23M
264
Boris Zubarev
ACL Anthology papers: paper data from the ACL Anthology
Accepted papers' data from the ACL Anthology. An abstract of a paper is extracted from arXiv if it exists. The data i...
NLP,Education,Literature Classification
1.14M
255
Takahiro Kubo
Email spam
Context: Some emails from [Spam Assassin][1] to create models that can differentiate between spam and ham (non-spam) ema...
NLP,Classification,Software,Email and Messaging Classification
12.08M
264
Wessel van Lit
123.13M
448
Henry Dashwood