Related datasets: Classification
Interviews
NLP,Exploratory Data Analysis,Data Cleaning,Feature Engineering,Employment Classification
4.37M
275
Rahul
Jigsaw competition datasets, with text translated into English
These datasets refer to [jigsaw competition](https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification) T...
NLP Classification
664.76M
233
Pavel Peskov
Stopwords in 28 languages, for text preprocessing in natural language processing
Stopwords are the words in any language which do not add much meaning to a sentence. They can safely be ignored withou...
NLP,Computer Science,Text Data,Languages Classification
0.09M
553
Heeral Dedhia
Septuagint
Earth and Nature,Religion and Belief Systems,NLP,Text Data,Languages Classification
7.39M
222
Abbrivia
covid19 Spanish es py tweets, early 2020 to late April
Earth and Nature,Health,Social Networks,Coronavirus,NLP,Text Data Classification
805.29M
531
Marvin M. Agüero-Torales
Email text classification
If you are working, then you are bound to face the problem of reading all the emails that are cluttered in your inbox. S...
NLP,Business,Classification,Arts and Entertainment,News,Text Data Classification
18.22M
279
Dipankar Srirag
COVID-19 Indonesian Twitter: Indonesian tweets related to "Corona" and "Government"
Content: This dataset contains Indonesian Tweets of users who have applied the following keywords: Corona and Pemerintah o...
NLP,Deep Learning,Coronavirus,Social Networks,Email and Messaging,Government Classification
31.14M
245
Dionisius Darryl Hermansyah
YouTube dataset containing 43,471 channels, 325,292 videos, and 1,264,035 comments
Context: A portion of data grabbed from YouTube. Content: Dataset contains YouTube channels-videos-comments. Acknowledgements: D...
NLP,Online Communities,Social Networks Classification
629.07M
481
Ramin Rahimzada
Named entity recognition dataset
The label annotation mistakes by human annotators bring up two challenges to NER: mistakes in the test set can interfere...
NLP Classification
5.64M
248
Möbius
Robert Frost collection
Arts and Entertainment,Education,NLP,Literature,Text Data,Transformers Classification
0.22M
538
Archan Ghosh
BERT English uncased bigrams: bigram frequencies of the BERT English uncased training data
Is BERT the right model to fine-tune on your data? Or do you need to pretrain from scratch? Know your model's trainin...
NLP,Music Classification
1.99G
244
Todd Cook
Relational Strategies in Customer Service: a travel-related customer service dataset from four sources
Relational Strategies in Customer Service (RSiCS) Dataset. Human-computer data from three live customer service Intelligen...
NLP,Business,Text Data Classification
57.78M
303
NoLogic
Virgool dataset, a collection of Persian articles gathered from virgool.io
This could be a nice tool for Persian writers or bloggers to automatically pick the suggested hashtag or even subject fo...
NLP,Education,Software,Literature Classification
58.89M
305
Mehran Rafiee