Plain-text Wikipedia: each file contains a collection of Wikipedia articles
Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d... NLP, Computer Science, Text Data, Text Mining Classification
23.71G
246
DavidShapiro
Wine review data from wine tasters: use text classification to identify the reviewer of each review
Thinking of Natural Language Processing as a beginner! The dataset is about the wine comments or reviews that has... NLP, Business, News, Text Data, Multiclass Classification, Alcohol Classification
50.35M
541
Subhasree Mohapatra
Arabic news articles from Aljazeera.net
Business, Education, News, NLP, Text Data, Psychology, Text Mining Classification
111.89M
487
Abdelkader Rhouati
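Multimodal hate speech: 150,000 tweets with text and images for hate detection
Existing hate speech datasets contain only text data. We created a new manually annotated multimodal hate speech dataset consisting of 150,000 tweets, each tweet... NLP, Online Communities, Image Data, Multiclass Classification, Social Networks Classification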
6.55G
562
Victor Callejas Fuentes
Emotions in text: text data labeled with the primary emotion expressed in each sentence
I was looking for a well-labeled dataset to perform a multiclass classification. I wanted to do something more than just... NLP, Earth and Nature, Text Data, Multiclass Classification Classification
2.15M
267
Ishant
Text data with language labels. It can be used for language detection.
Language Detection Dataset. Text data with language labels. It can be used for language detection.... NLP, Classification, Computer Science, Multiclass Classification, Languages Classification
31.7M
534
Ishant
Topic modeling for research papers
Business, Earth and Nature, Education, NLP, Psychology Classification
21.96M
233
Abishek Sudarshan
Tanglish sentiment analysis tweets: four labels describe the sentiment of each tweet
So it all started when I was looking for abusive Tamil tweets in the Roman script to use for a project and instead of fi... NLP, Deep Learning, Online Communities, People Classification
0.85M
256
vyom bhatia
Goodreads book dataset with 10M user ratings
Arts and Entertainment, Social Science, NLP, Literature, Recommender Systems Classification
1128.5M
465
Bahram Jannesar
All English stopwords (700+)
Computer Science, Education, NLP, Feature Engineering, Python Classification
0.01M
208
Ragnar
All NeurIPS (NIPS) papers
Computer Science, Sports, NLP, Deep Learning, Artificial Intelligence, Neural Networks Classification
310.53M
189
Ragnar
Dutch news articles
Internet, News, NLP, Text Data, Exploratory Data Analysis, Text Mining Classification
351.62M
212
Max Scheijen
Virgool dataset: a collection of Persian articles gathered from virgool.io
This could be a nice tool for Persian writers or bloggers to automatically pick the suggested hashtag or even subject fo... NLP, Education, Software, Literature Classification
58.89M
293
Mehran Rafiee
COVID-19 Indonesian tweets: Indonesian tweets related to "corona" and "government"
This dataset contains Indonesian tweets of users who have applied the following keywords: Corona and Pemerintah o... NLP, Deep Learning, Coronavirus, Social Networks, Email and Messaging, Government Classification
31.14M
230
Dionisius Darryl Hermansyah
Email text classification
If you are working, then you are bound to face the problem of reading all the emails that are cluttered in your inbox. S... NLP, Business, Classification, Arts and Entertainment, News, Text Data Classification
18.22M
266
Dipankar Srirag
Stopwords in 28 languages for text preprocessing in natural language processing
Stopwords are the words in any language that do not add much meaning to a sentence. They can safely be ignored withou... NLP, Computer Science, Text Data, Languages Classification
0.09M
519
Heeral Dedhia
Jigsaw competition dataset containing text translated into English
These datasets refer to [jigsaw competition](https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification) T... NLP Classification
664.76M
222
Pavel Peskov
Consumer complaints on financial products: the dataset includes consumer complaints about financial products along with their text
This data is a collection of complaints about consumer financial products and services that we sent to companies for res... NLP, Beginner, Text Data, Banking, Text Mining, Lending Classification
243.79M
284
Ashwini Kumar