Public Datasets

Plain-text Wikipedia: each file contains a collection of Wikipedia articles
Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d...
Tags: NLP, Computer Science, Text Data, Text Mining
Classification
23.71G 246
Wine review data from wine tasters: use text classification to identify the reviewer behind each review
Thinking of Natural Language Processing as a beginner!! The dataset has been about the wine comments or reviews that has...
Tags: NLP, Business, News, Text Data, Multiclass Classification, Alcohol
Classification
50.35M 541
Arabic news articles from Aljazeera.net
Tags: Business, Education, News, NLP, Text Data, Psychology, Text Mining
Classification
111.89M 487
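Multimodal hate speech: 150,000 tweets with text and images, for hate speech detection
Existing hate speech datasets contain only text data. We created a new manually annotated multimodal hate speech dataset consisting of 150,000 tweets, each tweet...
Tags: NLP, Online Communities, Image Data, Multiclass Classification, Social Networks
Classification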
6.55G 562
Poetry dataset (NLP)
Tags: NLP, Text Data, LSTM, RNN, Transformers
Classification
20.87M 526
Emotions in text: text data labeled with the primary emotion expressed in each sentence
I was looking for a well-labeled dataset to perform a multiclass classification. I wanted to do something more than just...
Tags: NLP, Earth and Nature, Text Data, Multiclass Classification
Classification
2.15M 267
Language Detection Dataset
Text data with language labels. It can be used for language detection....
Tags: NLP, Classification, Computer Science, Multiclass Classification, Languages
Classification
31.7M 534
Topic modeling for research articles
Tags: Business, Earth and Nature, Education, NLP, Psychology
Classification
21.96M 233
Tanglish sentiment analysis tweets: four labels describe the sentiment of each tweet
So it all started when I was looking for abusive Tamil tweets in the Roman script to use for a project and instead of fi...
Tags: NLP, Deep Learning, Online Communities, People
Classification
0.85M 256
Goodreads book dataset with 10M user ratings
Tags: Arts and Entertainment, Social Science, NLP, Literature, Recommender Systems
Classification
1128.5M 465
All English stopwords (700+)
Tags: Computer Science, Education, NLP, Feature Engineering, Python
Classification
0.01M 208
All NeurIPS (NIPS)
Tags: Computer Science, Sports, NLP, Deep Learning, Artificial Intelligence, Neural Networks
Classification
310.53M 189
Dutch news
Tags: Internet, News, NLP, Text Data, Exploratory Data Analysis, Text Mining
Classification
351.62M 212
Virgool dataset: a set of Persian articles collected from virgool.io
This could be a nice tool for Persian writers or bloggers to automatically pick the suggested hashtag or even subject fo...
Tags: NLP, Education, Software, Literature
Classification
58.89M 293
Arabic Hadith: the nine books
Tags: NLP, Multiclass Classification, Clustering
Classification
94.48M 249
COVID-19 Indonesian tweets: Indonesian tweets related to "coronavirus" and "government"
Content: This dataset contains Indonesian tweets from users who used the following keywords: Corona and Pemerintah o...
Tags: NLP, Deep Learning, Coronavirus, Social Networks, Email and Messaging, Government
Classification
31.14M 230
Email text classification
If you are working, then you are bound to face the problem of reading all the emails that are cluttered in your inbox. S...
Tags: NLP, Business, Classification, Arts and Entertainment, News, Text Data
Classification
18.22M 266
Stopwords in 28 languages, for text preprocessing in natural language processing
Stopwords are the words in any language which do not add much meaning to a sentence. They can safely be ignored withou...
Tags: NLP, Computer Science, Text Data, Languages
Classification
0.09M 519
Jigsaw competition dataset, with translations into English
These datasets refer to the jigsaw competition (https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification) T...
Tags: NLP
Classification
664.76M 222
Consumer complaints on financial products: the dataset includes consumer complaints about financial products and...
This data is a collection of complaints about consumer financial products and services that we sent to companies for res...
Tags: NLP, Beginner, Text Data, Banking, Text Mining, Lending
Classification
243.79M 284