2019 NFL Scouting Combine. Software,Sports,Text Data Classification
0.02M 523
Melbourne Airbnb homestay activity dataset. Context: Melbourne was announced as 6th on the list of top ten cities for users globally in 2016 and has been one of the t... Hotels and Accommodations,Australia Classification
120M 561
Text mining e-commerce dataset. Business Classification
8.11M 288
Twitter sentiment analysis dataset. Context: The objective of this task is to detect hate speech in tweets. For the sake of simplicity, we say a tweet contain... Classification,Social Networks Classification
1.89M 666
Quantity text. Others Classification
15359.5M 237
Text corpus dataset: a text corpus containing the scripts of all Seinfeld episodes. Context: Seinfeld is my favorite TV show. I wrote a script to scrape the scripts of all Seinfeld episodes from the site se... Text Data Classification
1.41M 594
Dataset for converting CAPTCHA images to text. Context: Every time I read a paper or anything about image processing the examples are always easy and most of the time don'... NLP,Text Data Classification
9.1M 617
Donald Trump's 42-minute rant | Text generation. Business,Online Communities,Politics,NLP Classification
0.02M 289
Wikipedia text: each file contains a collection of Wikipedia articles. Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d... NLP,Computer Science,Text Data,Text Mining Classification
23.71G 308
Wine review data from wine tasters: uses text classification to identify the reviewer behind each review. Thinking of Natural Language Processing as a beginner!! The dataset has been about the wine comments or reviews that has... NLP,Business,News,Text Data,Multiclass Classification,Alcohol Classification
50.35M 674
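Multimodal hate speech: 150,000 tweets with text and images, for hate detection. Existing hate speech datasets contain only text data. We created a new manually annotated multimodal hate speech dataset consisting of 150,000 tweets, each... NLP,Online Communities,Image Data,Multiclass Classification,Social Networks Classification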
6.55G 755
Emotion in text: text data of sentences expressing a primary emotion. I was looking for a well-labeled dataset to perform a multiclass classification. I wanted to do something more than just... NLP,Earth and Nature,Text Data,Multiclass Classification Classification
2.15M 325
Text data with language labels, usable for language detection. Language Detection Dataset: Text data with language labels. It can be used for language detection.... NLP,Classification,Computer Science,Multiclass Classification,Languages Classification
31.7M 668
Email text classification. If you are working, then you are bound to face the problem of reading all the emails that are cluttered in your inbox. S... NLP,Business,Classification,Arts and Entertainment,News,Text Data Classification
18.22M 312
Stop words in 28 languages, for text preprocessing in natural language processing. Stopwords are the words in any language which do not add much meaning to a sentence. They can safely be ignored withou... NLP,Computer Science,Text Data,Languages Classification
0.09M 664
Jigsaw competition dataset, containing text translated into English. These datasets refer to [jigsaw competition]( Classification
664.76M 278
Consumer complaints about financial products: this dataset includes consumer complaints about financial products and their text. This data is a collection of complaints about consumer financial products and services that we sent to companies for res... NLP,Beginner,Text Data,Banking,Text Mining,Lending Classification
243.79M 334
JigSaw toxic comment classification cleaned data: Jigsaw comments with sentiment, comment length, and translated text. I've been working on the JigSaw Multilingual Toxic Comment classification competition and found that the data requir... NLP,Deep Learning,Feature Engineering,Text Data Classification
263.44M 394
Contradictory, My Dear Watson: translated text. NLP,Classification,Text Data Classification
9.47M 273
OSCAR Nepali corpus: a Nepali text corpus for training unsupervised language models for NLP. The files are from [OSCAR Corpus]( Please visit their site for more information. The dataset i... NLP,Computer Science,Movies and TV Shows,Text Data,Languages Classification
3.1G 387