IMDB dataset with 5,000 movie posters. Dataset: this dataset was created by Nazima. Content: it contains the following files:... Arts and Entertainment, Movies and TV Shows Classification
551K 426
MyAnimelist anime dataset with 15,000 anime entries. An unfiltered list of anime from the popular website, updated on February 4, 2019... Anime and Manga Classification
5M 1554
UCI News Aggregator dataset, with article content. Earth and Nature, Education, Online Communities, News, NLP, Linguistics Classification
16.11M 264
Text corpus dataset containing the scripts of all Seinfeld episodes. Context: Seinfeld is my favorite TV show. I wrote a script to scrape the scripts of all Seinfeld episodes from the site se... Text Data Classification
1.41M 501
India news dataset with about 3.6 million events published by the Times of India. This news dataset is a persistent historical archive of notable events in the Indian subcontinent from start-2001 to q1... NLP, Arts and Entertainment, News, Cities and Urban Areas Classification
226.84M 296
A dataset of labeled and unlabeled sentences from reviews that contain conditions. This dataset was created during my PhD ( at the University of Sevi... NLP, Text Data, Universities and Colleges, Ratings and Reviews Classification
794.68M 535
Plain-text Wikipedia; each file contains a collection of Wikipedia articles. Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d... NLP, Computer Science, Text Data, Text Mining Classification
23.71G 265
COVID19-related FAQs: a collection of questions and answers related to COVID19. What is this? This data contains a collection of questions and answers related to COVID19. Where does this come from? Thi... NLP, Health, Coronavirus, Psychology, Diseases Classification
0.1M 264
Youtube dataset with 43,471 channels, 325,292 videos, and 1,264,035 comments. Context: A portion of data grabbed from Youtube. Content: Dataset contains youtube channels-videos-comments. Acknowledgements: D... NLP, Online Communities, Social Networks Classification
629.07M 484
Jigsaw competition dataset with text translated into English. These datasets refer to the jigsaw competition. Classification
664.76M 234
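Consumer complaints on financial products: the dataset includes consumer complaints about financial products, with the complaint text. This data is a collection of complaints about consumer financial products and services that we sent to companies for res... NLP, Beginner, Text Data, Banking, Text Mining, Lending Classification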
243.79M 297
Reddit India NLP dataset with posts from the r/India subreddit, 2017-2020. NLP, Classification, Multiclass Classification, India Classification
117.86M 266
Python lists. Computer Science, Programming, NLP, Beginner, Clustering Classification
0.05M 234
Pretrained BERT, including scripts. Computer Science, NLP Classification
4477.46M 231
EmojifyData dataset: 18 million English tweets, all containing emoji. So, my friend and I were participating in the IPavlov course on deep learning in NLP. As our final project we want to work on s... NLP, Online Communities, Text Data, Social Networks Classification
2.58G 539
NLP Word2Vec: existing word2vec embeddings, including GloVe and Google News, trained to reconstruct the linguistic contexts of words. Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neur... NLP, Computer Science Classification
5.89G 330
Brazilian science and technology news: the dataset contains about 2,720 news items published by Brazilian media. This dataset was provided by Miti Inteligência, a company specialized in monitoring news and social media established i... NLP, News, Text Data, Text Mining, Science and Technology Classification
44.47M 283
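SMILES OCR dataset with more than 900,000 single-product reactions in SMILES format. SMILES (Simplified Molecular Input Line Entry System) is a line notation (a typographical method using printable characters) for entering and representing molecules and reactions. The dataset contains more than... NLP, Chemistry Classification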
175M 741
Allennlp. Computer Science, NLP Classification
715.44M 320
Consumer car ratings and reviews, with data for 62 major brands. This is a dataset containing consumers' thoughts and the star ratings of car manufacturer/model/type. Content: Currently... Automobiles and Vehicles, Retail and Shopping, Ratings and Reviews Classification
137.62M 666