Public Datasets

Stopwords in 28 Languages: Text Preprocessing in Natural Language Processing Stopwords are the words in any language that do not add much meaning to a sentence. They can safely be ignored withou... NLP,Computer Science,Text Data,Languages Classification
0.09M 518
Roman Urdu Hate Speech (HS RU 20) NLP,Artificial Intelligence Classification
0.49M 429
Media Articles Collection, 2020 Edition Arts and Entertainment,Computer Science,Education,NLP Classification
1.63M 314
Japanese-English Subtitle Corpus (JESC) [CLEANED]: A Large Corpus of 2.8 Million Sentences This dataset is a cleaned version of JESC, produced by handling misspelled English words and doing word segmentation using:English=... NLP,Business,Computer Science,Languages Classification
220.08M 288
arXiv Quantum Physics Papers, 1994-2009 Education,NLP,Physics Classification
92.19M 278
US Presidential Debates 2020 Politics,NLP,Text Data,United States,Text Mining Classification
0.11M 547
Reddit India NLP Dataset: Posts from the r/India Subreddit, 2017-2020 [![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/) [!... NLP,Classification,Multiclass Classification,India Classification
117.86M 245
Retail Transactions [Released July 17, 2020] Online Communities,Retail and Shopping,NLP,Data Visualization,Tabular Data,Data Cleaning Classification
1.3M 223
2020 Republican Convention Speeches Social Science,Politics,NLP,Languages Classification
0.24M 210
Joe Biden's 2020 DNC Speech Politics,NLP Classification
0.02M 227
All Injuries in Cinematography, 1914-2019 Arts and Entertainment,Movies and TV Shows,NLP,Crime,Text Data,Exploratory Data Analysis,Text Mining Classification
0.09M 264
US 2020 Presidential Election Speeches Politics,NLP,Text Data,Text Mining Classification
8.82M 240
Dataset of 2,400 Reddit Articles NLP,Text Data,Multiclass Classification Classification
3.88M 210
DetikNews Headlines (per 2015-2020, January 15 to October 15) Internet,Online Communities,News,NLP Classification
1.17M 292
Joe Biden Tweets (2007-2020) Internet,Social Networks,Email and Messaging,Politics,NLP,Text Data Classification
1.73M 324
Research Paper Topic Modeling 2.0 Earth and Nature,Education,NLP,Astronomy,Research Classification
21.96M 202
IPL 2020 Business,Sports,NLP,Football,Cricket Classification
2.36M 223
GPT2 Pretrained Models (PyTorch) NLP,Deep Learning Classification
11539.5M 555
GPT2 Output Dataset Business,NLP Classification
6506.05M 239
Stanford Sentiment Treebank v2 (SST2) Arts and Entertainment,Education,Movies and TV Shows,NLP,Text Data Classification
46.5M 539