# Dataset
This dataset was created by wuwu
# Contents
It contains the following files:...Person Classification
1538.1M
384
wuwu
Ten Thousand German News Articles Dataset: 10kGNAD, based on the One Million Posts article corpus
The 10kGNAD dataset is intended to address part of this problem as the first German topic classification dataset. It consists of 10,273 German-language news articles from an Austrian online newspaper, categorized into 9 topics...NLP,Classification,Computer Science,Programming,News,Social Science Classification
51.81M
363
Timo Block
A Million News Headlines
Format: CSV; single file. publish_date: date of publishing for the article in yyyyMMdd format. headline_text: text of the h...NLP,News Classification
57.43M
359
Rohit Kulkarni
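The headlines file described above exposes just two columns, publish_date (in compact yyyyMMdd form) and headline_text. A minimal sketch of reading it with the standard library follows; the two inline sample rows are hypothetical stand-ins for the real file:

```python
import csv
import io
from datetime import datetime

# Hypothetical sample mirroring the described schema:
# publish_date in yyyyMMdd format, headline_text as free text.
sample = """publish_date,headline_text
20030219,aba decides against community broadcasting licence
20030219,act fire witnesses must be aware of defamation
"""

rows = []
for row in csv.DictReader(io.StringIO(sample)):
    # Parse the compact date format into a date object.
    date = datetime.strptime(row["publish_date"], "%Y%m%d").date()
    rows.append((date, row["headline_text"]))

print(rows[0][0].isoformat())  # → 2003-02-19
```

For the real dataset, replace the in-memory sample with `open(path, newline="")`.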
Plain Text Wikipedia: each file contains a collection of Wikipedia articles
Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d...NLP,Computer Science,Text Data,Text Mining Classification
23.71G
343
DavidShapiro
Persian Wikipedia Dataset: a Persian (Farsi) Wikipedia corpus
Persian(Farsi) Wikipedia Dataset | دیتاست ویکی پدیا فارسی شامل تمامی مقالات فارسی...NLP,Deep Learning,Text Data,Data Analytics Classification
804.48M
341
Milad Farzalizadeh
Best Buy E-Commerce NER Dataset
NLP,Deep Learning,Linguistics,E-Commerce Services,Research Classification
0.07M
741
DataTurks
Word2vec Trained on Wikipedia (unigrams + bigrams), to capture both unigram and bigram embeddings
This is a word-embedding model built on Wikipedia plus comments from various sources. Unlike bigrams created by phrase-based approaches (which do not consider the phrase/bigram context of adjacent words), this...NLP,Computer Science,Software,Programming,Neural Networks Classification
8.62G
474
aintnosunshine
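The unigram + bigram vocabulary such a model trains on can be illustrated with a small sketch (not this dataset's actual pipeline): count adjacent word pairs in a tokenized corpus and emit frequent pairs as joined `w1_w2` tokens alongside the unigrams:

```python
from collections import Counter

# A minimal sketch, assuming a whitespace-tokenized corpus; the real model
# was trained on Wikipedia plus comment data, which is not reproduced here.
def add_bigrams(sentences, min_count=2):
    """Append a joined 'w1_w2' token for every adjacent pair that occurs
    at least min_count times, keeping all original unigrams."""
    pair_counts = Counter(
        (a, b) for s in sentences for a, b in zip(s, s[1:])
    )
    frequent = {p for p, c in pair_counts.items() if c >= min_count}
    out = []
    for s in sentences:
        rewritten = list(s)
        for a, b in zip(s, s[1:]):
            if (a, b) in frequent:
                rewritten.append(f"{a}_{b}")
        out.append(rewritten)
    return out

corpus = [
    ["new", "york", "city"],
    ["new", "york", "times"],
    ["old", "york"],
]
augmented = add_bigrams(corpus)
print(augmented[0])  # → ['new', 'york', 'city', 'new_york']
```

Feeding the augmented sentences to any word2vec trainer then yields vectors for both single words and frequent pairs.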
Wikipedia Word2Vec: an Apache Spark word2vec model trained on 200K Wikipedia pages
I used Apache Spark to extract more than 6 million phrases from 200,000 English Wikipedia pages. Here is the process of...NLP,Business,Earth and Nature,Text Mining Classification
132.74M
451
Maziyar
Wikipedia Sentences: 7.8 million sentences collected from the English Wikipedia dump
The wikipedia dump is a giant XML file and contains loads of not-so-useful content. I needed some english text for some...NLP,Text Mining Classification
891.28M
401
Mike Ortman
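Since the Wikipedia dump is one giant XML file, pulling sentences out of it means streaming the XML rather than loading it whole. A minimal sketch using the standard library follows; the tiny inline document is a hypothetical stand-in for a real MediaWiki dump, which also carries namespaces and wiki markup that would need stripping:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature dump; a real one is gigabytes, so it is
# streamed with iterparse instead of parsed into a full tree.
dump = """<mediawiki>
  <page>
    <title>Example</title>
    <revision><text>First sentence. Second sentence.</text></revision>
  </page>
</mediawiki>"""

sentences = []
for _, elem in ET.iterparse(io.StringIO(dump)):
    if elem.tag == "text" and elem.text:
        # Naive split on '. '; real cleanup also strips wiki markup.
        sentences.extend(s.strip() for s in elem.text.split(". ") if s.strip())
        elem.clear()  # free memory as we stream

print(sentences)
```

For an actual dump, pass the (decompressed) file object to `ET.iterparse` and match the namespaced `text` tag.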