RMU paper: final data files
Business,Earth and Nature,Health,Universities and Colleges,Healthcare Classification
4.18M
582
Jay Wang
Plain-text Wikipedia: each file contains a collection of Wikipedia articles
Wikipedia dumps contain a tremendous amount of markup. WikiMedia Text is a hybrid of markdown and HTML, making it very d...
NLP,Computer Science,Text Data,Text Mining Classification
23.71G
245
DavidShapiro
All NeurIPS (NIPS) papers
Computer Science,Sports,NLP,Deep Learning,Artificial Intelligence,Neural Networks Classification
310.53M
188
Ragnar
NLTK Reuters news files: all files in the NLTK Reuters corpus
This dataset contains the ID, categories, and raw text from each file in NLTK's Reuters corpus. Content: Each file (row...
NLP,Computer Science,News,Text Data,Text Mining Classification
3.3M
575
Brad Boldenow