8.49G
553
kambarakun
Annotated corpus for named entity recognition, annotated with BIO and POS tags
Annotated (BIO) Corpus for Named Entity Recognition. This corpus is made up of texts from news sites and built specifically... Business, Arts and Entertainment, Literature, Languages Classification
2.21M
610
Alexander Kovalev
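For readers unfamiliar with the BIO scheme mentioned in the entry above, here is a minimal sketch of how BIO tags are typically decoded into entity spans. The function name `bio_to_spans` and the example labels (`per`, `geo`) are illustrative assumptions, not taken from the dataset's documentation.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert a BIO tag sequence into (label, start, end) spans.

    `end` is exclusive. A stray I- tag with no open span of the same
    label is treated as the start of a new span (a common lenient choice).
    """
    spans: List[Tuple[str, int, int]] = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # Does this tag continue the currently open span?
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues and label is not None:
            spans.append((label, start, i))
            start, label = None, None
        # Open a new span on B-, or on an I- tag with nothing open.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


# Hypothetical example sentence: two entities separated by an O tag.
example_tags = ["B-per", "I-per", "O", "B-geo", "O"]
example_spans = bio_to_spans(example_tags)
```

Here `example_spans` holds one two-token `per` span and one single-token `geo` span; tokens tagged `O` belong to no entity.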
44.46M
523
Sameer Dev
Best books of 2018 data from Goodreads
Best books of 2018 data from Goodreads... NLP, Image Data, Tabular Data, Literature Classification
81.57M
834
Naren
268.7M
648
Yakin
ATIS dataset, cleaned and re-split: a cleaned and balanced split of the ATIS dataset
ATIS DataSet. The ATIS dataset is a standard benchmark dataset widely used as an intent classification and slot filling ta... NLP, Classification, Earth and Nature, Computer Science, Health Classification
1.02M
526
kpe
Dmoztools classification data, covering arts, business, computers, games, health, science, shopping, society, and more
Dataset: This dataset was created by Patanjali Chintalapati. Released under Other (specified in description). Contents: It c... NLP, Text Mining, Websites Classification
279.6M
499
Patanjali Chintalapati
116 works of fiction and other texts by Machado de Assis
This repository contains 116 works of fiction and other texts by Machado de Assis in pdf and txt formats, in the c... NLP, Business, Literature, Art, Brazil Classification
40.38M
780
Luiz Amaral
Named entity recognition (NER) for extracting entities of interest from clinical text (e.g., disease names, drug names)
Problem Statement: Clinical studies often require detailed patient information documented in clinical narratives. Named... NLP, Health, Health Conditions, Model Comparison, Statistical Analysis, Artificial Intelligence Classification
249.01M
440
Ramashankar Nayak
Image dataset of letters written in sans-serif Google fonts
Dataset: This dataset was created by Jihye Yeom. Released under Other (specified in description). Contents: It contains the... NLP, Image Data Classification
768M
549
CheaperThanTires
CoNLL003: annotated dataset for the named entity recognition (NER) problem
This is an annotated dataset for the Named Entity Recognition (NER) problem. Content: This dataset is divided into train.txt, te... NLP, Arts and Entertainment, Computer Science, Text Data, Games, Text Mining Classification
4.63M
586
AlaaKhaled
Toxic Embeddings: text from the Jigsaw Toxic Comment challenge encoded with the Universal Sentence Encoder
There's no need for everyone to encode the same text with the Universal Sentence Embedding. This data set contains the... NLP, Deep Learning, Earth and Nature Classification
610.81M
734
Liling Tan
English word frequency: the ⅓ million most common English words on the web
This dataset contains the counts of the 333,333 most commonly used single words on the English language web, as derived... Languages Classification
4.73M
721
Rachael Tatman
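Stanford Natural Language Inference (SNLI) corpus in JSONL format
This is version 1.0 of the Stanford Natural Language Inference (SNLI) corpus. If you use this corpus, please cite this paper: http://nlp.Stanford.edu/pubs/snli... Languages Classification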
483.45M
539
John S. Hudzina
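As a rough illustration of working with the JSONL form of SNLI listed above, the sketch below reads one JSON object per line and keeps the premise, hypothesis, and label. It assumes the files in this upload use the field names of the standard SNLI 1.0 distribution (`sentence1`, `sentence2`, `gold_label`) and that pairs without annotator consensus carry the label `-`. The helper name `read_snli_lines` and the two sample records are made up for the example.

```python
import json
from typing import Iterable, Iterator, Tuple


def read_snli_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (premise, hypothesis, label) from JSONL-formatted lines.

    Blank lines are skipped, as are records whose gold label is "-"
    (assumed here to mark pairs with no annotator consensus).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        label = record["gold_label"]
        if label == "-":
            continue
        yield record["sentence1"], record["sentence2"], label


# Hypothetical records in the assumed format, not taken from the corpus.
sample = [
    '{"gold_label": "entailment", "sentence1": "A man is playing a guitar.", '
    '"sentence2": "A person is making music."}',
    '{"gold_label": "-", "sentence1": "A dog runs.", "sentence2": "An animal sleeps."}',
]
pairs = list(read_snli_lines(sample))
```

To read a real file, pass an open file handle instead of `sample`, since iterating a text file yields one line at a time.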
2.46M
465
NLTK Data
Mac Morpho: Brazilian Portuguese news text with part-of-speech tags
The canonical metadata on NLTK: package id=mac_morpho name=MAC-MORPHO: Brazilian Portuguese news text with part-of-speech tags webp... Earth and Nature Classification
10.43M
846
NLTK Data