UMass Global English on Twitter Dataset
Internet,Universities and Colleges,Email and Messaging,Linguistics,Languages Classification
1.21M
565
Rachael Tatman
Japanese-English Bilingual Corpus
Psychology,Linguistics,Languages,Artificial Intelligence Classification
346.39M
219
Team AI
English-Vietnamese Dataset
Deep Learning,Languages,Artificial Intelligence,Transfer Learning Classification
24.46M
187
Hau T. Nguyen
Vietnamese-English Dataset
Classification,Deep Learning,Languages,Artificial Intelligence,Transfer Learning Classification
19.59M
204
Hau T. Nguyen
BERT English Uncased Bigrams, bigram frequencies of the BERT English uncased training data
Is BERT the right model to fine-tune on your data? Or do you need to pretrain from scratch? Know your model's trainin...
NLP,Music Classification
1.99G
249
Todd Cook
Jigsaw competition datasets, with text translated into English
These datasets refer to [jigsaw competition](https://www.kaggle.com/c/jigsaw-multilingual-toxic-comment-classification)T...
NLP Classification
664.76M
237
Pavel Peskov
Japanese-English Subtitle Corpus (JESC) [CLEANED], a large corpus of 2.8 million sentences
This dataset is a cleaned version of JESC, produced by handling misspelled English words and doing word segmentation using:English=...
NLP,Business,Computer Science,Languages Classification
220.08M
302
Wahyu Setianto
Classic English Literature Corpus with Metadata, classic English books and their authors
This is a dataset of classic readings in English, in some cases translated into English from other languages. Dickens, Plato, Sha...
NLP,Arts and Entertainment,Literature Classification
431.55M
305
Raynard Jon
English Multi-Speaker Corpus for Voice Cloning, CSTR VCTK Corpus
This CSTR VCTK Corpus includes speech data uttered by 109 native speakers of English with various accents. Each speaker...
NLP,Audio Data Classification
15.22G
417
Michael Fekadu