Tag collection from 202 StackExchange sites
This data is extracted from more than 200 StackExchange websites under the umbrella. This data consists of all possible... NLP, Business, Online Communities, Text Data Classification
16.75M
247
Shiv Kumar Ganesh
CC-100 Kannada monolingual dataset: 13 million monolingual sentences from web crawl data
This monolingual dataset includes roughly 13 million uncleaned Kannada sentences crawled from numerous websites.... NLP, Text Data, Languages Classification
3.51G
316
Darshan
Intent recognition chatbot corpus from AskUbuntu
Context: 190 questions and answers from https://askubuntu.com. Content: What's inside is more than just rows and columns... NLP, Artificial Intelligence Classification
0.23M
504
Elvin Aghammadzada
Relational strategies in customer service: a travel-related customer service dataset from four sources
Relational Strategies in Customer Service (RSiCS) Dataset. Human-computer data from three live customer service Intelligen... NLP, Business, Text Data Classification
57.78M
304
NoLogic
Subreddit data from wallstreetbets and others, for sentiment analysis in backtesting quantitative trading algorithms
All of the submissions to each of the r/wallstreetbets, r/investing, r/options, and r/SecurityAnalysis subreddits since... NLP, Online Communities, Investing Classification
1.49G
264
Sheridan Green
News article dataset from The Indian Express
Business, Arts and Entertainment, News, NLP, Classification, Deep Learning, Linguistics, Recommender Systems Classification
63.24M
237
Pulkit Komal
COVID-19 IE: triples from literature processing
Earth and Nature, Computer Science, Health Conditions, Coronavirus, NLP Classification
22.35M
237
Enrique Martín-López
News dataset from Lenta.Ru
Arts and Entertainment, Internet, News, Politics, NLP, Russia Classification
1988.17M
241
DmitryYutkin