
Comparison of sentiment libraries in Python: analyzing presidential speeches and tweets with Vader, Pattern, and Stanford CoreNLP

Dataset size: 14.45M


NLP,Computer Science,Programming,Social Science,Python,Retail and Shopping Classification


README.md

**Context**

- These datasets were produced as part of a small research project I undertook for a blog post on sentiment analysis, which you can access here: https://bit.ly/32PmWdf
- I uploaded a more extensive dataset of presidential speeches on Kaggle here: https://bit.ly/2E7Fmvw

**Content**

The datasets were created to compare sentiment scores across two text types (tweets versus political speeches) and three sentiment models (Pattern, Vader, and the polarity model incorporated in Stanford CoreNLP).

- **sentiment_speeches_Kaggle.csv:** The dataset contains sentence-level sentiment codings (Pattern, Vader, and the Stanford model) for all inaugural and State of the Union speeches of US presidents since 1917.
- **sentiment_tweets_Kaggle.csv:** The dataset contains sentiment codings (Pattern, Vader, and the Stanford model) for a sample of around 11,500 tweets of US politicians (Donald J. Trump (Rep.), Rand Paul (Rep.), Ted Cruz (Rep.), Alexandria Ocasio-Cortez (Dem.), Nancy Pelosi (Dem.), and Bernie Sanders (Dem.)). The sentiment polarity was computed at tweet level, not sentence level. I simply scraped the last 2000-ish tweets of each timeline using the GetOldTweets module for Python. You can read more about the data collection process in the aforementioned blog post (https://bit.ly/32PmWdf).

**Distribution of sentiment scores**

![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2342187%2Fac6b4a3ca5a33d654f7fb2c755165634%2Fdistribution%20sentiment%20comparison.png?generation=1595583728484033&alt=media)

**Some example analyses**

- Because the presidential speeches were analyzed at sentence level, it is easy to track how polarity develops within a single text. For example, one can plot the polarity scores across a single inaugural speech, as I did here for the inaugural addresses since JFK:

  ![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2342187%2Fb7ff60d3cbc674632f301564bd80636f%2Fsentiment_inauguration.png?generation=1595583860959296&alt=media)

- One could also compare how these models code a particular tweet or speech and index the amount of (dis)agreement between the different models:

  ![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F2342187%2Fb207991f30deb2b14c0ef048f0ebdad8%2Fagreement_vs_disagreement_libraries.png?generation=1595583966382283&alt=media)

You can read more about the data collection, wrangling, and analysis process in the aforementioned blog post.
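
The two example analyses could be approximated along the following lines. This is only a rough sketch, not the code behind the plots or the blog post. The column names `pattern`, `vader`, and `stanford` are hypothetical (the real CSV headers may differ), and the rows are made up for illustration. It also assumes all three scores sit on a scale centred on 0 and borrows a ±0.05 cut-off to separate neutral from positive or negative.

```python
from itertools import combinations

# Hypothetical column names; the real CSV headers may differ.
MODELS = ("pattern", "vader", "stanford")


def to_label(score, threshold=0.05):
    """Map a polarity score to 'neg', 'neu' or 'pos' (assumes a scale centred on 0)."""
    if score > threshold:
        return "pos"
    if score < -threshold:
        return "neg"
    return "neu"


def agreement_rates(rows, models=MODELS, threshold=0.05):
    """Share of rows on which each pair of models assigns the same label."""
    rates = {}
    for a, b in combinations(models, 2):
        same = sum(
            to_label(float(r[a]), threshold) == to_label(float(r[b]), threshold)
            for r in rows
        )
        rates[(a, b)] = same / len(rows) if rows else float("nan")
    return rates


def rolling_mean(values, window=3):
    """Trailing moving average to smooth sentence-level polarity within one speech."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"pattern": "0.50", "vader": "0.60", "stanford": "0.50"},
    {"pattern": "-0.20", "vader": "0.10", "stanford": "-0.50"},
    {"pattern": "0.00", "vader": "0.00", "stanford": "0.50"},
    {"pattern": "-0.40", "vader": "-0.70", "stanford": "-0.50"},
]

rates = agreement_rates(rows)
smoothed = rolling_mean([float(r["vader"]) for r in rows], window=2)
```

Rows read with `csv.DictReader` from either CSV file would have the same shape as `rows`, provided the column names are adjusted to the actual headers. The smoothed series can then be passed to any plotting library to trace polarity over the course of a speech.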


