Nuno Moniz
LIAAD - INESC Tec; Sciences College, University of Porto
Email: nmmoniz '@' inesctec.pt
Luís Torgo
LIAAD - INESC Tec; Sciences College, University of Porto
Email: ltorgo '@' dcc.fc.up.pt
Data Set Information:
This is a large data set of news items and their respective social feedback on multiple platforms: Facebook, Google+ and LinkedIn.
The collected data relates to a period of 8 months, between November 2015 and July 2016, accounting for about 100,000 news items on four different topics: economy, microsoft, obama and palestine.
This data set is tailored for evaluative comparisons in predictive analytics tasks, but it also supports tasks in other research areas such as topic detection and tracking, sentiment analysis in short text, first story detection or news recommendation.
Further details on the process of building the data set are provided in the article mentioned in the 'Relevant Papers' section.
An .R file is included as a simple introduction to handling the data set.
Attribute Information:
#######################
# VARIABLES OF NEWS DATA #
#######################
IDlink (numeric): Unique identifier of news items
Title (string): Title of the news item according to the official media sources
Headline (string): Headline of the news item according to the official media sources
Source (string): Original news outlet that published the news item
Topic (string): Query topic used to obtain the items in the official media sources
PublishDate (timestamp): Date and time of the news items' publication
SentimentTitle (numeric): Sentiment score of the text in the news items' title
SentimentHeadline (numeric): Sentiment score of the text in the news items' headline
Facebook (numeric): Final value of the news items' popularity according to the social media source Facebook
GooglePlus (numeric): Final value of the news items' popularity according to the social media source Google+
linkedIn (numeric): Final value of the news items' popularity according to the social media source linkedIn
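
As an illustration of the news data variables listed above, the following is a minimal Python sketch that reads one row into a typed record. It is not the .R file shipped with the data set. The sample row is invented, the comma-separated layout and the "YYYY-MM-DD HH:MM:SS" timestamp format are assumptions, and the names NewsItem, parse_news_row and load_news are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Hypothetical sample data: the values below are invented for illustration.
# The column names follow the attribute list; the CSV layout is an assumption.
SAMPLE_CSV = (
    "IDlink,Title,Headline,Source,Topic,PublishDate,"
    "SentimentTitle,SentimentHeadline,Facebook,GooglePlus,linkedIn\n"
    "1,Example title,Example headline,Example Outlet,economy,"
    "2015-11-10 08:30:00,0.05,-0.10,12,3,1\n"
)


@dataclass
class NewsItem:
    id_link: int
    title: str
    headline: str
    source: str
    topic: str
    publish_date: datetime
    sentiment_title: float
    sentiment_headline: float
    facebook: float
    google_plus: float
    linked_in: float


def parse_news_row(row):
    """Convert one CSV row (a dict of strings) into a NewsItem."""
    return NewsItem(
        id_link=int(float(row["IDlink"])),
        title=row["Title"],
        headline=row["Headline"],
        source=row["Source"],
        topic=row["Topic"],
        # Assumed timestamp format; adjust if the real files differ.
        publish_date=datetime.strptime(row["PublishDate"], "%Y-%m-%d %H:%M:%S"),
        sentiment_title=float(row["SentimentTitle"]),
        sentiment_headline=float(row["SentimentHeadline"]),
        facebook=float(row["Facebook"]),
        google_plus=float(row["GooglePlus"]),
        linked_in=float(row["linkedIn"]),
    )


def load_news(handle):
    """Read every row from an open text handle into NewsItem records."""
    return [parse_news_row(row) for row in csv.DictReader(handle)]


items = load_news(io.StringIO(SAMPLE_CSV))
```

To read a real file, pass an open file handle to load_news instead of the in-memory sample.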
#################################
# VARIABLES OF SOCIAL FEEDBACK DATA #
#################################
IDlink (numeric): Unique identifier of news items
TS1 (numeric): Level of popularity in time slice 1 (0-20 minutes after publication)
TS2 (numeric): Level of popularity in time slice 2 (20-40 minutes after publication)
TS... (numeric): Level of popularity in time slice ...
TS144 (numeric): Final level of popularity, 2 days after publication
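
The time-slice numbering above implies a simple arithmetic: 144 slices of 20 minutes each cover 2,880 minutes, which is the 2 days stated for TS144. The following Python sketch makes that mapping explicit. The function names are hypothetical, and treating each slice as a half-open interval (start included, end excluded) is an assumption, since the description does not say which slice a boundary minute belongs to.

```python
from typing import Tuple

SLICE_MINUTES = 20   # width of one time slice, from the TS1/TS2 descriptions
NUM_SLICES = 144     # TS1 .. TS144


def slice_window(k: int) -> Tuple[int, int]:
    """Return (start, end) in minutes after publication for column TS<k>."""
    if not 1 <= k <= NUM_SLICES:
        raise ValueError("time slice index must be between 1 and 144")
    return ((k - 1) * SLICE_MINUTES, k * SLICE_MINUTES)


def slice_for_minutes(minutes: float) -> int:
    """Return the slice index whose window contains the elapsed minutes.

    Assumes half-open windows [start, end), so minute 20 falls in TS2.
    """
    if not 0 <= minutes < NUM_SLICES * SLICE_MINUTES:
        raise ValueError("elapsed time is outside the 2-day observation window")
    return int(minutes // SLICE_MINUTES) + 1


def column_name(k: int) -> str:
    """Build the column name for slice k, e.g. 'TS7'."""
    slice_window(k)  # reuse the range check
    return "TS{}".format(k)
```

For example, slice_window(144) gives the last window of the second day, and column_name(slice_for_minutes(25)) gives the column that covers 25 minutes after publication.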
Relevant Papers:
Nuno Moniz and Luís Torgo (2018), “Multi-Source Social Feedback of online News Feeds”, CoRR, [Web link]
Citation Request:
When using this data set, please cite the following article.
Nuno Moniz and Luís Torgo (2018), “Multi-Source Social Feedback of online News Feeds”, CoRR, [Web link]
@Article{Moniz2018,
title = {Multi-Source Social Feedback of online News Feeds},
author = {Nuno Moniz and Lu{\'i}s Torgo},
year = {2018},
ee = {[Web link]},
volume = {[Web link]},
journal = {CoRR},
}