公开数据集
数据结构 ? 143K
Data Structure ?
* 以上分析是由系统提取分析形成的结果,具体实际数据为准。
README.md
Data Set Information:
The data set consists of paper reviews sent to an international conference mostly in Spanish (some are in English). It has a total of N = 405 instances evaluated with a 5-point scale ('-2': very negative, '-1': negative, '0': neutral, '1': positive, '2': very positive), expressing the reviewer's opinion about the paper and the orientation perceived by a reader who does not know the reviewer's evaluation (more details in the attributes' section). The distribution of the original scores is more uniform in comparison to the revised scores. This difference is assumed to come from a discrepancy between the way the paper is evaluated and the way the review is written by the original reviewer.
The data set is stored in JSON format, the structure is as follows:
Paper: {
papers have an associated timespan and a paper ID, each paper contains some reviews. The reviews have their own ID, the review text, the remarks (which can be empty), the language of the review, its orientation and evaluation.
Some relevant statistics (excluding reviews in English and empty reviews):
- Number of words:
Min: 3 Max: 530 Avg: 88.64 Stdev: 69.76
- Number of sentences:
Min: 1 Max: 47 Avg: 8.91 Stdev: 7.54
Attribute Information:
1. Timespan (datetime): A date associated with the year of conference, which in turn corresponds with the time the review was written. The data set includes four years of reviews worth of conferences.
2. Paper ID (integer): This number identifies each individual paper from a given conference. The data set has 172 different papers.
3. Preliminary decision (label): The preliminary decision of acceptance or rejection of a paper taken by the conference committee.
4. Review ID (integer: A serial number identifier for each review as a correlative with respect to each individual paper. (e.g. the second review of some paper would correspond to the number $2$). The data set has a total of 405 reviews. Most papers have 2 reviews each.
5. Text (text): Comments and detailed review of the paper. This is read by the authors and the editing commission of the conference. The editors determine if the paper should be published or not depending on the reviews. There are $6$ instances of empty reviews.
6. Remarks (text): Additional comments that can be read only by the editing commission of the conference. This is used in conjunction with the previous attribute to determine if the paper should be published. This is an optional attribute. Whenever it is possible it is concatenated at the end of the main body of the review. Some reviews do not have remarks, this is indicated with an empty string ''.
7. Language (text): Language corresponding to the review (it may be English or Spanish). In this case the majority of the reviews are in Spanish, with only $17$ instances of English reviews.
8. Orientation (integer from -2 to 2): Review classification defined by the authors of this study, according to the 5-point scale previously described, obtained through the authors' systematic judgement of each review. This attribute represents the subjective perception of each review (i.e. how negative or positive the review is perceived when someone reads it).
9. evaluation (integer from -2 to 2): Review classification as defined by the reviewer, according to the 5-point scale previously described. This attribute represents the real evaluation given to the paper, as determined by the reviewers.
10. Confidence (integer from 1 to 5): Value describing the confidence of the reviewer, a higher value denotes more confidence, while a lower value indicates less confidence.
Relevant Papers:
Keith, B., Fuentes, E., & Meneses, C. (2017). A Hybrid Approach for Sentiment Analysis Applied to Paper Reviews. Available at: [Web link]
Citation Request:
Please cite the following paper when using this data set.
- APA:
Keith, B., Fuentes, E., & Meneses, C. (2017). A Hybrid Approach for Sentiment Analysis Applied to Paper Reviews. Available at: [Web link]
- BibTeX:
@article{keith2017hybrid,
title={A Hybrid Approach for Sentiment Analysis Applied to Paper Reviews},
author={Keith, Brian and Fuentes, Exequiel and Meneses, Claudio},
year={2017}
}
帕依提提提温馨提示
该数据集正在整理中,为您准备了其他渠道,请您使用
- 分享你的想法
全部内容
数据使用声明:
- 1、该数据来自于互联网数据采集或服务商的提供,本平台为用户提供数据集的展示与浏览。
- 2、本平台仅作为数据集的基本信息展示、包括但不限于图像、文本、视频、音频等文件类型。
- 3、数据集基本信息来自数据原地址或数据提供方提供的信息,如数据集描述中有描述差异,请以数据原地址或服务商原地址为准。
- 1、本站中的所有数据集的版权都归属于原数据发布者或数据提供方所有。
- 1、如您需要转载本站数据,请保留原数据地址及相关版权声明。
- 1、如本站中的部分数据涉及侵权展示,请及时联系本站,我们会安排进行数据下线。