# Context
I was curious about the hot topics in quantum physics as reflected by the [quant-ph](https://arxiv.org/archive/quant-ph) category on arXiv. Citation counts have a long lag, and so do journal publications, so I wanted a more immediate measure of interest. [SciRate](http://scirate.com/) is fairly well known in this community, and I noticed that after the first two to three weeks, the number of Scites a paper gets hardly increases. The number of Scites is therefore both immediate and nearly constant after a short while.
# Content
The main dataset (`scirate_quant-ph.csv`) is the metadata of all papers published in quant-ph between 2012-01-01 and 2016-12-31 that had at least ten Scites, as crawled on 2016-12-31. It has nine columns:
- The id column as exported by pandas.
- The arXiv id.
- The year of publication.
- The month of publication.
- The day of publication.
- The number of Scites (this column defines the order).
- The title.
- All authors, separated by semicolons.
- The abstract.
The author names were normalized, so it is highly likely that each author appears under a single, consistent name.
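Because all authors share one semicolon-separated field, any per-author analysis has to split that field first. Here is a minimal sketch using only the standard library. The header names `authors` and `scites` are assumptions (check the actual header row of the CSV), and the two sample rows are made-up placeholders that only show the expected shape.

```python
import csv
import io
from collections import Counter

# Hypothetical column names: the real header row of the CSV may differ.
AUTHORS_COL = "authors"
SCITES_COL = "scites"


def scites_per_author(csv_file, authors_col=AUTHORS_COL, scites_col=SCITES_COL):
    """Sum the Scites of every paper on which an author appears."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        scites = int(row[scites_col])
        # All authors live in a single field, separated by semicolons.
        for name in row[authors_col].split(";"):
            name = name.strip()
            if name:
                totals[name] += scites
    return totals


# Toy rows with made-up values, NOT taken from the dataset.
sample = io.StringIO(
    "id,arxiv_id,year,month,day,scites,title,authors,abstract\n"
    "0,0000.00001,2016,1,1,12,Placeholder title A,Author A;Author B,Placeholder abstract\n"
    "1,0000.00002,2016,1,2,10,Placeholder title B,Author B,Placeholder abstract\n"
)
totals = scites_per_author(sample)
print(totals.most_common(1))  # [('Author B', 22)]
```

To run this on the real data, pass a file opened with `open("scirate_quant-ph.csv", newline="")` instead of the `StringIO` sample, and adjust the two column names if the header differs.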
Name normalization was the difficult part of compiling this collection, which is why the main dataset only includes papers with at least ten Scites. A second file (`scirate_quant-ph_unnormalized.csv`) includes all papers that appeared between 2012 and 2016 irrespective of the number of Scites, but its author names are not normalized. The number of Scites for a given paper may differ slightly between the two datasets because the unnormalized version was compiled more than a month later.
# Acknowledgements
Many thanks to SciRate for tolerating my crawling trials and not blacklisting my IP address.
# Inspiration
Unleash topic models and author analysis to find out what or who is hot in quantum physics today. Build a generative model to write trendy fake titles, as [SnarXiv](http://snarxiv.org/) does for hep-th.
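As a starting point for the fake-title idea, here is a minimal sketch of a first-order word-level Markov chain. It is just one simple choice of generative model, not necessarily how SnarXiv works, and the two titles below are placeholders standing in for the title column of the dataset.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # sentinel tokens marking title boundaries


def build_chain(titles):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for title in titles:
        words = [START] + title.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate_title(chain, rng, max_words=20):
    """Walk the chain from START until END or the word limit is reached."""
    word, out = START, []
    while len(out) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        out.append(word)
    return " ".join(out)


# Placeholder titles, NOT taken from the dataset.
titles = [
    "placeholder title about topic one",
    "another title about topic two",
]
chain = build_chain(titles)
fake = generate_title(chain, random.Random(0))
print(fake)
```

With the real titles as input, shared words such as "quantum" become branching points where the generator can jump from one title into another, which is where the fake titles come from.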
