Medium Stories

634.48M
Business, Literature, Text Data Classification


    README.md

Building a data-set of 1.4 million stories

Over the past few weeks I have been building a massive data-set of Medium stories. Initially, my goal was to better understand Medium’s clap-metric, but I quickly realized that this data could accomplish MUCH more than just that.

Imagine if writers could choose a title with statistical models. Or if readers could automatically subscribe to the best authors in their field. Even answering basic questions like “How many claps should I give?” would be extremely valuable to both the writers and readers of Medium.

These are all things that can be created with the right data and enough man-power. We have the data, now we just need your help.

----------

Intro to the Data

The data-set consists of 1.4 million stories from 95 of Medium’s most popular story-tags. Every story was published between August 1st, 2017 and August 1st, 2018.

I chose to collect the contents of story cards rather than the contents of entire stories for a few reasons. First, I didn’t want to run into any issues with Medium’s ownership rules. Second, it is around 90x faster to scrape story-cards than it is to scrape entire articles (which means more data for less time and less memory).

Here is the full list of the information I was able to collect for each story: Title, Sub-Title, Author, Publication, Date, Tags, Read-Time, Claps-Received, Story-URL, and Author-URL.

----------

If you want a more in-depth introduction to the data-set, look at my GitHub. In the following repository I published a data analysis notebook to answer some of the most interesting questions about Medium’s readers, authors, and publications. Here’s the link.

https://github.com/harrisonjansma/Analyzing_Medium/blob/master/Medium_EDA_expanded.ipynb

[1]: https://cdn-images-1.medium.com/max/1250/1*MHsIo8FAYDcGJLxiqVOWfw.png
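As a rough illustration of how the per-story fields listed above might be used, here is a minimal sketch that computes the median number of claps per tag. It is not taken from the author's notebook. The column headers, the `;` tag separator, the helper name `median_claps_by_tag`, and the sample rows are all assumptions made up for this example; the real file may name and encode these fields differently.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Hypothetical column names and made-up rows, for illustration only.
# The real data-set's headers and tag encoding may differ.
SAMPLE_CSV = """Title,Author,Tags,Read-Time,Claps-Received
Example story A,author_one,ai;data-science,5,120
Example story B,author_two,ai,3,0
Example story C,author_one,data-science,7,40
"""


def median_claps_by_tag(csv_file):
    """Return {tag: median claps} for a story-card CSV with the columns above."""
    claps = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Assumption: multiple tags are stored in one cell, separated by ';'.
        for tag in row["Tags"].split(";"):
            claps[tag].append(int(row["Claps-Received"]))
    return {tag: median(values) for tag, values in claps.items()}


result = median_claps_by_tag(io.StringIO(SAMPLE_CSV))
print(result)
```

To run this against the actual download, the in-memory sample would be replaced by an opened file handle, and the column names adjusted to whatever headers the file really uses.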