


Data Breaches 2004-2017 (EN)


Categories: Internet, Data Visualization

Data structure: 0.09M


Context

Data breaches: incidents worldwide that compromised more than 30,000 records, between 2004 and 2017. English version. I wanted to visualize the data in a way that makes it possible to compare numbers between variable levels. I made some improvements to the levels of the variables and to the data itself, and then built a visualization with Tableau software. I also uploaded [this version of the dataset in Spanish][1]. You can read more about it in this post on my blog: [Spanish version][2] and [English version][3]. You can also see the visualization at this link: [Spanish version][4] and [English version][5].

Content

The dataset has 270 observations and 11 variables, most of them categorical. The incidents happened between 2004 and 2017. Last updated: February 2018. Format: CSV2.

Variables (columns) [EN]:

- Entity: name of the organization (public or private) that had the breach. String
- Alternative Name: other known names of the entity. String
- Story: a summary of what happened. String
- Year: year of the breach. Date
- Records Lost: number of records that the breach compromised. Integer
- Sector: the organization's main sector (or field of business). String
- Method of Leak: main cause of the breach. String
- 1st source (link): first URL with more information about the breach. String
- 2nd source (link): second URL with more information about the breach. String
- 3rd source (link): third URL with more information about the breach. String
- Source name: name of the included source (news outlet, official report, blog, etc.). Note that some of them changed after I replaced broken links that the original dataset had. String

Acknowledgements

[][6]. Before the improvements, a first dataset was downloaded from this site at the end of 2017.
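The column list under Content can be read with Python's standard `csv` module. This is only a minimal sketch: it assumes that "CSV2" means a semicolon-delimited file (the convention used by R's `write.csv2`), it uses a subset of the column names listed above, and the two sample rows are invented placeholders rather than records from the dataset.

```python
import csv
import io

# Hypothetical sample in the assumed CSV2 layout (semicolon-delimited).
# The rows are invented placeholders, not records from the dataset.
SAMPLE = """Entity;Year;Records Lost;Sector;Method of Leak
Example Corp;2016;1500000;web;hacked
Sample Agency;2017;45000;government;lost device
"""


def read_breaches(handle):
    """Parse semicolon-delimited rows and convert the numeric columns to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter=";"):
        row["Year"] = int(row["Year"])
        row["Records Lost"] = int(row["Records Lost"])
        rows.append(row)
    return rows


breaches = read_breaches(io.StringIO(SAMPLE))
print(breaches[0]["Entity"], breaches[0]["Records Lost"])
```

For the real file, the same function could be given a handle opened with `open(path, newline="")`, where `path` is wherever the downloaded CSV was saved. The delimiter should be checked against the actual file first.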
Inspiration

The main question to be answered with the data visualization was: "What quantities of records were compromised by important data breaches, in organizations and sectors, between 2004 and 2017, and what was the reason?" I wanted a visual answer that makes it possible to compare numbers across year, sector, and method of leak. It would be great to improve the dataset in the future by adding new variables for data mining. A complete and exhaustive "Data Breaches 2004-2017" dataset would support an in-depth analysis of the incidents in this period. 2017 has been the worst year in history.

[1]: [2]: [3]: [4]: [5]: [6]:
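The comparison across year, sector, and method of leak described under Inspiration amounts to summing Records Lost per level of each variable. The sketch below illustrates that with plain Python. The helper name `total_records_by` is hypothetical, and the rows are invented placeholders that only borrow the column names from the description. It is not the author's Tableau workflow.

```python
from collections import defaultdict

# Invented placeholder rows; keys follow the column names in the description.
rows = [
    {"Year": 2016, "Sector": "web", "Method of Leak": "hacked", "Records Lost": 1500000},
    {"Year": 2017, "Sector": "web", "Method of Leak": "hacked", "Records Lost": 500000},
    {"Year": 2017, "Sector": "government", "Method of Leak": "lost device", "Records Lost": 45000},
]


def total_records_by(rows, key):
    """Sum the 'Records Lost' column for each distinct value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["Records Lost"]
    return dict(totals)


by_year = total_records_by(rows, "Year")
by_sector = total_records_by(rows, "Sector")
by_method = total_records_by(rows, "Method of Leak")
print(by_year)
```

Each resulting dictionary maps one level (a year, a sector, or a method) to its total, which is the quantity a bar chart per variable would display.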


