Data structure: 27148.7M
* The analysis above was generated automatically by the system; the actual data takes precedence.
README.md
Dataset Description
In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 200,000 scholarly articles, including over 100,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease. There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, making it difficult for the medical research community to keep up.
Call to Action
We are issuing a call to action to the world's artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high priority scientific questions. The CORD-19 dataset represents the most extensive machine-readable coronavirus literature collection available for data mining to date. This gives the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing COVID-19 response efforts worldwide. There is a growing urgency for these approaches because of the rapid increase in coronavirus literature, making it difficult for the medical community to keep up.
A list of our initial key questions can be found under the **[Tasks](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/tasks)** section of this dataset. These key scientific questions are drawn from the NASEM’s SCIED (National Academies of Sciences, Engineering, and Medicine’s Standing Committee on Emerging Infectious Diseases and 21st Century Health Threats) [research topics](https://www.nationalacademies.org/event/03-11-2020/standing-committee-on-emerging-infectious-diseases-and-21st-century-health-threats-virtual-meeting-1) and the World Health Organization’s [R&D Blueprint](https://www.who.int/blueprint/priority-diseases/key-action/Global_Research_Forum_FINAL_VERSION_for_web_14_feb_2020.pdf?ua=1) for COVID-19.
Many of these questions are suitable for text mining, and we encourage researchers to develop text mining tools to provide insights on these questions.
We are maintaining a summary of the [community's contributions](https://www.kaggle.com/covid-19-contributions). For guidance on how to make your contributions useful, we're maintaining a [forum thread](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/discussion/138484) with the feedback we're getting from the medical and health policy communities.
Prizes
Kaggle is sponsoring a *$1,000 per task* award to the winner whose submission is identified as best meeting the evaluation criteria. The winner may elect to receive this award as a charitable donation to COVID-19 relief/research efforts or as a monetary payment. More details on the prizes and timeline can be found on the [discussion post](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/discussion/135826).
Accessing the Dataset
We have made this dataset available on Kaggle. Watch out for [periodic updates](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/discussion/137474).
The dataset is also hosted on [AI2's Semantic Scholar](https://pages.semanticscholar.org/coronavirus-research). You can also search the dataset using AI2's new [COVID-19 explorer](https://cord-19.apps.allenai.org/).
The licenses for each dataset can be found in the all_sources_metadata csv file.
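As a rough illustration of checking the licenses recorded in the metadata file, the sketch below tallies license values from a CSV. It assumes the file has a column named `license`; the column name, the `count_licenses` helper, and the sample rows are illustrative assumptions, not details confirmed by this page.

```python
import csv
import io
from collections import Counter


def count_licenses(csv_file, license_column="license"):
    """Tally how many rows carry each license value in a metadata CSV.

    `license_column` is an assumed column name; adjust it to match the
    header of the file you actually downloaded.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Missing or blank values are grouped under "unknown".
        value = (row.get(license_column) or "").strip() or "unknown"
        counts[value] += 1
    return counts


# Tiny in-memory stand-in for the real file (made-up rows, for illustration only).
sample = io.StringIO(
    "title,license\n"
    "Paper A,cc-by\n"
    "Paper B,cc-by\n"
    "Paper C,\n"
)
license_counts = count_licenses(sample)
for name, n in license_counts.most_common():
    print(name, n)
```

To run this against the downloaded metadata, pass a file opened with `open(path, newline="", encoding="utf-8")` in place of `sample`, where `path` is wherever you saved the CSV.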
Acknowledgements
![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F1314380%2Fae91071ed467eb59eaaaa42f0b7c040d%2Fcovid-19_partners_logos.png?generation=1591119342897058&alt=media)
This dataset was created by the Allen Institute for AI in partnership with the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research, IBM, and the National Library of Medicine - National Institutes of Health, in coordination with The White House Office of Science and Technology Policy.