Bio-medical texts contain a lot of information that can be used for
developments in the medical field. Traditionally, domain experts
extracted such information manually. Automating this information
extraction task can help speed up progress in the field.
To name a few use cases, bio-medical events can show the effects of
drugs on a person. They can also be used to identify certain medical
conditions in a person.
Hence, automating the extraction of events from bio-medical texts is very beneficial.
The dataset is a simplified version of the event-annotated GENIA dataset, derived from the version available in TEES.
It consists of the original bio-medical text, the labelled trigger words,
the location of each trigger word in the text, and the event type associated with
the trigger word.
There are 3 sets of data: train (8k+ sentences), devel (about 3k sentences) and test (about 3k sentences).
Each set has 4 columns, namely "Sentence", "TriggerWord",
"TriggerWordLoc" and "EventType". They capture the original bio-medical
text, the trigger words in the sentence, the location of the trigger words in
the sentence, and the event type associated with the trigger words.
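As a rough illustration of the column layout described above, the following Python sketch reads one split with the standard `csv` module, checks that the four documented columns are present, and counts the values in "EventType". The helper names `read_split` and `event_type_counts` are hypothetical, and the two sample rows are invented placeholders rather than real records. The format of the "TriggerWordLoc" values is not described here, so the sketch treats them as opaque strings.

```python
import csv
import io
from collections import Counter

# The four column names documented for every split.
EXPECTED_COLUMNS = ["Sentence", "TriggerWord", "TriggerWordLoc", "EventType"]


def read_split(handle):
    """Read one split (train, devel or test) from an open CSV file handle.

    Hypothetical helper: returns a list of dicts keyed by column name and
    raises ValueError if any documented column is missing.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def event_type_counts(rows):
    """Count how often each EventType value occurs in a split."""
    return Counter(row["EventType"] for row in rows)


# Invented placeholder rows, used only to exercise the helpers.
# They are NOT taken from the dataset.
SAMPLE = io.StringIO(
    "Sentence,TriggerWord,TriggerWordLoc,EventType\n"
    "placeholder sentence one,word,0,TypeA\n"
    "placeholder sentence two,word,0,TypeA\n"
)

rows = read_split(SAMPLE)
counts = event_type_counts(rows)
```

With the real data, the handle would come from opening one of the .csv files, for example `open(path, newline="", encoding="utf-8")`, where `path` points at whichever split is being inspected.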
The original source dataset is from the BioNLP Shared Task 2011.
A complete unprocessed version seems to be present in the genia-event-2011 dataset too.
For TEES licensing information, please refer to this link.
For GENIA dataset licensing information, please refer to the file
"GE11-LICENSE" present beside the data files (.csv) in this Kaggle dataset.
Photo Credits: Louis Reed on Unsplash