README.md
Data Set Information:
Relevant Information:
All data is fully anonymized.
Data was originally collected from 19 participants, but the TAC readings of 6 participants were deemed unusable by SCRAM [1]. The data included is from the remaining 13 participants.
Accelerometer data was collected from smartphones at a sampling rate of 40 Hz (file: all_accelerometer_data_pids_13.csv). The file contains 5 columns: a timestamp, a participant ID, and a sample from each of the three accelerometer axes. Data was collected from a mix of 11 iPhones and 2 Android phones, as noted in phone_types.csv.
TAC data was collected using SCRAM [2] ankle bracelets at 30-minute intervals. The raw TAC readings are in the raw_tac directory. TAC readings that are more readily usable for processing are in the clean_tac directory and have two columns: a timestamp and a TAC reading.
The cleaned TAC readings (1) were processed with a zero-phase low-pass filter to smooth noise without shifting phase, and (2) were shifted backwards by 45 minutes so the labels more closely match the true intoxication of the participant, since alcohol takes about 45 minutes to exit through the skin. Please see the study cited under Past Usage for more details on how the data was processed ([Web link]).
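The two cleaning steps described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the README does not state the filter type or cutoff, so the first-order exponential smoother, the `alpha` parameter, and the function names below are assumptions chosen only to show the forward-backward (zero-phase) idea. "Shifted backwards" is interpreted here as subtracting 45 minutes from each timestamp.

```python
from typing import List, Tuple

SHIFT_SECONDS = 45 * 60  # 45-minute backward shift described in the README


def smooth_forward(values: List[float], alpha: float) -> List[float]:
    """One causal pass of a first-order (exponential) low-pass filter."""
    out: List[float] = []
    prev = values[0]
    for v in values:
        prev = alpha * v + (1.0 - alpha) * prev
        out.append(prev)
    return out


def zero_phase_lowpass(values: List[float], alpha: float = 0.5) -> List[float]:
    """Filter forward, then backward, so the two phase delays cancel.

    Assumption: the real filter design is not given in the README; this
    smoother is only a stand-in for the zero-phase technique.
    """
    if not values:
        return []
    forward = smooth_forward(values, alpha)
    backward = smooth_forward(forward[::-1], alpha)
    return backward[::-1]


def clean_tac(readings: List[Tuple[int, float]], alpha: float = 0.5) -> List[Tuple[int, float]]:
    """Smooth (unix_seconds, tac) pairs sorted by time, then shift them 45 minutes earlier."""
    times = [t for t, _ in readings]
    smoothed = zero_phase_lowpass([v for _, v in readings], alpha)
    return [(t - SHIFT_SECONDS, v) for t, v in zip(times, smoothed)]
```

Running the filter in both directions keeps a peak at the sample where it occurred, which a single causal pass would delay. The files in clean_tac already have both steps applied, so a sketch like this would only be relevant when starting from the raw readings.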
1 - [Web link]
2 - J. Robert Zettl. The determination of blood alcohol concentration by transdermal measurement. [Web link], 2002.
Number of Instances:
Accelerometer readings: 14,057,567
TAC readings: 715
Participants: 13
Number of Attributes:
- Time series: 3 axes of accelerometer data (columns x, y, z in all_accelerometer_data_pids_13.csv)
- Static: 1 phone-type feature (in phone_types.csv)
- Target: 1 time series of TAC for each of the 13 participants (in clean_tac directory).
For Each Attribute:
(Main)
all_accelerometer_data_pids_13.csv:
time: integer, unix timestamp, milliseconds
pid: symbolic, 13 categories listed in pids.txt
x: continuous, time-series
y: continuous, time-series
z: continuous, time-series
clean_tac/*.csv:
timestamp: integer, unix timestamp, seconds
TAC_Reading: continuous, time-series
phone_types.csv:
pid: symbolic, 13 categories listed in pids.txt
phonetype: symbolic, 2 categories (iPhone, Android)
(Other)
raw_tac/*.xlsx:
TAC Level: continuous, time-series
IR Voltage: continuous, time-series
Temperature: continuous, time-series
Time: datetime
Date: datetime
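Because the accelerometer timestamps are in milliseconds and the clean TAC timestamps are in seconds, the two need a common unit before they can be joined. The following is a rough sketch of one way to do that with the Python standard library. It assumes each CSV has a header row using the column names listed above; the function names, the "most recent reading" lookup rule, and the sample rows are illustrative and not part of the dataset.

```python
import bisect
import csv
import io
from typing import List, Optional, Tuple

AccRow = Tuple[int, str, float, float, float]


def read_accelerometer(handle) -> List[AccRow]:
    """Parse rows with columns time (ms), pid, x, y, z."""
    reader = csv.DictReader(handle)
    return [
        (int(r["time"]), r["pid"], float(r["x"]), float(r["y"]), float(r["z"]))
        for r in reader
    ]


def read_tac(handle) -> Tuple[List[int], List[float]]:
    """Parse rows with columns timestamp (s), TAC_Reading; return time-sorted lists."""
    reader = csv.DictReader(handle)
    rows = sorted((int(r["timestamp"]), float(r["TAC_Reading"])) for r in reader)
    return [t for t, _ in rows], [v for _, v in rows]


def tac_at(times_s: List[int], values: List[float], time_ms: int) -> Optional[float]:
    """Most recent TAC reading at or before time_ms, or None if there is none."""
    idx = bisect.bisect_right(times_s, time_ms / 1000.0) - 1
    return values[idx] if idx >= 0 else None


# Made-up rows and participant ID, only to show the expected shapes.
acc_rows = read_accelerometer(
    io.StringIO("time,pid,x,y,z\n1500500,PID_A,0.1,-0.2,0.3\n")
)
tac_times, tac_values = read_tac(
    io.StringIO("timestamp,TAC_Reading\n2000,0.09\n1000,0.01\n")
)
label = tac_at(tac_times, tac_values, acc_rows[0][0])
```

With real files, the `io.StringIO` objects would be replaced by handles from `open(...)`, and the TAC file would be chosen per participant from the clean_tac directory.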
Missing Attribute Values:
None
Target Distribution:
TAC is measured in g/dl, where 0.08 is the legal limit for intoxication while driving.
Mean TAC: 0.065 +/- 0.182
Max TAC: 0.443
TAC Inner Quartiles: 0.002, 0.029, 0.092
Mean Time-to-last-drink: 16.1 +/- 6.9 hrs
Relevant Papers:
Past Usage:
(a) Complete reference of article where it was described/used:
Killian, J.A., Passino, K.M., Nandi, A., Madden, D.R. and Clapp, J., Learning to Detect Heavy Drinking Episodes Using Smartphone Accelerometer Data. In Proceedings of the 4th International Workshop on Knowledge Discovery in Healthcare Data co-located with the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019) (pp. 35-42). [Web link]
(b) Indication of what attribute(s) were being predicted
Features: Three-axis time series accelerometer data
Target: Time series transdermal alcohol content (TAC) data (real-time measure of intoxication)
(c) Indication of study's results
The study decomposed each time series into 10-second windows and performed binary classification to predict whether each window corresponded to an intoxicated participant (TAC >= 0.08) or a sober participant (TAC < 0.08). The study tested several models and achieved a test accuracy of 77.5% with a random forest.
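The windowing and labeling described above might look roughly like the sketch below. The README does not say how the study aligned its windows or whether they overlapped, so the non-overlapping windows keyed by their start time are an assumption, as are the function and constant names. At the stated 40 Hz sampling rate, a full 10-second window would hold about 400 samples.

```python
from typing import Dict, List, Tuple

WINDOW_MS = 10_000      # 10-second windows, in milliseconds
TAC_THRESHOLD = 0.08    # intoxicated if TAC >= 0.08

Sample = Tuple[int, float, float, float]  # (time_ms, x, y, z)


def window_samples(samples: List[Sample]) -> Dict[int, List[Tuple[float, float, float]]]:
    """Group samples into non-overlapping 10-second windows keyed by window start (ms).

    Assumption: windows are aligned to multiples of 10 s; the study's exact
    alignment is not given in the README.
    """
    windows: Dict[int, List[Tuple[float, float, float]]] = {}
    for t, x, y, z in samples:
        start = (t // WINDOW_MS) * WINDOW_MS
        windows.setdefault(start, []).append((x, y, z))
    return windows


def label_window(tac_value: float) -> int:
    """Binary target: 1 for intoxicated (TAC >= 0.08), 0 for sober."""
    return 1 if tac_value >= TAC_THRESHOLD else 0
```

Each window's TAC value would come from the clean_tac series for the same participant. Features computed per window could then be passed to a classifier such as the random forest mentioned above.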
Citation Request:
When using this dataset, please cite: Killian, J.A., Passino, K.M., Nandi, A., Madden, D.R. and Clapp, J., Learning to Detect Heavy Drinking Episodes Using Smartphone Accelerometer Data. In Proceedings of the 4th International Workshop on Knowledge Discovery in Healthcare Data co-located with the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019) (pp. 35-42). [Web link]
(a) Owner of database
Jackson A Killian (jkillian '@' g.harvard.edu, Harvard University); Danielle R Madden (University of Southern California); John Clapp (University of Southern California)
(b) Donor of database
Jackson A Killian (jkillian '@' g.harvard.edu, Harvard University); Danielle R Madden (University of Southern California); John Clapp (University of Southern California)
(c) Date collected
May 2017
(d) Date submitted
Jan 2020