Datasets Overview

We have put together a collection of datasets for your use. You will find the datasets here.

Structure

The data is represented in a CSV file in a single structured format:

| participant-id | session-id | timestamp (s) | source | DATA_TYPE_1 | DATA_TYPE_2 | AND ETC |
| -- | -- | -- | -- | -- | -- | -- |

The CSV parsers in our starter apps are built to handle files in this format. As long as you import a CSV file in this format, the data will be parsed and stored correctly in the mobile applications.

Details

participant-id: id of the participant

session-id: Data collected from a participant can be broken down into multiple sessions. This value is optional because not every dataset is split into sessions.

timestamp: Unix timestamp (time since the epoch) in milliseconds

source: the name of the sensor
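As a rough illustration of reading a file in this layout, the sketch below uses Python's standard `csv` module. The sample rows, the `heart_rate` column, and the helper name `read_rows` are made up for the example. It assumes the timestamp header starts with `timestamp` and that values are epoch milliseconds as described above; adjust it if your files differ.

```python
import csv
import io
from datetime import datetime, timezone


def read_rows(text):
    """Parse CSV text in the shared layout into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for raw in reader:
        # Find the timestamp column by prefix, since the header may carry a unit suffix.
        ts_key = next(k for k in raw if k.startswith("timestamp"))
        fixed = ("participant-id", "session-id", "source", ts_key)
        rows.append({
            "participant_id": raw["participant-id"],
            # session-id is optional, so an empty cell becomes None.
            "session_id": raw["session-id"] or None,
            # Assumption: the value is milliseconds since the Unix epoch.
            "time": datetime.fromtimestamp(int(raw[ts_key]) / 1000, tz=timezone.utc),
            "source": raw["source"],
            # Every remaining column is treated as a numeric sensor reading.
            "data": {k: float(v) for k, v in raw.items() if k not in fixed and v != ""},
        })
    return rows


# Hypothetical sample data, not taken from the real datasets.
sample = (
    "participant-id,session-id,timestamp,source,heart_rate\n"
    "5,2,1500000000000,empatica,72\n"
    "5,,1500000001000,empatica,74\n"
)
rows = read_rows(sample)
print(rows[0]["time"].isoformat(), rows[0]["data"])
```

The four fixed columns are pulled out by name and everything else is kept in a `data` dictionary, so the same function works regardless of which data-type columns a given file contains.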

Dataset Details

OpenICE: Generate Simulated Data

OpenICE will let you generate simulated data by creating an ICE Device Adapter under the Data Recorder app. Check out their webpage for more info: https://www.openice.info/

MIMIC II

https://physionet.org/mimic2/demo/

This data was collected from ICU patients before they passed away. The data is de-identified and all doctors' notes have been removed. You will find information collected in the ICU, such as lab values, medications, diagnoses, and procedures. A more detailed introduction to the MIMIC II dataset is provided here. Check out the image below for more information on how the data is structured:

Dataset Size: 4000 patients

How to download it: Scroll all the way to the bottom of the page linked above and download the `mimic2dead.sql.gz` file. You can import this data using MySQL Workbench.

Fitbit Dataset

Participant IDs: 1-2

Devices Used: Fitbit

Data Collected: weight, nutrients, calories, steps, activity, and sleep

Details: This data was collected on a Fitbit and obtained from a public website located here.

Fitabase (mTurk) Dataset

Participant IDs: Various IDs

Devices Used: Fitbit

Data Collected: activity, calories, intensities, steps, heart rate, sleep, and weight

Details: This data was collected on a Fitbit and obtained from a public website located here: https://zenodo.org/record/53894.

Empatica Dataset

Participant IDs: 5-10

Devices Used: Empatica

Data Collected:

Details: This data was collected on an Empatica device. Multiple participants took part in the studies; see participant IDs 5 to 10.

Stress Datasets (EDA, ECG, EEG)

Participant IDs: 5-10, 11-13, 19

Devices Used: ECG - Hexoskin, EDA - Empatica, EEG - Mindwave and/or Muse

Data Collected:

  • EDA (for participants 5-10)
    • heart rate, rr interval, skin conductance, and temperature
  • EDA (for participants 11-13 and 19)
    • temperature, electrodermal activity, photoplethysmograph data, accelerometer sensor data, time between heart beats, and heart rate
  • ECG (for participants 11-13 and 19)
    • acceleration, breathing rate, activity, ecg, cadence, expiration, heart rate, inspiration, ventilation, nn interval, rr interval, steps, and tidal volume
  • EEG (for participants 11-13 and 19)
    • mindwave: attention, signal quality, meditation level, band power
    • muse: raw eeg values

Details:

The datasets on GitHub include only processed EDA data. The other datasets (ECG and EEG) were too large to host on GitHub, so you will have to download them separately. The raw, unprocessed stress datasets are available here: https://ibm.box.com/s/fobxq6z5ah49l8f6xc2vfvgf6t2dou6s. Note that these datasets are unprocessed and each is in its own format. See the information below on how to interpret the raw datasets.

Processed EDA Data:

This data was collected on an Empatica device. Multiple participants took part in the studies; see participant IDs 11-13 and 19.

  • ACC.csv: Data from a 3-axis accelerometer sensor
    • The accelerometer was configured to measure acceleration in the range [-2g, 2g]. Therefore the unit in this file is 1/64g.
    • Data from the x, y, and z axes are in the 5th, 6th, and 7th columns, respectively.
  • BVP.csv: Data from a photoplethysmograph
  • EDA.csv: Data from an electrodermal activity sensor expressed as microsiemens (μS)
  • HR.csv: Average heart rate extracted from the BVP signal
    • The 5th column is the sample rate expressed in Hz.
  • IBI.csv: Time between an individual's heartbeats (inter-beat interval), extracted from the BVP signal
    • The 5th column is the duration in seconds (s) of the detected inter-beat interval (i.e., the time elapsed since the previous beat).
  • TEMP.csv: Data from the temperature sensor, expressed in degrees Celsius (°C)
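The unit conversion for ACC.csv described above can be sketched as follows. This is a minimal example, assuming each row has already been split into a list of strings and that the first four columns are the shared participant-id, session-id, timestamp, and source fields; the sample row and the function names are made up for illustration.

```python
import math

ACC_UNITS_PER_G = 64  # ACC.csv stores acceleration in units of 1/64 g


def acc_row_to_g(row):
    """Convert one ACC.csv row (list of strings) to (x, y, z) in g.

    Columns 5, 6 and 7 (zero-based indices 4, 5, 6) hold the x, y and z axes.
    """
    x, y, z = (float(row[i]) / ACC_UNITS_PER_G for i in (4, 5, 6))
    return x, y, z


def magnitude(x, y, z):
    """Length of the acceleration vector, in g."""
    return math.sqrt(x * x + y * y + z * z)


# Hypothetical row: participant-id, session-id, timestamp, source, x, y, z
row = ["11", "", "1500000000000", "empatica", "0", "0", "64"]
x, y, z = acc_row_to_g(row)
print(x, y, z, magnitude(x, y, z))
```

A device lying still should report a magnitude close to 1 g, which is a quick sanity check that the columns and the scale factor were read correctly.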

Raw, Unprocessed ECG Data:

This data was collected from Hexoskin. The .wav files are the compressed version of the data collected from Hexoskin. Check out this page for more information on how to parse them.

The CSV files were generated from Hexoskin APIs. Check out this page for more information: https://api.hexoskin.com/docs/index.html#introduction

Note: the folder names for the raw stress datasets follow the structure "subjectID__location__date". Location is either 1 (in the field) or 0 (in the lab). The date is in yymmdd format. Also note that the subject IDs in the raw datasets (on Box) do not map directly to those in the processed datasets (on GitHub). See the participant-id-mapping.csv file for the mapping.
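The folder naming scheme in the note above can be unpacked with a few lines of Python. This is only a sketch: the folder name `7__1__170314`, the `RawFolder` type, and the function name are hypothetical. It also does not translate the subject ID to a processed participant ID, since the columns of participant-id-mapping.csv are not described here.

```python
from datetime import date, datetime
from typing import NamedTuple


class RawFolder(NamedTuple):
    subject_id: str
    in_field: bool  # True for location 1 (in the field), False for 0 (in the lab)
    recorded_on: date


def parse_folder_name(name):
    """Split a "subjectID__location__date" folder name into its parts."""
    subject_id, location, yymmdd = name.split("__")
    if location not in ("0", "1"):
        raise ValueError(f"unexpected location {location!r} in {name!r}")
    recorded_on = datetime.strptime(yymmdd, "%y%m%d").date()
    return RawFolder(subject_id, location == "1", recorded_on)


info = parse_folder_name("7__1__170314")  # hypothetical folder name
print(info)
```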

Raw, Unprocessed EEG Data:

This data was collected from a Mindwave or Muse device. You can determine which device a file came from by checking its file type: Mindwave data is in a text file, while Muse data is in a CSV file.
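The file-type rule above can be written as a small lookup. This sketch assumes that "text file" means a `.txt` extension and "CSV file" means `.csv`; both the extensions and the function name are assumptions, so check them against the files you actually download.

```python
from pathlib import Path

# Assumed extensions: Mindwave exports text files, Muse exports CSV files.
DEVICE_BY_SUFFIX = {".txt": "Mindwave", ".csv": "Muse"}


def eeg_device(path):
    """Guess which EEG device produced a raw file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in DEVICE_BY_SUFFIX:
        raise ValueError(f"cannot determine EEG device for {path!r}")
    return DEVICE_BY_SUFFIX[suffix]


print(eeg_device("session1.txt"), eeg_device("recording.CSV"))
```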

To understand/parse Mindwave data, check out the links below:

To understand/parse Muse data, check out the link below:

Note: the folder names for the raw stress datasets follow the structure "subjectID__location__date". Location is either 1 (in the field) or 0 (in the lab). The date is in yymmdd format. Also note that the subject IDs in the raw datasets (on Box) do not map directly to those in the processed datasets (on GitHub). See the participant-id-mapping.csv file for the mapping.

iHealth BP

Participant IDs: 3

Devices Used: iHealth Wrist Blood Pressure Monitor

Data Collected: heart rate, rr interval, skin conductance, and temperature

Details: None

iHealth Pulse Ox

Participant IDs: 3

Devices Used: iHealth Finger Pulse Oximeter

Data Collected: SpO2 and heart rate

Details: None

Jawbone Dataset

Participant IDs: 4

Devices Used: Jawbone Up Move

Data Collected: steps, calories, weight, etc.

Details: None

Participant ID Metadata

These participant IDs refer to the IDs used in our datasets here.

| User ID | Sessions | Notes |
| -- | -- | -- |
| 1 | None | Gender: Male. |
| 2 | None | Gender: Male. |
| 3 | None | Gender: Female. |
| 4 | None | Gender: Female. |
| 5 | 2, 3 | None |
| 6 | 4, 5 | None |
| 7 | 6, 7 | None |
| 8 | 8, 9 | None |
| 9 | 10 | None |
| 10 | 11 | None |
| Various IDs | None | This is pertaining to the csv files inside fitbit-mturk. See here for more details: https://zenodo.org/record/53894 |

Additional Data
