Documentation
This website provides four different approaches to documentation for the THPC19 Dataset. This structure comes from the Divio Documentation system, which you can read more about here.
- Reference Guides: Start here if you want just the facts on the contents of the dataset
- Explanation: Start here if you want a more detailed exploration of the dataset: where does it come from, and who does it include?
- Tutorials: Start here if you want the basics of how to load up the dataset and start working with it
- How-To Guides: Start here if you’d like an overview of the analysis and methods you can run on the dataset
1 - Data Dictionary
Schema and contents of the data tables
Data Table Overview
Tables in the dataset are as follows:
2 - Explanation and Background
Exploration of the contents of the dataset
The Trillium Health Partners COVID-19 (THPC19) dataset comprises de-identified health-related data for a cohort of 509 patients who met the criteria for likely COVID-19 and were admitted to THP via the Emergency Department between November 1st, 2020 and March 15th, 2021.
Data is available in three separate tables: one representing patient data at admission, one representing daily stay data for days 1 through 7 and day 14, and one representing patient outcome data.
Variables
The data tables include the following variables and variable types, among others. For additional detail, please see the Reference section of the documentation.
- Patient demographic information
- Free text data, manually collected (patient comorbidities)
- Numeric values (vitals, labs)
- Indicators
2.1 - Data at Admission
Description
This table includes data collected at admission for patients with suspected COVID-19. Data includes basic demographic data (age and sex), some text data (comorbidities), and initial tests and labs.
2.2 - Day Breakdown
Description
This table includes a breakdown of patient information over time, for each day of admission (measured from 6:00am to 5:59am) on the first 7 days as well as day 14. Note that day 0 is the day of ER presentation and day 1 is the day of hospital admission, and that these two may be the same. The parent_id column links to both the Admission dataset (on id) and the Outcomes dataset (on parent_id).
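The 6:00am to 5:59am day window described above can be sketched in code. This is a minimal illustration, not the dataset's own logic: the function name `stay_day_date` is hypothetical, and the sketch only shows which window a timestamp falls into. It does not assign day numbers such as day 0 or day 1.

```python
from datetime import date, datetime, timedelta

# Assumption: each stay day runs from 06:00 to 05:59 the next morning,
# as described in the text above. The function name is hypothetical.
DAY_START_HOUR = 6


def stay_day_date(ts: datetime) -> date:
    """Return the calendar date on which the 06:00-05:59 window containing ts began."""
    # Shifting back by six hours maps 06:00 onto midnight, so the date of the
    # shifted timestamp is the date on which the window started.
    return (ts - timedelta(hours=DAY_START_HOUR)).date()


# 05:59 still belongs to the window that started the previous morning.
print(stay_day_date(datetime(2021, 1, 10, 5, 59)))  # 2021-01-09
# 06:00 opens a new window.
print(stay_day_date(datetime(2021, 1, 10, 6, 0)))   # 2021-01-10
```

A measurement taken shortly after midnight is therefore grouped with the previous morning's window rather than with the new calendar date.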
2.3 - Outcomes
Description
This table includes detailed information on the outcome of each patient listed in the dataset. The parent_id column links to both the Admission dataset (on id) and the Day Breakdown dataset (on parent_id).
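The links between the three tables can be illustrated with a small join. The sketch below uses only the Python standard library and made-up rows. Only the `id` and `parent_id` column names come from the descriptions above; `age`, `day`, and `outcome` are hypothetical column names, and the sketch assumes one outcome row per patient.

```python
# Made-up illustrative rows. Only `id` and `parent_id` are documented names;
# `age`, `day`, and `outcome` are hypothetical.
admission = [{"id": 1, "age": 64}, {"id": 2, "age": 51}]
day_breakdown = [
    {"parent_id": 1, "day": 1},
    {"parent_id": 1, "day": 2},
    {"parent_id": 2, "day": 1},
]
outcomes = [
    {"parent_id": 1, "outcome": "discharged"},
    {"parent_id": 2, "outcome": "discharged"},
]


def join_tables(admission, day_breakdown, outcomes):
    """Attach admission and outcome fields to every day-breakdown row."""
    admission_by_id = {row["id"]: row for row in admission}
    # Assumption: at most one outcome row per patient.
    outcome_by_parent = {row["parent_id"]: row for row in outcomes}
    joined = []
    for day_row in day_breakdown:
        key = day_row["parent_id"]
        # Inner-join behaviour: skip day rows without a matching patient.
        if key not in admission_by_id or key not in outcome_by_parent:
            continue
        merged = dict(day_row)
        merged.update({k: v for k, v in admission_by_id[key].items() if k != "id"})
        merged.update(
            {k: v for k, v in outcome_by_parent[key].items() if k != "parent_id"}
        )
        joined.append(merged)
    return joined


rows = join_tables(admission, day_breakdown, outcomes)
for row in rows:
    print(row)
```

Each joined row keeps `parent_id` as the patient key, so the result has one row per patient per recorded day.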
3 - Tutorials
Working with and analyzing the data
These tutorials are designed to help answer basic questions and provide the skills you need to start working with and using the data. Check back in as more tutorials are posted over time.
3.1 - Loading Demographic Data
How do I load and work with demographic data?
In this tutorial, you’ll learn how to load in your first dataset and use it to create a descriptive plot of a demographic variable (in this case, patient sex).
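As a rough preview of that workflow, the sketch below counts the values of a demographic column and prints a simple text bar chart. It is not the tutorial's own code: the column name `sex`, its values, and the inline sample rows are all assumptions standing in for the real admission table, and `count_by_column` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Stand-in for the admission table. In practice you would open the real file;
# the column names and values here are made up for illustration.
SAMPLE_CSV = """id,age,sex
1,64,F
2,51,M
3,77,F
"""


def count_by_column(csv_file, column):
    """Count how many rows take each value of `column`."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


counts = count_by_column(io.StringIO(SAMPLE_CSV), "sex")

# Print one bar per category, using '#' characters as a text-only plot.
for value, n in sorted(counts.items()):
    print(f"{value:<3}{'#' * n} ({n})")
```

The same counts could be passed to a plotting library to draw a bar chart instead of printing text.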
4 - Getting Started
Accessing the platform and loading in the data
To start using the dataset, access the Health Data Nexus. Check out this page for information on how to use the data.