Building a db file on a subset of the MIMIC-III data
Prerequisites
- [x] Put an X between the brackets on this line if you have done all of the following:
- Checked the online documentation: https://mimic.mit.edu/
- Checked that your issue isn't already addressed: https://github.com/MIT-LCP/mimic-code/issues?utf8=%E2%9C%93&q=
Description
I have successfully built a MIMIC db file using the shell script for SQLite.
I am curious whether the same code can be run to build a db file on a subset of the data.
Thanks!
What format do you expect a db file to be?
To clarify, I have managed to run the shell script that compiles all the csv.gz files into a single SQLite database file. I was wondering if I could do the same thing with, say, 10% of the MIMIC-III patients, or any other fraction of the dataset.
Yes, for sure! You can run the same code using the demo dataset: https://physionet.org/content/mimiciii/
That would give you a 100 patient subset.
Hi, thanks for your response.
Does the MIMIC demo dataset contain the full set of information for its 100 patients, or is it a condensed version?
Also, what if I wanted to increase from 100 to 200 patients? How would I go about that?
The demo dataset is simply a filter on all the tables in the database, requiring the subject_id to be in a list of 100 subject_id values selected a priori. We also remove the noteevents table.
You can easily recreate this if you have the full dataset and expand the subject_id list.
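For anyone wanting to try this, here is a rough sketch of the filtering described above, not the script used to produce the demo. It assumes the full dataset is a directory of `*.csv.gz` files with a header row, that patient-level tables carry a `subject_id` column (matched case-insensitively), and that tables without that column (such as the dictionary tables) are copied whole. The function names `sample_subject_ids` and `build_subset_db` are made up for this example, and every column is stored as TEXT to keep it short, so the schema will differ from the one the shell script creates.

```python
import csv
import gzip
import random
import sqlite3
from pathlib import Path


def sample_subject_ids(patients_csv_gz, fraction, seed=0):
    """Pick a reproducible random fraction of subject_ids from the patients file."""
    with gzip.open(patients_csv_gz, "rt", newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = [h.strip().lower() for h in next(reader)]
        idx = header.index("subject_id")
        ids = sorted({row[idx] for row in reader if row})
    k = max(1, round(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def build_subset_db(csv_dir, db_path, subject_ids, skip_tables=("noteevents",)):
    """Load each *.csv.gz in csv_dir into SQLite, keeping only the given subject_ids."""
    keep = {str(s) for s in subject_ids}
    conn = sqlite3.connect(db_path)
    try:
        for path in sorted(Path(csv_dir).glob("*.csv.gz")):
            table = path.name[: -len(".csv.gz")].lower()
            if table in skip_tables:
                continue  # mirrors the demo, which drops noteevents
            with gzip.open(path, "rt", newline="", encoding="utf-8") as fh:
                reader = csv.reader(fh)
                header = [h.strip().lower() for h in next(reader)]
                # Simplification: every column is TEXT (assumption, not the real schema).
                cols = ", ".join(f'"{h}" TEXT' for h in header)
                conn.execute(f'DROP TABLE IF EXISTS "{table}"')
                conn.execute(f'CREATE TABLE "{table}" ({cols})')
                rows = (r for r in reader if len(r) == len(header))
                if "subject_id" in header:
                    idx = header.index("subject_id")
                    rows = (r for r in rows if r[idx] in keep)
                marks = ", ".join("?" for _ in header)
                conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
        conn.commit()
    finally:
        conn.close()
```

To get roughly 10% of patients you could call `sample_subject_ids("mimic/PATIENTS.csv.gz", 0.10)` and pass the result to `build_subset_db("mimic", "mimic_subset.db", ids)` (the paths here are placeholders). To go from 100 to 200 patients, pass the existing list of 100 subject_id values plus 100 more of your choosing.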