This page shows the technical details of what happened when the authorised researcher Ya-Ting Yang requested one or more actions to be run against real patient data in the project, within a secure environment.
By cross-referencing the indicated Requested Actions with the Pipeline section below, you can infer what security level the various outputs were written to. Outputs marked as highly_sensitive can never be viewed directly by a researcher; researchers can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
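The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ACTIONS` dictionary is a hand-copied, in-memory stand-in for two actions from the pipeline (as a YAML parser might return them), and the helper names `output_levels` and `releasable` are hypothetical, not part of any OpenSAFELY tool.

```python
# Hypothetical in-memory form of two actions from the pipeline,
# holding only the fields this sketch needs.
ACTIONS = {
    "pre_matching": {
        "outputs": {
            "highly_sensitive": {
                "csv1": "output/case_covid_hosp.csv",
                "csv2": "output/control_covid_infection.csv",
            },
            "moderately_sensitive": {"html": "output/pre_matching.html"},
        }
    },
    "check_unmatched": {
        "outputs": {
            "moderately_sensitive": {"html": "output/check_unmatched.html"},
        }
    },
}


def output_levels(actions, requested):
    """Map each output path of the requested actions to its security level."""
    levels = {}
    for name in requested:
        for level, files in actions[name]["outputs"].items():
            for path in files.values():
                levels[path] = level
    return levels


def releasable(levels):
    """Return the paths that could be put forward for output review.

    Only moderately_sensitive outputs qualify; highly_sensitive ones never do.
    """
    return sorted(p for p, lvl in levels.items() if lvl == "moderately_sensitive")


levels = output_levels(ACTIONS, ["pre_matching", "check_unmatched"])
print(releasable(levels))
```

Here the two CSV files written by `pre_matching` stay at the highly_sensitive level, so only the two HTML reports appear in the list of candidates for release.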
Jobs
- Job identifier: 7fsdsqe3nsod62ta
Pipeline
project.yaml
version: '3.0'
expectations:
population_size: 1000
actions:
# study cohort
#2022
generate_study_population_covid_primarycare:
run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_primarycare
outputs:
highly_sensitive:
cohort: output/input_covid_primarycare.csv
generate_study_population_covid_SGSS:
run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_SGSS
outputs:
highly_sensitive:
cohort: output/input_covid_SGSS.csv
generate_study_population_covid_admission:
run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_admission
outputs:
highly_sensitive:
cohort: output/input_covid_admission.csv
# matching
pre_matching: # filter incident cases # filter antibiotics
run: r:latest -e 'rmarkdown::render("analysis/matching/pre_matching.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [generate_study_population_covid_primarycare, generate_study_population_covid_SGSS,generate_study_population_covid_admission]
outputs:
highly_sensitive:
csv1: output/case_covid_hosp.csv
csv2: output/control_covid_infection.csv
moderately_sensitive:
html: output/pre_matching.html
matching: #R MatchIt matching with replacement
run: r:latest -e 'rmarkdown::render("analysis/matching/matching.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [pre_matching]
outputs:
moderately_sensitive:
html: output/matching.html
highly_sensitive:
rds1: output/matched_patients.rds
rds2: output/unmatched_cases.rds
csv: output/matched_patients_id.csv # unique patient ID
check_unmatched:
run: r:latest -e 'rmarkdown::render("analysis/matching/check_unmatched.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [matching]
outputs:
moderately_sensitive:
html: output/check_unmatched.html
# extract
extract_variables: # confounders # ab exposure
run: cohortextractor:latest generate_cohort --study-definition study_definition_outcome --with-end-date-fix
needs: [matching]
outputs:
highly_sensitive:
cohort: output/input_outcome.csv
process_Rmatching: # confounders # ab exposure
run: r:latest analysis/process/process_Rmatching.R
needs: [extract_variables,matching]
outputs:
highly_sensitive:
cohort1: output/matched_outcome.rds
cohort2: output/matched_outcome_check.rds # filter died & de-regist again
rds1: output/abtype79.rds
rds2: output/comor17.rds
extract_variables_ab_time: # per ab exposure
run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_time --with-end-date-fix # unique matched patient ID
needs: [matching]
outputs:
highly_sensitive:
cohort: output/input_ab_time.csv
process_ab_time: # exposures #merge ab time with matched patients
run: r:latest -e 'rmarkdown::render("analysis/process/process_ab_time.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [extract_variables_ab_time,process_Rmatching]
outputs:
moderately_sensitive:
html: output/process_ab_time.html
highly_sensitive:
rds: output/matched_ab.rds
# RF
pre_RF_process: # split train and valid set # create category variables of ab exposure
run: r:latest -e 'rmarkdown::render("analysis/RF/pre_RF_process.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
needs: [process_ab_time,process_Rmatching]
outputs:
moderately_sensitive:
html: output/pre_RF_process.html
highly_sensitive:
rds1: output/train.rds
rds2: output/valid.rds
rds3: output/train_cat.rds
rds4: output/valid_cat.rds
model_clogit: # conditional logistic regression for picking exposure variables
run: r:latest -e 'rmarkdown::render("analysis/RF/model_clogit.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
needs: [pre_RF_process]
outputs:
moderately_sensitive:
html: output/model_clogit.html
# classification_tree: #decision tree check # category
# run: r:latest -e 'rmarkdown::render("analysis/classification_tree.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit]
# outputs:
# moderately_sensitive:
# html: output/classification_tree.html
classification_tree: #decision tree check # category
run: r:latest -e 'rmarkdown::render("analysis/RF/classification_tree.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
needs: [pre_RF_process]
outputs:
moderately_sensitive:
html: output/classification_tree.html
# classification_tree_all: #decision tree check # category #all predictor
# run: r:latest -e 'rmarkdown::render("analysis/RF/classification_tree_all.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit]
# outputs:
# moderately_sensitive:
# html: output/classification_tree_all.html
# classification_tree_ind_rpart: #decision tree check # category #per predictor
# run: r:latest -e 'rmarkdown::render("analysis/classification_tree_ind_rpart.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit]
# outputs:
# moderately_sensitive:
# html: output/classification_tree_ind_rpart.html
RF_uni: # category # individual variables
run: r:latest -e 'rmarkdown::render("analysis/RF/RF_uni.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
needs: [pre_RF_process]
outputs:
moderately_sensitive:
html: output/RF_uni.html
# model_RandomForest_decile_cat: # create decile groups for probabilities # get confounders #6:6
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_decile_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit,model_RandomForest_cat]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_decile_cat.html
# rds1: output/development_cat.rds
# rds2: output/validation_cat.rds
# model_cat: # conditional logistic regression for decile groups
# run: r:latest -e 'rmarkdown::render("analysis/model_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile_cat]
# outputs:
# moderately_sensitive:
# html: output/model_cat.html
##### COVID hospital admission
# matching
pre_matching_hosp: # filter incident cases # filter antibiotics
run: r:latest -e 'rmarkdown::render("analysis/matching/pre_matching_hosp.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [generate_study_population_covid_admission]
outputs:
highly_sensitive:
csv1: output/case_covid_icu_death.csv
csv2: output/control_covid_hosp.csv
moderately_sensitive:
html: output/pre_matching_hosp.html
matching_hosp: #R MatchIt matching with replacement
run: r:latest -e 'rmarkdown::render("analysis/matching/matching_hosp.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [pre_matching_hosp]
outputs:
moderately_sensitive:
html: output/matching_hosp.html
highly_sensitive:
rds1: output/matched_patients_hosp.rds
rds2: output/unmatched_cases_hosp.rds
csv: output/matched_patients_id_hosp.csv # unique patient ID
check_unmatched_hosp:
run: r:latest -e 'rmarkdown::render("analysis/matching/check_unmatched_hosp.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [matching_hosp]
outputs:
moderately_sensitive:
html: output/check_unmatched_hosp.html
# extract
extract_variables_hosp: # confounders # ab exposure
run: cohortextractor:latest generate_cohort --study-definition study_definition_outcome_hosp --with-end-date-fix
needs: [matching_hosp]
outputs:
highly_sensitive:
cohort: output/input_outcome_hosp.csv
process_Rmatching_hosp: # confounders # ab exposure
run: r:latest analysis/process/process_Rmatching_hosp.R
needs: [extract_variables_hosp,matching_hosp]
outputs:
highly_sensitive:
cohort1: output/matched_outcome_hosp.rds
cohort2: output/matched_outcome_check_hosp.rds # filter died & de-regist again
rds1: output/abtype79_hosp.rds
rds2: output/comor17_hosp.rds
extract_variables_ab_time_hosp: # per ab exposure
run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_time_hosp --with-end-date-fix # unique matched patient ID
needs: [matching_hosp]
outputs:
highly_sensitive:
cohort: output/input_ab_time_hosp.csv
process_ab_time_hosp: # exposures #merge ab time with matched patients
run: r:latest -e 'rmarkdown::render("analysis/process/process_ab_time_hosp.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [extract_variables_ab_time_hosp,process_Rmatching_hosp]
outputs:
moderately_sensitive:
html: output/process_ab_time_hosp.html
highly_sensitive:
rds: output/matched_ab_hosp.rds
# RF
pre_RF_process_hosp: # split train and valid set # create category variables of ab exposure
run: r:latest -e 'rmarkdown::render("analysis/RF/pre_RF_process_hosp.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
needs: [process_ab_time_hosp,process_Rmatching_hosp]
outputs:
moderately_sensitive:
html: output/pre_RF_process_hosp.html
highly_sensitive:
rds1: output/train_hosp.rds
rds2: output/valid_hosp.rds
rds3: output/train_cat_hosp.rds
rds4: output/valid_cat_hosp.rds
##### general population
generate_study_population_general_population:
run: cohortextractor:latest generate_cohort --study-definition study_definition_general_population
outputs:
highly_sensitive:
cohort: output/input_general_population.csv
pre_matching_general: # filter non-cases # filter antibiotics #id #index date
run: r:latest -e 'rmarkdown::render("analysis/matching/pre_matching_general.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [generate_study_population_general_population]
outputs:
highly_sensitive:
csv1: output/general_id_2020-02-01.csv
csv2: output/general_id_2020-03-01.csv
csv3: output/general_id_2020-04-01.csv
csv4: output/general_id_2020-05-01.csv
csv5: output/general_id_2020-06-01.csv
csv6: output/general_id_2020-07-01.csv
csv7: output/general_id_2020-08-01.csv
csv8: output/general_id_2020-09-01.csv
csv9: output/general_id_2020-10-01.csv
csv10: output/general_id_2020-11-01.csv
csv11: output/general_id_2020-12-01.csv
csv12: output/general_id_2021-01-01.csv
csv13: output/general_id_2021-02-01.csv
csv14: output/general_id_2021-03-01.csv
csv15: output/general_id_2021-04-01.csv
csv16: output/general_id_2021-05-01.csv
csv17: output/general_id_2021-06-01.csv
csv18: output/general_id_2021-07-01.csv
csv19: output/general_id_2021-08-01.csv
csv20: output/general_id_2021-09-01.csv
csv21: output/general_id_2021-10-01.csv
csv22: output/general_id_2021-11-01.csv
csv23: output/general_id_2021-12-01.csv
csv24: output/general_id_2022-01-01.csv
csv25: output/general_id_2022-02-01.csv
csv26: output/general_id_2022-03-01.csv
csv27: output/general_id_2022-04-01.csv
csv28: output/general_id_2022-05-01.csv
csv29: output/general_id_2022-06-01.csv
csv30: output/general_id_2022-07-01.csv
csv31: output/general_id_2022-08-01.csv
csv32: output/general_id_2022-09-01.csv
csv33: output/general_id_2022-10-01.csv
csv34: output/general_id_2022-11-01.csv
csv35: output/general_id_2022-12-01.csv
moderately_sensitive:
html: output/pre_matching_general.html
matching_general: #R MatchIt matching with replacement
run: r:latest -e 'rmarkdown::render("analysis/matching/matching_general.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
needs: [pre_matching_general,pre_matching]
outputs:
moderately_sensitive:
html: output/matching_general.html
highly_sensitive:
rds1: output/matched_patients_general.rds
rds2: output/unmatched_cases_general.rds
csv: output/matched_patients_id_general.csv # unique patient ID
# extract_ab_general_20200201: #prior 3-yr ab of index date
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200201 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200201.csv
# extract_ab_general_20200301:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200301 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200301.csv
# extract_ab_general_20200401:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200401 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200401.csv
# extract_ab_general_20200501:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200501 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200501.csv
# extract_ab_general_20200601:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200601 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200601.csv
# extract_ab_general_20200701:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200701 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200701.csv
# extract_ab_general_20200801:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200801 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200801.csv
# extract_ab_general_20200901:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20200901 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20200901.csv
# extract_ab_general_202001001:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20201001 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20201001.csv
# extract_ab_general_202001101:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20201101 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20201101.csv
# extract_ab_general_202001201:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20201201 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20201201.csv
# extract_ab_general_20210101: #prior 3-yr ab of index date
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210101 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210101.csv
# extract_ab_general_20210201: #prior 3-yr ab of index date
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210201 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210201.csv
# extract_ab_general_20210301:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210301 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210301.csv
# extract_ab_general_20210401:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210401 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210401.csv
# extract_ab_general_20210501:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210501 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210501.csv
# extract_ab_general_20210601:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210601 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210601.csv
# extract_ab_general_20210701:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210701 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210701.csv
# extract_ab_general_20210801:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210801 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210801.csv
# extract_ab_general_20210901:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20210901 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20210901.csv
# extract_ab_general_202101001:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20211001 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20211001.csv
# extract_ab_general_202101101:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20211101 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20211101.csv
# extract_ab_general_202101201:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20211201 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20211201.csv
# extract_ab_general_20220201: #prior 3-yr ab of index date
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220201 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220201.csv
# extract_ab_general_20220301:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220301 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220301.csv
# extract_ab_general_20220401:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220401 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220401.csv
# extract_ab_general_20220501:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220501 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220501.csv
# extract_ab_general_20220601:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220601 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220601.csv
# extract_ab_general_20220701:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220701 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220701.csv
# extract_ab_general_20220801:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220801 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220801.csv
# extract_ab_general_20220901:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20220901 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20220901.csv
# extract_ab_general_202201001:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20221001 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20221001.csv
# extract_ab_general_202201101:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20221101 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20221101.csv
# extract_ab_general_202201201:
# run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_general_20221201 --with-end-date-fix
# needs: [pre_matching_general]
# outputs:
# highly_sensitive:
# cohort: output/input_ab_general_20221201.csv
# classification_tree_contd: #decision tree check # contd
# run: r:latest -e 'rmarkdown::render("analysis/classification_tree_contd.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_tree_contd.html
# # model_tuneRF: #mtry,
# # run: r:latest -e 'rmarkdown::render("analysis/model_tuneRF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [model_RF_process]
# # outputs:
# # moderately_sensitive:
# # html: output/model_tuneRF.html
# # model_RF_training: #
# # run: r:latest -e 'rmarkdown::render("analysis/model_RF_training.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [model_RF_process]
# # outputs:
# # moderately_sensitive:
# # html: output/model_RF_training.html
# model_RandomForest: # pick variables for model training # contd
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest.html
# # csv1: output/var_tree.csv
# rds: output/model_RandomForest.rds
# model_RandomForest_cat: # pick variables for model training # category #6:6
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_cat.html
# highly_sensitive:
# rds: output/model_RandomForest_cat.rds
# train: output/train_6_cat.rds
# valid: output/valid_6_cat.rds
# model_RandomForest_check_cat: # check performance
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_check_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_cat]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_check_cat.html
# model_RandomForest_cat_ind: # pick variables for model training # category #6:6 # individual variables
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_cat_ind.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_cat_ind.html
# model_RandomForest_decile_cat: # create decile groups for probabilities # get confounders #6:6
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_decile_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_clogit,model_RandomForest_cat]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_decile_cat.html
# rds1: output/development_cat.rds
# rds2: output/validation_cat.rds
# model_cat: # conditional logistic regression for decile groups
# run: r:latest -e 'rmarkdown::render("analysis/model_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile_cat]
# outputs:
# moderately_sensitive:
# html: output/model_cat.html
# RF_descriptive_stat_cat:
# run: r:latest -e 'rmarkdown::render("analysis/RF_descriptive_stat_cat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile_cat]
# outputs:
# moderately_sensitive:
# html: output/RF_descriptive_stat_cat.html
# # model_RF_clust: # use proximity
# # run: r:latest -e 'rmarkdown::render("analysis/model_RF_clust.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [model_RF_process]
# # outputs:
# # moderately_sensitive:
# # html: output/model_RF_clust.html
# # # csv1: output/var_tree.csv
# # # rds: output/model_RandomForest.rds
# model_RandomForest_check: # check performance
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_check.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,model_RandomForest]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_check.html
# model_RandomForest_tree: # check tree
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_tree.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,model_RandomForest]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_tree.html
# model_RandomForest_decile: # create decile groups for probabilities # get confounders
# run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest_decile.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,model_RandomForest,process_ab_time,process_Rmatching]
# outputs:
# moderately_sensitive:
# html: output/model_RandomForest_decile.html
# rds1: output/development.rds
# rds2: output/validation.rds
# RF_descriptive_stat:
# run: r:latest -e 'rmarkdown::render("analysis/RF_descriptive_stat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile]
# outputs:
# moderately_sensitive:
# html: output/RF_descriptive_stat.html
# model: # conditional logistic regression for decile groups
# run: r:latest -e 'rmarkdown::render("analysis/model.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile]
# outputs:
# moderately_sensitive:
# html: output/model.html
# model_clogit_adjusted: # conditional logistic regression for expo variables
# run: r:latest -e 'rmarkdown::render("analysis/model_clogit_adjusted.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile,model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/model_clogit_adjusted.html
# model_logistic: # logistic regression for expo variables
# run: r:latest -e 'rmarkdown::render("analysis/model_logistic.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RandomForest_decile,model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/model_logistic.html
# ## updated method
# RF_model: # pick variables for model training #distinct # ab users # merge
# run: r:latest -e 'rmarkdown::render("analysis/RF_model.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/RF_model.html
# rds1: output/RF_model.rds
# rds2: output/RF_model_decile.rds
# RF_model_develop: # pick variables for model training #distinct # ab users # development
# run: r:latest -e 'rmarkdown::render("analysis/RF_model_develop.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,RF_model]
# outputs:
# moderately_sensitive:
# html: output/RF_model_develop.html
# rds1: output/RF_model_develop.rds
# rds2: output/RF_model_decile_develop.rds
# RF_model_valid: # pick variables for model training #distinct # ab users # validation
# run: r:latest -e 'rmarkdown::render("analysis/RF_model_valid.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,RF_model_develop]
# outputs:
# moderately_sensitive:
# html: output/RF_model_valid.html
# rds1: output/RF_model_decile_valid.rds
# RF_classification_check:
# run: r:latest -e 'rmarkdown::render("analysis/RF_classification_check.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,RF_model_develop,RF_model_valid]
# outputs:
# moderately_sensitive:
# html: output/RF_classification_check.html
# descriptive_stat:
# run: r:latest -e 'rmarkdown::render("analysis/descriptive_stat.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,RF_model_develop,RF_model_valid]
# outputs:
# moderately_sensitive:
# html: output/descriptive_stat.html
# # # main analysis
# table1_round:
# run: r:latest analysis/table1.R
# needs: [pre_matching,process_Rmatching]
# outputs:
# moderately_sensitive:
# csv1: output/table1_unmatched.csv
# csv2: output/table1_matched.csv
# csv3: output/table1_random.csv
# table2_round:
# run: r:latest analysis/table2.R
# needs: [process_Rmatching]
# outputs:
# moderately_sensitive:
# csv1: output/table2_matched.csv
# csv3: output/table2_random.csv
# table3_round: # baseline table of exposure variables/ training &validation
# run: r:latest analysis/table3.R
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# csv1: output/table3_train.csv
# csv2: output/table3_valid.csv
# csv3: output/table3_all.csv
# # variables check
# check_variables: # check input
# run: r:latest -e 'rmarkdown::render("analysis/check_variables.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/check_variables.html
# ###### random 1 control #####
# classification_check: # RF # total_ab #1000trees used to compared with 6controls
# run: r:latest -e 'rmarkdown::render("analysis/classification_check.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_check.html
# rds1: output/train_1control.rds
# rds2: output/valid_1control.rds
# classification_check_1_control: # RF # all # 1 control
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_1_control.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,classification_check]
# outputs:
# moderately_sensitive:
# html: output/classification_check_1_control.html
# classification_check_6_control: # RF # total_ab #1:6
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_6_control.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_check_6_control.html
# classification_check_logi: #logistic #6 controls
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_logi.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_check_logi.html
# classification_check_logi_1_control: #logistic # single control # decile
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_logi_1_control.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_check_logi_1_control.html
# #decile check
# classification_check_logi_1_control_decile: #logistic # single control # total ab decile
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_logi_1_control_decile.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_check_logi_1_control_decile.html
# classification_check_1_control_decile: # RF # all # 1 control # total ab decile
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_1_control_decile.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,classification_check]
# outputs:
# moderately_sensitive:
# html: output/classification_check_1_control_decile.html
# # remove outlier
# classification_check_1_control_0.9: # RF # all # 1 control # remove90th outlier
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_1_control_0.9.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,classification_check]
# outputs:
# moderately_sensitive:
# html: output/classification_check_1_control_0.9.html
# classification_check_logi_1_control_0.9: #logistic # single control remove90th outlier
# run: r:latest -e 'rmarkdown::render("analysis/classification_check_logi_1_control_0.9.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/classification_check_logi_1_control_0.9.html
# # classification_check_logi_1_decile: #logistic # single control # decile group
# # run: r:latest -e 'rmarkdown::render("analysis/classification_check_logi_1_control_decile.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [model_RF_process]
# # outputs:
# # moderately_sensitive:
# # html: output/classification_check_logi_1_control_decile.html
# # distinct
# model_RF_distinct: # pick variables for model training # distinct patients
# run: r:latest -e 'rmarkdown::render("analysis/model_RF_distinct.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/model_RF_distinct.html
# rds: output/model_RF_distinct.rds
# model_RF_distinct_check:
# run: r:latest -e 'rmarkdown::render("analysis/model_RF_distinct_check.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,model_RF_distinct]
# outputs:
# moderately_sensitive:
# html: output/model_RF_distinct_check.html
# # random 1 control
# model_RF_random_1_control: # random pick one control in subclass
# run: r:latest -e 'rmarkdown::render("analysis/model_RF_random_1_control.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process]
# outputs:
# moderately_sensitive:
# html: output/model_RF_random_1_control.html
# rds: output/model_RF_random_1_control.rds
# model_RF_random_1_control_check:
# run: r:latest -e 'rmarkdown::render("analysis/model_RF_random_1_control_check.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# needs: [model_RF_process,model_RF_random_1_control]
# outputs:
# moderately_sensitive:
# html: output/model_RF_random_1_control_check.html
# # #######
# # model_tuneRF: #
# # run: r:latest -e 'rmarkdown::render("analysis/model_tuneRF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [model_RF_process]
# # outputs:
# # moderately_sensitive:
# # html: output/model_tuneRF.html
# # check_ab_time:
# # run: r:latest -e 'rmarkdown::render("analysis/check_ab_time.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
# # needs: [process_ab_time]
# # outputs:
# # moderately_sensitive:
# # html: output/check_ab_time.html
# # # highly_sensitive:
# # # rds: output/matched_patients_monthly_ab.rds
# # check_RF_grid:
# # run: r:latest -e 'rmarkdown::render("analysis/check_RF_grid.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [process_ab_time]
# # outputs:
# # moderately_sensitive:
# # html: output/check_RF_grid.html
# # check_RF:
# # run: r:latest -e 'rmarkdown::render("analysis/check_RF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [process_ab_time]
# # outputs:
# # moderately_sensitive:
# # html: output/check_RF.html
# # model_RF:
# # run: r:latest -e 'rmarkdown::render("analysis/model_RF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [process_ab_time]
# # outputs:
# # moderately_sensitive:
# # html: output/model_RF.html
# # model_RF_process_subclass: # random sampling by subclass
# # run: r:latest -e 'rmarkdown::render("analysis/model_RF_process_subclass.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [process_ab_time]
# # outputs:
# # moderately_sensitive:
# # html: output/model_RF_process_subclass.html
# # model_RF_process_check_sample: # check sample method
# # run: r:latest -e 'rmarkdown::render("analysis/model_RF_process_check_sample.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [process_ab_time, process_Rmatching]
# # outputs:
# # moderately_sensitive:
# # html: output/model_RF_process_check_sample.html
# # # check
# # process_filter_ab: # filter ab users
# # run: r:latest -e 'rmarkdown::render("analysis/process_filter_ab.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
# # needs: [process_Rmatching]
# # outputs:
# # moderately_sensitive:
# # html: output/process_filter_ab.html
# # highly_sensitive:
# # csv: output/matched_patients_id_ab.csv
# # extract_variables_ab_yr1:
# # run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr1 --with-end-date-fix
# # needs: [process_filter_ab]
# # outputs:
# # highly_sensitive:
# # cohort: output/input_ab_yr1.csv
# # extract_variables_ab_yr2:
# # run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr2 --with-end-date-fix
# # needs: [process_filter_ab]
# # outputs:
# # highly_sensitive:
# # cohort: output/input_ab_yr2.csv
# # extract_variables_ab_yr3:
# # run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr3 --with-end-date-fix
# # needs: [process_filter_ab]
# # outputs:
# # highly_sensitive:
# # cohort: output/input_ab_yr3.csv
# # extract_variables_ab_yr3_15d:
# # run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr3_15d --with-end-date-fix
# # needs: [process_filter_ab]
# # outputs:
# # highly_sensitive:
# # cohort: output/input_ab_yr3_15d.csv
# # process_merge_ab: # merge 1-2-3 year ab
# # run: r:latest -e 'rmarkdown::render("analysis/process_merge_ab.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
# # needs: [process_Rmatching,extract_variables_ab_yr3_15d, extract_variables_ab_yr3,extract_variables_ab_yr2,extract_variables_ab_yr1]
# # outputs:
# # moderately_sensitive:
# # html: output/process_merge_ab.html
# # highly_sensitive:
# # rds: output/matched_patients_monthly_ab.rds
# # check_ab_yr1:
# # run: r:latest -e 'rmarkdown::render("analysis/check_ab_yr1.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
# # needs: [extract_variables_ab_yr1,matching,process_Rmatching]
# # outputs:
# # moderately_sensitive:
# # html: output/check_ab_yr1.html
# # check_ab_yr3:
# # run: r:latest -e 'rmarkdown::render("analysis/check_ab_yr3.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
# # needs: [process_Rmatching]
# # outputs:
# # moderately_sensitive:
# # html: output/check_ab_yr3.html
# # check_abtype:
# # run: r:latest -e 'rmarkdown::render("analysis/check_abtype.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
# # needs: [process_Rmatching]
# # outputs:
# # moderately_sensitive:
# # html: output/check_abtype.html
# # check_process_1:
# # run: r:latest -e 'rmarkdown::render("analysis/check_process_1.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [generate_study_population_covid_primarycare,generate_study_population_covid_SGSS,generate_study_population_covid_admission]
# # outputs:
# # moderately_sensitive:
# # html: output/check_process_1.html
# # # check_RF:
# # # run: r:latest -e 'rmarkdown::render("analysis/check_RF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # # needs: [process_Rmatching]
# # # outputs:
# # # moderately_sensitive:
# # # html: output/check_RF.html
# # # check_RF_grid:
# # # run: r:latest -e 'rmarkdown::render("analysis/check_RF_grid.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # # needs: [process_Rmatching]
# # # outputs:
# # # moderately_sensitive:
# # # html: output/check_RF_grid.html
# # check_RF_yr1:
# # run: r:latest -e 'rmarkdown::render("analysis/check_RF_yr1.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
# # needs: [extract_variables_ab_yr1,matching,process_Rmatching]
# # outputs:
# # moderately_sensitive:
# # html: output/check_RF_yr1.html
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:09:06
These timestamps are generated and stored using the UTC timezone on the backend.
Job information
- Status: Failed (Job exited with an error)
- Backend: TPP
- Workspace: ab_covid_rf
- Requested by: Ya-Ting Yang
- Branch: ab_covid_rf
- Force run dependencies: No
- Git commit hash: d13f347
- Requested actions: RF_uni