Job request: 14690
- Organisation: University of Manchester
- Workspace: cc_rf
- ID: jjcljx66btc3thbm
This page shows the technical details of what happened when the authorised researcher Ya-Ting Yang requested one or more actions to be run against real patient data in the project, within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level various outputs were written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
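The cross-referencing described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the job server or of the project: the helper names `outputs_by_level` and `release_candidates` are hypothetical, and the `ACTIONS` dictionary is a hand-transcribed excerpt of two actions from the pipeline rather than a parsed copy of the file.

```python
# Hypothetical helpers: group the outputs declared by each action by the
# security level they are written to.

# Hand-transcribed excerpt of two actions from the pipeline (assumption:
# the real file would be parsed with a YAML library instead).
ACTIONS = {
    "matching": {
        "outputs": {
            "moderately_sensitive": {"html": "output/matching.html"},
            "highly_sensitive": {
                "rds1": "output/matched_patients.rds",
                "rds2": "output/unmatched_cases.rds",
                "csv": "output/matched_patients_id.csv",
            },
        }
    },
    "check_unmatched": {
        "outputs": {
            "moderately_sensitive": {"html": "output/check_unmatched.html"},
        }
    },
}


def outputs_by_level(actions):
    """Map each security level to a sorted list of (action, path) pairs."""
    levels = {}
    for action, spec in actions.items():
        for level, files in spec.get("outputs", {}).items():
            for path in files.values():
                levels.setdefault(level, []).append((action, path))
    return {level: sorted(pairs) for level, pairs in levels.items()}


def release_candidates(actions):
    """Paths that could be requested for release (moderately_sensitive only)."""
    pairs = outputs_by_level(actions).get("moderately_sensitive", [])
    return [path for _, path in pairs]


if __name__ == "__main__":
    for level, pairs in outputs_by_level(ACTIONS).items():
        print(level)
        for action, path in pairs:
            print(f"  {action}: {path}")
```

Grouping by level first keeps the rule from the text explicit: anything under highly_sensitive never appears in the list returned by `release_candidates`.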
Jobs
- Job identifier: jzg5ouu4bguloqrt
Pipeline
project.yaml:
version: '3.0'

expectations:
  population_size: 1000

actions:
  # study cohort
  generate_study_population_covid_primarycare:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_primarycare
    outputs:
      highly_sensitive:
        cohort: output/input_covid_primarycare.csv
  generate_study_population_covid_SGSS:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_SGSS
    outputs:
      highly_sensitive:
        cohort: output/input_covid_SGSS.csv
  generate_study_population_covid_admission:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_admission
    outputs:
      highly_sensitive:
        cohort: output/input_covid_admission.csv
  process_1:
    run: r:latest analysis/process_1.R
    needs: [generate_study_population_covid_primarycare, generate_study_population_covid_SGSS,generate_study_population_covid_admission]
    outputs:
      highly_sensitive:
        case: output/case_covid_hosp.csv
        control: output/control_covid_infection.csv
  # matching
  matching: #R MatchIt matching with replacement
    run: r:latest -e 'rmarkdown::render("analysis/matching.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [process_1]
    outputs:
      moderately_sensitive:
        html: output/matching.html
      highly_sensitive:
        rds1: output/matched_patients.rds
        rds2: output/unmatched_cases.rds
        csv: output/matched_patients_id.csv
  check_unmatched:
    run: r:latest -e 'rmarkdown::render("analysis/check_unmatched.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [matching]
    outputs:
      moderately_sensitive:
        html: output/check_unmatched.html
  extract_variables: # confounders
    run: cohortextractor:latest generate_cohort --study-definition study_definition_outcome --with-end-date-fix
    needs: [matching]
    outputs:
      highly_sensitive:
        cohort: output/input_outcome.csv
  process_Rmatching: # confounders
    run: r:latest analysis/process_Rmatching.R
    needs: [extract_variables,matching]
    outputs:
      highly_sensitive:
        cohort1: output/matched_outcome.rds
        cohort2: output/matched_outcome_check.rds # filter died $ de-regist again
        rds1: output/abtype79.rds
        rds2: output/comor17.rds
  # extract ab for RF
  extract_variables_ab_time: # exposure variables
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_time --with-end-date-fix
    needs: [matching]
    outputs:
      highly_sensitive:
        cohort: output/input_ab_time.csv
  process_ab_time: # exposures
    run: r:latest -e 'rmarkdown::render("analysis/process_ab_time.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [extract_variables_ab_time,process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/process_ab_time.html
      highly_sensitive:
        rds: output/matched_ab.rds
  model_RF_process: # distinct patient, check variables
    run: r:latest -e 'rmarkdown::render("analysis/model_RF_process.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [process_ab_time,process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/model_RF_process.html
      highly_sensitive:
        rds1: output/train_X.rds
        rds2: output/train_Y.rds
        rds3: output/valid_X.rds
        rds4: output/valid_Y.rds
  model_RF_training: #
    run: r:latest -e 'rmarkdown::render("analysis/model_RF_training.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [model_RF_process]
    outputs:
      moderately_sensitive:
        html: output/model_RF_training.html
  model_RandomForest: #
    run: r:latest -e 'rmarkdown::render("analysis/model_RandomForest.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [model_RF_process]
    outputs:
      moderately_sensitive:
        html: output/model_RandomForest.html
  check_ab_time:
    run: r:latest -e 'rmarkdown::render("analysis/check_ab_time.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [process_ab_time]
    outputs:
      moderately_sensitive:
        html: output/check_ab_time.html
      # highly_sensitive:
      #   rds: output/matched_patients_monthly_ab.rds
  check_RF_grid:
    run: r:latest -e 'rmarkdown::render("analysis/check_RF_grid.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [process_ab_time]
    outputs:
      moderately_sensitive:
        html: output/check_RF_grid.html
  check_RF:
    run: r:latest -e 'rmarkdown::render("analysis/check_RF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [process_ab_time]
    outputs:
      moderately_sensitive:
        html: output/check_RF.html
  model_RF:
    run: r:latest -e 'rmarkdown::render("analysis/model_RF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [process_ab_time]
    outputs:
      moderately_sensitive:
        html: output/model_RF.html
  model_RF_process_subclass: # random sampling by subclass
    run: r:latest -e 'rmarkdown::render("analysis/model_RF_process_subclass.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [process_ab_time]
    outputs:
      moderately_sensitive:
        html: output/model_RF_process_subclass.html
  model_RF_process_check_sample: # check sample method
    run: r:latest -e 'rmarkdown::render("analysis/model_RF_process_check_sample.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [process_ab_time, process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/model_RF_process_check_sample.html
  # check
  process_filter_ab: # filter ab users
    run: r:latest -e 'rmarkdown::render("analysis/process_filter_ab.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/process_filter_ab.html
      highly_sensitive:
        csv: output/matched_patients_id_ab.csv
  extract_variables_ab_yr1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr1 --with-end-date-fix
    needs: [process_filter_ab]
    outputs:
      highly_sensitive:
        cohort: output/input_ab_yr1.csv
  extract_variables_ab_yr2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr2 --with-end-date-fix
    needs: [process_filter_ab]
    outputs:
      highly_sensitive:
        cohort: output/input_ab_yr2.csv
  extract_variables_ab_yr3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr3 --with-end-date-fix
    needs: [process_filter_ab]
    outputs:
      highly_sensitive:
        cohort: output/input_ab_yr3.csv
  extract_variables_ab_yr3_15d:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ab_yr3_15d --with-end-date-fix
    needs: [process_filter_ab]
    outputs:
      highly_sensitive:
        cohort: output/input_ab_yr3_15d.csv
  process_merge_ab: # merge 1-2-3 year ab
    run: r:latest -e 'rmarkdown::render("analysis/process_merge_ab.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [process_Rmatching,extract_variables_ab_yr3_15d, extract_variables_ab_yr3,extract_variables_ab_yr2,extract_variables_ab_yr1]
    outputs:
      moderately_sensitive:
        html: output/process_merge_ab.html
      highly_sensitive:
        rds: output/matched_patients_monthly_ab.rds
  check_ab_yr1:
    run: r:latest -e 'rmarkdown::render("analysis/check_ab_yr1.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [extract_variables_ab_yr1,matching,process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/check_ab_yr1.html
  check_ab_yr3:
    run: r:latest -e 'rmarkdown::render("analysis/check_ab_yr3.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/check_ab_yr3.html
  check_abtype:
    run: r:latest -e 'rmarkdown::render("analysis/check_abtype.Rmd", knit_root_dir = "/workspace", output_dir="/workspace/output")'
    needs: [process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/check_abtype.html
  check_process_1:
    run: r:latest -e 'rmarkdown::render("analysis/check_process_1.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [generate_study_population_covid_primarycare,generate_study_population_covid_SGSS,generate_study_population_covid_admission]
    outputs:
      moderately_sensitive:
        html: output/check_process_1.html
  # check_RF:
  #   run: r:latest -e 'rmarkdown::render("analysis/check_RF.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
  #   needs: [process_Rmatching]
  #   outputs:
  #     moderately_sensitive:
  #       html: output/check_RF.html
  # check_RF_grid:
  #   run: r:latest -e 'rmarkdown::render("analysis/check_RF_grid.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
  #   needs: [process_Rmatching]
  #   outputs:
  #     moderately_sensitive:
  #       html: output/check_RF_grid.html
  check_RF_yr1:
    run: r:latest -e 'rmarkdown::render("analysis/check_RF_yr1.Rmd", knit_root_dir = "/workspace", output_dir = "output")'
    needs: [extract_variables_ab_yr1,matching,process_Rmatching]
    outputs:
      moderately_sensitive:
        html: output/check_RF_yr1.html
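The `needs` entries in the pipeline above form a dependency graph: an action can only run once every action it needs has finished. The following Python sketch shows one way to read that graph. It is an assumption-laden illustration, not the scheduler the backend actually uses: `NEEDS` is a hand-transcribed subset of the actions leading to `model_RF_process`, the helper names `upstream` and `run_order` are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hand-transcribed subset of the pipeline: action -> actions listed in `needs`.
NEEDS = {
    "generate_study_population_covid_primarycare": [],
    "generate_study_population_covid_SGSS": [],
    "generate_study_population_covid_admission": [],
    "process_1": [
        "generate_study_population_covid_primarycare",
        "generate_study_population_covid_SGSS",
        "generate_study_population_covid_admission",
    ],
    "matching": ["process_1"],
    "extract_variables": ["matching"],
    "process_Rmatching": ["extract_variables", "matching"],
    "extract_variables_ab_time": ["matching"],
    "process_ab_time": ["extract_variables_ab_time", "process_Rmatching"],
    "model_RF_process": ["process_ab_time", "process_Rmatching"],
}


def upstream(action, needs):
    """Return every action that must finish before `action` can run."""
    seen = set()
    stack = list(needs[action])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(needs[dep])
    return seen


def run_order(action, needs):
    """Return one valid execution order ending with `action`."""
    wanted = upstream(action, needs) | {action}
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: needs[name] for name in wanted}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step in run_order("model_RF_process", NEEDS):
        print(step)
```

Several orders can be valid for the same graph (the three cohort extractions, for example, do not depend on each other), so the sketch returns one of them rather than a unique answer.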
Timeline
- Created:
- Started:
- Finished:
- Runtime: 07:47:52
These timestamps are generated and stored using the UTC timezone on the TPP backend.