Job request: 8883
- Organisation: Bennett Institute
- Workspace: covid_mortality_over_time
- ID: atzen5yaxz5jwneb
This page shows the technical details of what happened when the authorised researcher Linda Nab requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer the security level each output was written to: every action in the pipeline declares its outputs under either highly_sensitive or moderately_sensitive.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs; they can only request that code be run against them.
- moderately_sensitive
  - These can be viewed by an approved researcher by logging into a highly secure environment.
  - These are the only outputs that can be requested for public release via a controlled output review service.
Jobs
This job request ran 25 jobs, listed by job identifier:
- 4k5yduxtimvj5mdo
- cdcfxszzzqvis3ts
- r3z4mh76t5zebgmc
- h3t35gs277vsfqph
- qjxmmk2dgi6gfjqc
- 2httd6j76xuo35p3
- cqblmraqlksnaexw
- fb7izfzkwumc5ugw
- pgy73ikvqytjdjdx
- vk6t27q3efxkyal7
- 4rnzhp5jpd4pr4kv
- e6p32y5ivkocomyk
- goe4kced6h5xozus
- 4bjvx6gazaht3szd
- 32pmsfs7fu54c3ox
- 5ou5t7rbrp6di4lz
- nqp3jpkfmjrwvsa7
- a6pjwb6nvhwt7nue
- l2fzg6wlg2emd4vg
- 4tmlf53o2aa7d5vf
- pccn7nrfaurqftvz
- q3alnsjzt3jwprod
- 5nqsdcvqdnqjejkd
- vsl6m4u7lot2pk74
- rw7qeev6gee6k6zv
Pipeline
project.yaml:
version: '3.0'

expectations:
  population_size: 1000

actions:

  # Extract data
  # When the --index-date-range argument is changed, ./analysis/config.json has to be changed to match
  generate_study_population:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition
        --skip-existing
        --output-format=csv.gz
        --index-date-range "2020-03-01 to 2022-02-01 by month"
    outputs:
      highly_sensitive:
        cohort: output/input_*.csv.gz
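  # Note: with --index-date-range, generate_cohort writes one output file per
  # monthly index date (e.g. output/input_2020-03-01.csv.gz), which is what
  # the input_*.csv.gz glob above matches.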
  # Extract ethnicity
  generate_study_population_ethnicity:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition_ethnicity
        --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_ethnicity.csv.gz

  # Join data
  join_cohorts:
    run: >
      cohort-joiner:v0.0.7
        --lhs output/input_202*.csv.gz
        --rhs output/input_ethnicity.csv.gz
        --output-dir=output/joined
    needs: [generate_study_population, generate_study_population_ethnicity]
    outputs:
      highly_sensitive:
        cohort: output/joined/input_202*.csv.gz
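  # cohort-joiner left-joins each --lhs file with the --rhs file and writes
  # the result, keeping the --lhs filename, into --output-dir.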
  # Calculate mortality rates (crude + subgroup specific)
  calculate_measures:
    run: >
      cohortextractor:latest generate_measures
        --study-definition study_definition
        --skip-existing
        --output-dir=output/joined
    needs: [join_cohorts]
    outputs:
      moderately_sensitive:
        measure: output/joined/measure_*_mortality_rate.csv
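  # generate_measures writes one measure_<id>.csv per Measure declared in the
  # study definition, so the glob above implies measure ids ending in
  # _mortality_rate. See the sketch after this pipeline for an illustration.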
  # Calculate mortality rates ckd_rrt subgroup
  calculate_measures_ckd_rrt:
    run: r:latest analysis/measures_calc_ckd_rrt.R
    needs: [join_cohorts]
    outputs:
      moderately_sensitive:
        measure: output/joined/measure_ckd_rrt_mortality_rate.csv

  # Redact rates
  redact_rates:
    run: r:latest analysis/utils/redact_rates.R
    needs: [calculate_measures, calculate_measures_ckd_rrt]
    outputs:
      moderately_sensitive:
        csvs: output/rates/redacted/*_redacted.csv

  # Standardise crude mortality rate
  standardise_crude_rates:
    run: r:latest analysis/crude_rates_standardise.R
    needs: [redact_rates]
    outputs:
      moderately_sensitive:
        csvs: output/rates/standardised/crude_*std.csv

  # Standardise subgroup specific mortality rates
  standardise_subgroup_rates:
    run: r:latest analysis/subgroups_rates_standardise.R
    needs: [redact_rates]
    outputs:
      moderately_sensitive:
        csvs: output/rates/standardised/*_std.csv

  # Process subgroup specific mortality rates
  process_subgroup_rates:
    run: r:latest analysis/utils/process_rates.R
    needs: [standardise_subgroup_rates]
    outputs:
      moderately_sensitive:
        csvs: output/rates/processed/*.csv

  # Calculate standardised rate ratios
  calculate_rate_ratios:
    run: r:latest analysis/subgroups_ratios.R
    needs: [standardise_subgroup_rates, process_subgroup_rates]
    outputs:
      moderately_sensitive:
        csvs: output/ratios/*.csv

  # Plot and save graphs depicting the crude rates
  visualise_crude_rates:
    run: r:latest analysis/crude_rates_visualise.R
    needs: [standardise_crude_rates]
    outputs:
      moderately_sensitive:
        pngs: output/figures/rates_crude/*.png

  # Plot and save graphs depicting the subgroup specific mortality rates
  visualise_subgroup_rates:
    run: r:latest analysis/subgroups_rates_visualise.R
    needs: [standardise_subgroup_rates, process_subgroup_rates]
    outputs:
      moderately_sensitive:
        pngs: output/figures/rates_subgroups/*.png

  # Plot and save graphs depicting the subgroup specific mortality ratios
  visualise_subgroup_ratios:
    run: r:latest analysis/subgroups_ratios_visualise.R
    needs: [calculate_rate_ratios]
    outputs:
      moderately_sensitive:
        pngs: output/figures/ratios_subgroups/*.png
  # SECOND PART OF STUDY
  generate_study_population_wave1:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition_wave1
        --skip-existing
        --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_wave1.csv.gz

  generate_study_population_wave2:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition_wave2
        --skip-existing
        --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_wave2.csv.gz

  generate_study_population_wave3:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition_wave3
        --skip-existing
        --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_wave3.csv.gz

  # Join data
  join_cohorts_waves:
    run: >
      cohort-joiner:v0.0.7
        --lhs output/input_wave*.csv.gz
        --rhs output/input_ethnicity.csv.gz
        --output-dir=output/joined
    needs: [generate_study_population_wave1, generate_study_population_wave2, generate_study_population_wave3, generate_study_population_ethnicity]
    outputs:
      highly_sensitive:
        cohort: output/joined/input_wave*.csv.gz
  # Process data
  process_data:
    run: r:latest analysis/data_process.R
    needs: [join_cohorts_waves]
    outputs:
      highly_sensitive:
        rds: output/processed/input_wave*.rds

  # Skim data
  skim_data_wave1:
    run: r:latest analysis/data_skim.R output/processed/input_wave1.rds output/data_properties
    needs: [process_data]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/input_wave1_skim.txt
        txt2: output/data_properties/input_wave1_coltypes.txt
        txt3: output/data_properties/input_wave1_tabulate.txt

  skim_data_wave2:
    run: r:latest analysis/data_skim.R output/processed/input_wave2.rds output/data_properties
    needs: [process_data]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/input_wave2_skim.txt
        txt2: output/data_properties/input_wave2_coltypes.txt
        txt3: output/data_properties/input_wave2_tabulate.txt

  skim_data_wave3:
    run: r:latest analysis/data_skim.R output/processed/input_wave3.rds output/data_properties
    needs: [process_data]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/input_wave3_skim.txt
        txt2: output/data_properties/input_wave3_coltypes.txt
        txt3: output/data_properties/input_wave3_tabulate.txt
  # Create table one
  create_table_one:
    run: r:latest analysis/table_one.R
    needs: [process_data]
    outputs:
      moderately_sensitive:
        html: output/tables/table1.html

  # Kaplan-Meier
  create_kaplan_meier:
    run: r:latest analysis/waves_kaplan_meier.R
    needs: [process_data]
    outputs:
      moderately_sensitive:
        pngs: output/figures/kaplan_meier/wave*_*.png

  # Cox PH models
  model_cox_ph:
    run: r:latest analysis/waves_model_survival.R
    needs: [process_data]
    outputs:
      moderately_sensitive:
        csvs1: output/tables/wave*_effect_estimates.csv
        csvs2: output/tables/wave*_ph_tests.csv
        csvs3: output/tables/wave*_log_file.csv
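  # Note: per the outputs above, waves_model_survival.R writes effect
  # estimates, proportional-hazards test results, and a log file per wave.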
  # Create table two
  create_table_two:
    run: r:latest analysis/table_two.R
    needs: [model_cox_ph]
    outputs:
      moderately_sensitive:
        html: output/tables/table2.html
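The study definitions named above (study_definition, study_definition_ethnicity, study_definition_wave1 to wave3) live in the repository, conventionally under analysis/, and are not reproduced on this page. As a rough illustration of how generate_cohort and generate_measures fit together, here is a minimal, hypothetical cohortextractor study definition: the variables (died, sex) and measure ids are placeholders rather than the study's actual definitions, but the Measure ids are what determine the measure_<id>.csv filenames matched by the calculate_measures glob.

from cohortextractor import Measure, StudyDefinition, patients

# Hypothetical sketch -- not the repository's actual study_definition.py.
study = StudyDefinition(
    # Expectations control the dummy data used outside the secure backend;
    # the pipeline's expectations block sets population_size to 1000.
    default_expectations={
        "date": {"earliest": "2020-03-01", "latest": "2022-02-01"},
        "rate": "uniform",
        "incidence": 0.5,
    },
    # generate_cohort re-evaluates this definition once per date in
    # --index-date-range, writing one input_<date>.csv.gz each time.
    index_date="2020-03-01",
    population=patients.registered_as_of("index_date"),
    # Placeholder outcome: death from any cause within the index month.
    died=patients.died_from_any_cause(
        between=["index_date", "last_day_of_month(index_date)"],
        returning="binary_flag",
        return_expectations={"incidence": 0.01},
    ),
    # Placeholder subgroup variable.
    sex=patients.sex(
        return_expectations={
            "rate": "universal",
            "category": {"ratios": {"M": 0.49, "F": 0.51}},
        }
    ),
)

# generate_measures writes measure_<id>.csv for each entry here, one row per
# index date (and per group level, when group_by is set), containing the
# numerator, the denominator, and their ratio.
measures = [
    Measure(id="crude_mortality_rate", numerator="died", denominator="population"),
    Measure(id="sex_mortality_rate", numerator="died", denominator="population",
            group_by="sex"),
]

Under this sketch, the calculate_measures action would emit measure_crude_mortality_rate.csv and measure_sex_mortality_rate.csv into output/joined, both matched by the measure_*_mortality_rate.csv glob.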
Timeline
- Created:
- Started:
- Finished:
- Runtime: 118:05:08
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job request
- Status: Succeeded
- Backend: TPP
- Workspace: covid_mortality_over_time
- Requested by: Linda Nab
- Branch: main
- Force run dependencies: Yes
- Git commit hash: 4a51b47
- Requested actions: run_all