Job request: 619
- Organisation: The London School of Hygiene & Tropical Medicine
- Workspace: carehomes
- ID: pa5sqfwrzg4u3xja
This page shows the technical details of what happened when the authorised researcher Emily Nightingale requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each job's outputs were written to.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs
  - Researchers can only request that code be run against them
- moderately_sensitive
  - Can be viewed by an approved researcher by logging into a highly secure environment
  - These are the only outputs that can be requested for public release via a controlled output review service.
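The grouping of outputs by security level can be illustrated with a short Python sketch. It is not part of the job server or the study code: `PIPELINE_OUTPUTS` is a hand-copied excerpt of two actions from the project.yaml in the Pipeline section, and `files_by_level` is a hypothetical helper name chosen for this example.

```python
# Hand-copied excerpt of the outputs declared in project.yaml.
# Only two actions are included; nothing is parsed from the real file.
PIPELINE_OUTPUTS = {
    "generate_cohort": {
        "highly_sensitive": {"cohort": "input.csv"},
    },
    "data_setup": {
        "moderately_sensitive": {
            "log": "data_setup_log.txt",
            "rds": "community_prevalence.rds",
            "csv": "community_prevalence.csv",
        },
        "highly_sensitive": {
            "analysisdata": "analysisdata.rds",
            "input_clean": "input_clean.rds",
            "ch_linelist": "ch_linelist.rds",
            "ch_agg_long": "ch_agg_long.rds",
        },
    },
}


def files_by_level(outputs):
    """Return {security_level: sorted list of (action, filename)} (hypothetical helper)."""
    by_level = {}
    for action, levels in outputs.items():
        for level, files in levels.items():
            for filename in files.values():
                by_level.setdefault(level, []).append((action, filename))
    return {level: sorted(pairs) for level, pairs in by_level.items()}


levels = files_by_level(PIPELINE_OUTPUTS)
for level, pairs in sorted(levels.items()):
    print(level)
    for action, filename in pairs:
        print(f"  {action}: {filename}")
```

Note that the job identifiers listed on this page do not name their actions, so a lookup like this only helps once you know which action a given job ran.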
Jobs
- Job identifier: n7g36hyjcou66k7c
- Job identifier: sjxsgzj7h3p4kqf7
- Job identifier: kcgmsnd6y7ik3i4b
- Job identifier: 75rxmpencukpez7p
Pipeline
project.yaml:
version: '3.0'

expectations:
  population_size: 1000000

actions:

  generate_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition
    outputs:
      highly_sensitive:
        cohort: input.csv

  generate_cohort_coverage:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_coverage
    outputs:
      highly_sensitive:
        cohort: input_coverage.csv

  calc_coverage:
    run: r:latest analysis/calculate_tpp_coverage.R input_coverage.csv data/SAPE22DT15_mid_2019_msoa.csv
    needs: [generate_cohort_coverage]
    outputs:
      moderately_sensitive:
        log: coverage_log.txt
        rds: tpp_msoa_coverage.rds
        csv: tpp_msoa_coverage.csv
        csv2: msoas_in_tpp.csv
        csv3: msoa_gt_100_cov.csv
        figure: total_vs_tpp_pop.png
        # map: map_coverage_msoa.pdf

  data_check:
    run: r:latest analysis/data_check.R input.csv tpp_msoa_coverage.rds data/msoa_shp.rds
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_checks.txt
        figure1: tpp_coverage_msoa.png
        figure2: tpp_coverage_carehomes.png
        figure3: tpp_coverage_map.pdf
        figure4: age_dist.png
        figure5: infection_death_delays.png
        figure6: hh_size_dist.png

  data_setup:
    # last numeric argument relates to cut off for carehome TPP coverage >= X%
    run: r:latest analysis/data_setup.R input.csv tpp_msoa_coverage.rds 95
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_setup_log.txt
        rds: community_prevalence.rds
        csv: community_prevalence.csv
      highly_sensitive:
        analysisdata: analysisdata.rds
        input_clean: input_clean.rds
        ch_linelist: ch_linelist.rds
        ch_agg_long: ch_agg_long.rds

  descriptive:
    needs: [data_setup]
    run: r:latest analysis/descriptive.R input_clean.rds ch_linelist.rds ch_agg_long.rds community_prevalence.rds
    outputs:
      moderately_sensitive:
        report: descriptive.pdf
        log: log_descriptive.txt
        data: ch_gp_permsoa.csv

  run_models:
    needs: [data_setup]
    run: r:latest analysis/run_models.R analysisdata.rds community_prevalence.rds 80
    outputs:
      moderately_sensitive:
        coeffs: coeffs_bestmod_80.csv
        output: output_model_run_80.txt
        log: log_model_run_80.txt
      highly_sensitive:
        fit: fit_opt_80.rds
        data: testdata_80.rds

  validate_models:
    needs: [run_models]
    run: r:latest analysis/validate_models.R fit_opt_80.rds testdata_80.rds 80
    outputs:
      moderately_sensitive:
        report: test_pred_figs_80.pdf

  run_all:
    needs: [validate_models, descriptive]
    # In order to be valid this action needs to define a run command and
    # some output. We don't really care what these are but the below seems to
    # do the trick.
    run: cohortextractor:latest --version
    outputs:
      moderately_sensitive:
        whatever: project.yaml
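The `needs:` keys above form a dependency graph between actions. As a minimal sketch of how that graph could be resolved, assuming each action simply waits for everything it needs, the following Python uses the standard library's `graphlib`. The `NEEDS` mapping is hand-copied from the pipeline and `upstream` is a hypothetical helper; this is not how the OpenSAFELY job runner itself schedules work.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Action -> actions it depends on, hand-copied from the `needs:` keys.
NEEDS = {
    "generate_cohort": [],
    "generate_cohort_coverage": [],
    "calc_coverage": ["generate_cohort_coverage"],
    "data_check": ["generate_cohort", "calc_coverage"],
    "data_setup": ["generate_cohort", "calc_coverage"],
    "descriptive": ["data_setup"],
    "run_models": ["data_setup"],
    "validate_models": ["run_models"],
    "run_all": ["validate_models", "descriptive"],
}


def upstream(action, needs):
    """Return every action that must finish before `action`, transitively."""
    seen = set()
    stack = list(needs[action])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(needs[current])
    return seen


# One valid execution order: every action appears after all of its dependencies.
order = list(TopologicalSorter(NEEDS).static_order())
print(order)

# Everything that requesting run_all would pull in under this reading of `needs`.
print(sorted(upstream("run_all", NEEDS)))
```

Under this reading, `run_all` depends on seven other actions but not on `data_check`, which nothing else lists in its `needs`.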
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:00:33
These timestamps are generated and stored using the UTC timezone on the TPP backend.