Job request: 317

Organisation: The London School of Hygiene & Tropical Medicine
Workspace: carehomes
ID: bpwybaiidkst3gy3

This page shows the technical details of what happened when the authorised researcher Seb Bacon requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request that code be run against them
moderately_sensitive
- Can be viewed by an approved researcher by logging in to a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.
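
As a rough illustration of the cross-referencing described above, the sketch below looks up the security level of each output for the two actions in this job request. The `PIPELINE_OUTPUTS` dictionary and the `security_levels` helper are hypothetical names introduced here; the file names are transcribed by hand from the project.yaml on this page rather than parsed from it.

```python
# Outputs per action, copied by hand from the project.yaml shown on this page.
# Only the two actions requested in this job request are included.
PIPELINE_OUTPUTS = {
    "data_check": {
        "moderately_sensitive": [
            "data_checks.txt",
            "tpp_coverage_msoa.png",
            "tpp_coverage_carehomes.png",
        ],
    },
    "data_setup": {
        "moderately_sensitive": [
            "data_setup_log.txt",
            "community_prevalence.rds",
            "community_prevalence.csv",
        ],
        "highly_sensitive": [
            "analysisdata.rds",
            "input_clean.rds",
            "ch_linelist.rds",
            "ch_agg_long.rds",
        ],
    },
}


def security_levels(action):
    """Return a {output_file: security_level} mapping for one action."""
    levels = {}
    for level, files in PIPELINE_OUTPUTS[action].items():
        for name in files:
            levels[name] = level
    return levels


if __name__ == "__main__":
    for action in ("data_setup", "data_check"):
        for name, level in sorted(security_levels(action).items()):
            print(f"{action}: {name} -> {level}")
```

Read this way, everything written by data_check is moderately_sensitive, while data_setup writes to both levels.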

Jobs

Action: data_setup
Status:
Job identifier: ax5fks6cgogcnlwc-manually-set

Action: data_check
Status:
Job identifier: huvolyvgn5vcdvfi-manually-set

Pipeline

project.yaml

version: '3.0'

expectations:
  population_size: 1000000

actions:
  generate_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition 
    outputs:
      highly_sensitive:
        cohort: input.csv

  generate_cohort_coverage:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_coverage
    outputs:
      highly_sensitive:
        cohort: input_coverage.csv

  calc_coverage:
    run: r:latest analysis/calculate_tpp_coverage.R input_coverage.csv data/msoa_pop.csv
    needs: [generate_cohort_coverage]
    outputs:
      moderately_sensitive:
        log: coverage_log.txt
        rds: tpp_msoa_coverage.rds
        csv: tpp_msoa_coverage.csv
        figure: total_vs_tpp_pop.png
#        map: map_coverage_msoa.pdf

  data_check:
    run: r:latest analysis/data_check.R input.csv tpp_msoa_coverage.rds
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_checks.txt
        figure1: tpp_coverage_msoa.png
        figure2: tpp_coverage_carehomes.png

  data_setup:
    run: r:latest analysis/data_setup.R input.csv tpp_msoa_coverage.rds 90
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_setup_log.txt
        rds: community_prevalence.rds
        csv: community_prevalence.csv
      highly_sensitive:
        analysisdata: analysisdata.rds
        input_clean: input_clean.rds
        ch_linelist: ch_linelist.rds
        ch_agg_long: ch_agg_long.rds

  descriptive:
    needs: [data_setup]
    run: r:latest analysis/descriptive.R input_clean.rds ch_linelist.rds ch_agg_long.rds community_prevalence.rds
    outputs:
      moderately_sensitive:
        report: descriptive.pdf
        log: log_descriptive.txt
        data: ch_gp_permsoa.csv


  run_models_50_cutoff:
    needs: [data_setup]
    run: r:latest analysis/run_models.R analysisdata.rds community_prevalence.rds 50
    outputs:
      moderately_sensitive:
        coeffs: coeffs_bestmod_50.csv
        output: output_model_run_50.txt
        log: log_model_run_50.txt
      highly_sensitive:
        fit: fit_opt_50.rds
        data: testdata_50.rds

  validate_models_50_cutoff:
    needs: [run_models_50_cutoff]
    run: r:latest analysis/validate_models.R fit_opt_50.rds testdata_50.rds 50
    outputs:
      moderately_sensitive:
        report: test_pred_figs_50.pdf


  run_models_90_cutoff:
    needs: [data_setup]
    run: r:latest analysis/run_models.R analysisdata.rds community_prevalence.rds 90
    outputs:
      moderately_sensitive:
        coeffs: coeffs_bestmod_90.csv
        output: output_model_run_90.txt
        log: log_model_run_90.txt
      highly_sensitive:
        fit: fit_opt_90.rds
        data: testdata_90.rds

  validate_models_90_cutoff:
    needs: [run_models_90_cutoff]
    run: r:latest analysis/validate_models.R fit_opt_90.rds testdata_90.rds 90
    outputs:
      moderately_sensitive:
        report: test_pred_figs_90.pdf


  run_all:
    needs: [validate_models_90_cutoff, validate_models_50_cutoff, descriptive]
    # In order to be valid this action needs to define a run command and
    # some output. We don't really care what these are but the below seems to
    # do the trick.
    run: cohortextractor:latest --version
    outputs:
      moderately_sensitive:
        whatever: project.yaml

Timeline

Created: 29 Nov 2020 21:15:55 UTC
Started: 29 Nov 2020 21:16:01 UTC
Finished: 29 Nov 2020 21:48:57 UTC
Runtime: 00:49:41

These timestamps are generated and stored using the UTC timezone on the TPP backend.
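
For readers who want to work with these timestamps, the sketch below parses the three UTC stamps listed above and computes the gaps between them. The format string and the `parse_utc` helper are assumptions made for illustration, and `%b` expects English month abbreviations, so an English or C locale is assumed. The result is only the wall-clock gap between the stamps; the page does not say how its Runtime figure is calculated, so the two need not match.

```python
from datetime import datetime, timezone

# Assumed layout of the stamps on this page, e.g. "29 Nov 2020 21:15:55".
STAMP_FORMAT = "%d %b %Y %H:%M:%S"


def parse_utc(stamp):
    """Parse a stamp in the assumed layout and mark it as UTC."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


created = parse_utc("29 Nov 2020 21:15:55")
started = parse_utc("29 Nov 2020 21:16:01")
finished = parse_utc("29 Nov 2020 21:48:57")

queue_delay = started - created   # time between creation and start
wall_clock = finished - started   # time between start and finish

if __name__ == "__main__":
    print("Queue delay:", queue_delay)
    print("Started to finished:", wall_clock)
```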

Job request

Status: Failed
Backend: TPP
Workspace: carehomes
Requested by: Seb Bacon
Branch: master
Force run dependencies: No
Git commit hash: 53b64fb
Requested actions: data_check, data_setup
