Job request: 321

Organisation: The London School of Hygiene & Tropical Medicine
Workspace: carehomes
ID: vhggtbotwfppwve7

This page shows the technical details of what happened when authorised researcher George Hickman requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the Requested actions listed under Job information with the Pipeline section below, you can infer the security level each output was written to. Outputs marked as highly_sensitive can never be viewed directly by a researcher; a researcher can only request that code be run against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher who logs into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
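
The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the dictionary below is hand-copied from the outputs declared in the Pipeline section for the three requested actions, and the names PIPELINE_OUTPUTS, REQUESTED and outputs_by_level are hypothetical helpers, not part of any OpenSAFELY tool.

```python
# Output declarations hand-copied from the project.yaml in the Pipeline
# section, restricted to the three requested actions (illustrative only).
PIPELINE_OUTPUTS = {
    "descriptive": {
        "moderately_sensitive": [
            "descriptive.pdf", "log_descriptive.txt", "ch_gp_permsoa.csv",
        ],
    },
    "run_models_50_cutoff": {
        "moderately_sensitive": [
            "coeffs_bestmod_50.csv", "output_model_run_50.txt", "log_model_run_50.txt",
        ],
        "highly_sensitive": ["fit_opt_50.rds", "testdata_50.rds"],
    },
    "run_models_90_cutoff": {
        "moderately_sensitive": [
            "coeffs_bestmod_90.csv", "output_model_run_90.txt", "log_model_run_90.txt",
        ],
        "highly_sensitive": ["fit_opt_90.rds", "testdata_90.rds"],
    },
}

REQUESTED = ["descriptive", "run_models_50_cutoff", "run_models_90_cutoff"]


def outputs_by_level(requested, pipeline):
    """Group the output files of the requested actions by security level."""
    grouped = {}
    for action in requested:
        for level, files in pipeline[action].items():
            grouped.setdefault(level, []).extend(files)
    return grouped


grouped = outputs_by_level(REQUESTED, PIPELINE_OUTPUTS)
for level, files in grouped.items():
    print(f"{level}: {', '.join(files)}")
```

Under the rules stated above, only the files grouped under moderately_sensitive could be put forward for release; the four .rds files under highly_sensitive can only be used as inputs to further actions.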

Jobs

Action: generate_cohort_coverage
Status: Succeeded
Job identifier: 5bahjvkcbis4tc3j

Action: calc_coverage
Status: Succeeded
Job identifier: r567pac6gu2w2ww2

Action: generate_cohort
Status: Succeeded
Job identifier: zqzkgf7ztippemoq

Action: data_setup
Status: Succeeded
Job identifier: 3mehbx5eevl4bvgq

Action: descriptive
Status: Succeeded
Job identifier: y5wzjlwl5ncv635i

Action: run_models_50_cutoff
Status: Succeeded
Job identifier: bgcix7rfgojc2t7y

Action: run_models_90_cutoff
Status: Succeeded
Job identifier: yzth5vopt3kzniyw

Pipeline

project.yaml

version: '3.0'

expectations:
  population_size: 1000000

actions:
  generate_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition
    outputs:
      highly_sensitive:
        cohort: input.csv

  generate_cohort_coverage:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_coverage
    outputs:
      highly_sensitive:
        cohort: input_coverage.csv

  calc_coverage:
    run: r:latest analysis/calculate_tpp_coverage.R input_coverage.csv data/msoa_pop.csv
    needs: [generate_cohort_coverage]
    outputs:
      moderately_sensitive:
        log: coverage_log.txt
        rds: tpp_msoa_coverage.rds
        csv: tpp_msoa_coverage.csv
        figure: total_vs_tpp_pop.png
#        map: map_coverage_msoa.pdf

  data_check:
    run: r:latest analysis/data_check.R input.csv tpp_msoa_coverage.rds
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_checks.txt
        figure1: tpp_coverage_msoa.png
        figure2: tpp_coverage_carehomes.png

  data_setup:
    run: r:latest analysis/data_setup.R input.csv tpp_msoa_coverage.rds 90
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_setup_log.txt
        rds: community_prevalence.rds
        csv: community_prevalence.csv
      highly_sensitive:
        analysisdata: analysisdata.rds
        input_clean: input_clean.rds
        ch_linelist: ch_linelist.rds
        ch_agg_long: ch_agg_long.rds

  descriptive:
    needs: [data_setup]
    run: r:latest analysis/descriptive.R input_clean.rds ch_linelist.rds ch_agg_long.rds community_prevalence.rds
    outputs:
      moderately_sensitive:
        report: descriptive.pdf
        log: log_descriptive.txt
        data: ch_gp_permsoa.csv


  run_models_50_cutoff:
    needs: [data_setup]
    run: r:latest analysis/run_models.R analysisdata.rds community_prevalence.rds 50
    outputs:
      moderately_sensitive:
        coeffs: coeffs_bestmod_50.csv
        output: output_model_run_50.txt
        log: log_model_run_50.txt
      highly_sensitive:
        fit: fit_opt_50.rds
        data: testdata_50.rds

  validate_models_50_cutoff:
    needs: [run_models_50_cutoff]
    run: r:latest analysis/validate_models.R fit_opt_50.rds testdata_50.rds 50
    outputs:
      moderately_sensitive:
        report: test_pred_figs_50.pdf


  run_models_90_cutoff:
    needs: [data_setup]
    run: r:latest analysis/run_models.R analysisdata.rds community_prevalence.rds 90
    outputs:
      moderately_sensitive:
        coeffs: coeffs_bestmod_90.csv
        output: output_model_run_90.txt
        log: log_model_run_90.txt
      highly_sensitive:
        fit: fit_opt_90.rds
        data: testdata_90.rds

  validate_models_90_cutoff:
    needs: [run_models_90_cutoff]
    run: r:latest analysis/validate_models.R fit_opt_90.rds testdata_90.rds 90
    outputs:
      moderately_sensitive:
        report: test_pred_figs_90.pdf


  run_all:
    needs: [validate_models_90_cutoff, validate_models_50_cutoff, descriptive]
    # In order to be valid this action needs to define a run command and
    # some output. We don't really care what these are but the below seems to
    # do the trick.
    run: cohortextractor:latest --version
    outputs:
      moderately_sensitive:
        whatever: project.yaml
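
The needs entries above define a dependency graph between actions. The following Python sketch walks that graph for the three requested actions and lists them together with everything upstream, dependencies first. It is an illustration only, not the job runner's actual scheduling logic: the NEEDS dictionary is hand-copied from the project.yaml above, resolve is a hypothetical helper, and the sketch ignores whether a dependency's outputs already exist.

```python
# "needs" relationships hand-copied from the project.yaml above.
NEEDS = {
    "generate_cohort": [],
    "generate_cohort_coverage": [],
    "calc_coverage": ["generate_cohort_coverage"],
    "data_check": ["generate_cohort", "calc_coverage"],
    "data_setup": ["generate_cohort", "calc_coverage"],
    "descriptive": ["data_setup"],
    "run_models_50_cutoff": ["data_setup"],
    "validate_models_50_cutoff": ["run_models_50_cutoff"],
    "run_models_90_cutoff": ["data_setup"],
    "validate_models_90_cutoff": ["run_models_90_cutoff"],
    "run_all": [
        "validate_models_90_cutoff", "validate_models_50_cutoff", "descriptive",
    ],
}

REQUESTED = ["descriptive", "run_models_50_cutoff", "run_models_90_cutoff"]


def resolve(requested, needs):
    """Return the requested actions plus all upstream dependencies.

    Uses a depth-first walk so that every action appears after the
    actions it needs (a topological order of the reachable subgraph).
    """
    ordered = []
    seen = set()

    def visit(action):
        if action in seen:
            return
        seen.add(action)
        for dependency in needs[action]:
            visit(dependency)
        ordered.append(action)

    for action in requested:
        visit(action)
    return ordered


plan = resolve(REQUESTED, NEEDS)
print(plan)
```

The seven actions this produces are the same seven that appear in the Jobs list for this request, which is consistent with the dependencies having been scheduled alongside the three requested actions. data_check, the two validate_models actions and run_all are not reachable from the request and do not appear there.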

Timeline

Created: 30 Nov 2020 14:11:33 UTC
Started: 30 Nov 2020 14:15:22 UTC
Finished: 30 Nov 2020 16:12:04 UTC
Runtime: 00:01:18

These timestamps are generated and stored using the UTC timezone on the TPP backend.
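
As a small worked example, the intervals between the three timestamps can be computed in Python. This is a sketch under assumptions: parse_utc is a hypothetical helper, and the %b month abbreviation assumes an English locale. How the separate Runtime field on this page is derived is not stated here, so the sketch makes no attempt to reproduce it.

```python
from datetime import datetime, timezone

# Timestamp layout used in the Timeline section, e.g. "30 Nov 2020 14:11:33".
FMT = "%d %b %Y %H:%M:%S"


def parse_utc(text):
    """Parse a Timeline timestamp and mark it as UTC."""
    return datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)


created = parse_utc("30 Nov 2020 14:11:33")
started = parse_utc("30 Nov 2020 14:15:22")
finished = parse_utc("30 Nov 2020 16:12:04")

queue_delay = started - created   # time between creation and start
elapsed = finished - started      # time between start and finish

print(queue_delay)  # 0:03:49
print(elapsed)      # 1:56:42
```

The start-to-finish interval computed this way is wall-clock time for the whole job request and is not the same quantity as the Runtime value reported above.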

Job information

Status: Succeeded
Backend: TPP
Workspace: carehomes
Requested by: George Hickman
Branch: master
Force run dependencies: No
Git commit hash: bfb52f8
Requested actions: descriptive, run_models_50_cutoff, run_models_90_cutoff
