Job request: 619

Organisation:
The London School of Hygiene & Tropical Medicine
Workspace:
carehomes
ID:
pa5sqfwrzg4u3xja

This page shows the technical details of what happened when the authorised researcher Emily Nightingale requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
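The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the `PIPELINE_OUTPUTS` dictionary is copied by hand from the project.yaml shown on this page for the four actions in this job request, and the helper names `outputs_by_level` and `releasable` are hypothetical, not part of any OpenSAFELY tool.

```python
# Outputs copied by hand from project.yaml for the four actions in this
# job request. The dictionary layout is an assumption made for this sketch.
PIPELINE_OUTPUTS = {
    "calc_coverage": {
        "moderately_sensitive": [
            "coverage_log.txt", "tpp_msoa_coverage.rds", "tpp_msoa_coverage.csv",
            "msoas_in_tpp.csv", "msoa_gt_100_cov.csv", "total_vs_tpp_pop.png",
        ],
    },
    "data_check": {
        "moderately_sensitive": [
            "data_checks.txt", "tpp_coverage_msoa.png", "tpp_coverage_carehomes.png",
            "tpp_coverage_map.pdf", "age_dist.png", "infection_death_delays.png",
            "hh_size_dist.png",
        ],
    },
    "data_setup": {
        "moderately_sensitive": [
            "data_setup_log.txt", "community_prevalence.rds",
            "community_prevalence.csv",
        ],
        "highly_sensitive": [
            "analysisdata.rds", "input_clean.rds", "ch_linelist.rds",
            "ch_agg_long.rds",
        ],
    },
    "descriptive": {
        "moderately_sensitive": [
            "descriptive.pdf", "log_descriptive.txt", "ch_gp_permsoa.csv",
        ],
    },
}


def outputs_by_level(actions, pipeline):
    """Group the output files of the given actions by security level."""
    levels = {}
    for action in actions:
        for level, files in pipeline.get(action, {}).items():
            levels.setdefault(level, []).extend(files)
    return levels


def releasable(levels):
    """Only moderately_sensitive outputs can be requested for release."""
    return sorted(levels.get("moderately_sensitive", []))


jobs = ["calc_coverage", "data_setup", "descriptive", "data_check"]
levels = outputs_by_level(jobs, PIPELINE_OUTPUTS)
for level, files in sorted(levels.items()):
    print(level, len(files))
```

Under these assumptions, the four outputs of data_setup listed under highly_sensitive never appear in `releasable(levels)`, which matches the rule that researchers can only request that code runs against them.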

Jobs

  • Action:
    calc_coverage
    Status:
    Succeeded
    Job identifier:
    n7g36hyjcou66k7c
  • Action:
    data_setup
    Status:
    Succeeded
    Job identifier:
    sjxsgzj7h3p4kqf7
  • Action:
    descriptive
    Status:
    Succeeded
    Job identifier:
    kcgmsnd6y7ik3i4b
  • Action:
    data_check
    Status:
    Succeeded
    Job identifier:
    75rxmpencukpez7p

Pipeline

project.yaml
version: '3.0'

expectations:
  population_size: 1000000

actions:
  generate_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition
    outputs:
      highly_sensitive:
        cohort: input.csv

  generate_cohort_coverage:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_coverage
    outputs:
      highly_sensitive:
        cohort: input_coverage.csv

  calc_coverage:
    run: r:latest analysis/calculate_tpp_coverage.R input_coverage.csv data/SAPE22DT15_mid_2019_msoa.csv
    needs: [generate_cohort_coverage]
    outputs:
      moderately_sensitive:
        log: coverage_log.txt
        rds: tpp_msoa_coverage.rds
        csv: tpp_msoa_coverage.csv
        csv2: msoas_in_tpp.csv
        csv3: msoa_gt_100_cov.csv
        figure: total_vs_tpp_pop.png
#        map: map_coverage_msoa.pdf

  data_check:
    run: r:latest analysis/data_check.R input.csv tpp_msoa_coverage.rds data/msoa_shp.rds
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_checks.txt
        figure1: tpp_coverage_msoa.png
        figure2: tpp_coverage_carehomes.png
        figure3: tpp_coverage_map.pdf
        figure4: age_dist.png
        figure5: infection_death_delays.png
        figure6: hh_size_dist.png

  data_setup:
    # last numeric argument is the cut-off for care home TPP coverage (>= X%)
    run: r:latest analysis/data_setup.R input.csv tpp_msoa_coverage.rds 95
    needs: [generate_cohort, calc_coverage]
    outputs:
      moderately_sensitive:
        log: data_setup_log.txt
        rds: community_prevalence.rds
        csv: community_prevalence.csv
      highly_sensitive:
        analysisdata: analysisdata.rds
        input_clean: input_clean.rds
        ch_linelist: ch_linelist.rds
        ch_agg_long: ch_agg_long.rds

  descriptive:
    needs: [data_setup]
    run: r:latest analysis/descriptive.R input_clean.rds ch_linelist.rds ch_agg_long.rds community_prevalence.rds
    outputs:
      moderately_sensitive:
        report: descriptive.pdf
        log: log_descriptive.txt
        data: ch_gp_permsoa.csv


  run_models:
    needs: [data_setup]
    run: r:latest analysis/run_models.R analysisdata.rds community_prevalence.rds 80
    outputs:
      moderately_sensitive:
        coeffs: coeffs_bestmod_80.csv
        output: output_model_run_80.txt
        log: log_model_run_80.txt
      highly_sensitive:
        fit: fit_opt_80.rds
        data: testdata_80.rds

  validate_models:
    needs: [run_models]
    run: r:latest analysis/validate_models.R fit_opt_80.rds testdata_80.rds 80
    outputs:
      moderately_sensitive:
        report: test_pred_figs_80.pdf


  run_all:
    needs: [validate_models, descriptive]
    # In order to be valid this action needs to define a run command and
    # some output. We don't really care what these are but the below seems to
    # do the trick.
    run: cohortextractor:latest --version
    outputs:
      moderately_sensitive:
        whatever: project.yaml
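The `needs` entries in the pipeline above define a dependency graph between actions. As a rough sketch of how an execution order could be derived for the four requested actions, the Python below copies the `needs` lists by hand into a dictionary and uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The function names are hypothetical, and this is not a description of how the OpenSAFELY job runner is implemented. In particular, "Force run dependencies" is set to "No" for this request and only four jobs are listed, which suggests that existing upstream outputs were reused rather than regenerated.

```python
from graphlib import TopologicalSorter

# `needs` lists copied by hand from project.yaml; actions without a
# `needs` key are given an empty list.
NEEDS = {
    "generate_cohort": [],
    "generate_cohort_coverage": [],
    "calc_coverage": ["generate_cohort_coverage"],
    "data_check": ["generate_cohort", "calc_coverage"],
    "data_setup": ["generate_cohort", "calc_coverage"],
    "descriptive": ["data_setup"],
    "run_models": ["data_setup"],
    "validate_models": ["run_models"],
    "run_all": ["validate_models", "descriptive"],
}


def required_actions(requested, needs):
    """Return the requested actions plus everything they depend on."""
    seen = set()
    stack = list(requested)
    while stack:
        action = stack.pop()
        if action not in seen:
            seen.add(action)
            stack.extend(needs[action])
    return seen


def execution_order(requested, needs):
    """Order the required actions so each runs after its dependencies."""
    closure = required_actions(requested, needs)
    graph = {action: needs[action] for action in closure}
    return list(TopologicalSorter(graph).static_order())


requested = ["calc_coverage", "data_check", "data_setup", "descriptive"]
print(execution_order(requested, NEEDS))
```

Several orders are valid for this graph, so the exact list printed may vary; the guarantee is only that every action appears after the actions it needs.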

Timeline

  • Created:
  • Started:
  • Finished:
  • Runtime: 00:00:33

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job information

Status
Succeeded
Backend
TPP
Workspace
carehomes
Requested by
Emily Nightingale
Branch
master
Force run dependencies
No
Git commit hash
d102e40
Requested actions
  • calc_coverage
  • data_check
  • data_setup
  • descriptive
