Job request: 15011

Organisation: Bennett Institute
Workspace: mabsavs-usernonuser-ccw
ID: qj25fxsqomx4ecpg

This page shows the technical details of what happened when the authorised researcher Linda Nab requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to.

The output security levels are:

  • highly_sensitive
    • Researchers can never directly view these outputs
    • Researchers can only request that code be run against them
  • moderately_sensitive
    • Can be viewed by an approved researcher by logging into a highly secure environment
    • These are the only outputs that can be requested for public release via a controlled output review service
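
The cross-referencing described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the job tooling: the names `PIPELINE_OUTPUTS` and `security_levels` are hypothetical, and the dictionary is a hand-copied excerpt of three actions from the project.yaml shown in the pipeline section rather than a parsed copy of the whole file.

```python
# Hypothetical excerpt of the pipeline, copied by hand from project.yaml:
# action name -> {security level: {output name: output path}}.
PIPELINE_OUTPUTS = {
    "generate_study_population": {
        "highly_sensitive": {"cohort": "output/input.csv.gz"},
    },
    "data_process": {
        "highly_sensitive": {
            "data": "output/data/data_processed.rds",
            "rds": "output/data_properties/n_excluded.rds",
        },
    },
    "data_properties": {
        "moderately_sensitive": {
            "txt1": "output/data_properties/data_processed_skim.txt",
            "txt2": "output/data_properties/data_processed_coltypes.txt",
            "txt3": "output/data_properties/data_processed_tabulate.txt",
        },
    },
}


def security_levels(requested_actions, pipeline_outputs):
    """Map each output path of the requested actions to its security level."""
    levels = {}
    for action in requested_actions:
        # Actions missing from the excerpt contribute nothing.
        for level, outputs in pipeline_outputs.get(action, {}).items():
            for path in outputs.values():
                levels[path] = level
    return levels


if __name__ == "__main__":
    requested = ["generate_study_population", "data_process", "data_properties"]
    for path, level in security_levels(requested, PIPELINE_OUTPUTS).items():
        print(f"{level:22} {path}")
```

Each printed line pairs a security level with an output path, which is the same lookup a reader does by eye when matching a requested action against its `outputs` block.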

Jobs

Pipeline

project.yaml
version: '3.0'

expectations:
  population_size: 100000

actions:

  ## # # # # # # # # # # # # # # # # # # # 
  ## Data extraction 
  ## # # # # # # # # # # # # # # # # # # # 

  generate_study_population:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input.csv.gz

  generate_study_population_ba2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ba2 --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_ba2.csv.gz

  generate_study_population_flowchart:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_flowchart --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_flowchart.csv.gz
  
  generate_study_population_flowchart_ba2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_flowchart_ba2 --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_flowchart_ba2.csv.gz

  ## # # # # # # # # # # # # # # # # # # # 
  ## Data cleaning and description
  ## # # # # # # # # # # # # # # # # # # # 

  data_process:
    run: r:latest analysis/data_process.R ba1
    needs: [generate_study_population]
    outputs:
      highly_sensitive:
        data: output/data/data_processed.rds
        rds: output/data_properties/n_excluded.rds

  data_process_ba2:
    run: r:latest analysis/data_process.R ba2
    needs: [generate_study_population_ba2]
    outputs:
      highly_sensitive:
        data: output/data/ba2_data_processed.rds
        rds: output/data_properties/ba2_n_excluded.rds

  data_process_flowchart:
    run: r:latest analysis/data_process_flowchart.R ba1
    needs: [generate_study_population_flowchart]
    outputs:
      highly_sensitive:
        data: output/data/data_flowchart_processed.rds

  data_process_flowchart_ba2:
    run: r:latest analysis/data_process_flowchart.R ba2
    needs: [generate_study_population_flowchart_ba2]
    outputs:
      highly_sensitive:
        data: output/data/ba2_data_flowchart_processed.rds 
  
  data_properties:
    run: r:latest analysis/data_properties/data_properties.R output/data/data_processed.rds output/data_properties
    needs: [data_process]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/data_processed_skim.txt
        txt2: output/data_properties/data_processed_coltypes.txt
        txt3: output/data_properties/data_processed_tabulate.txt

  data_properties_ba2:
    run: r:latest analysis/data_properties/data_properties.R output/data/ba2_data_processed.rds output/data_properties
    needs: [data_process_ba2]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/ba2_data_processed_skim.txt
        txt2: output/data_properties/ba2_data_processed_coltypes.txt
        txt3: output/data_properties/ba2_data_processed_tabulate.txt

  create_flowchart:
    run: r:latest analysis/flowchart.R ba1
    needs: [data_process_flowchart, data_process]
    outputs:
      highly_sensitive:
        csv: output/tables/flowchart/flowchart.csv

  create_flowchart_ba2:
    run: r:latest analysis/flowchart.R ba2
    needs: [data_process_flowchart_ba2, data_process_ba2]
    outputs:
      highly_sensitive:
        csv: output/tables/flowchart/ba2_flowchart.csv
  
  sense_check:
    run: r:latest analysis/data_properties/sense_check.R ba1
    needs: [data_process]
    outputs:
      moderately_sensitive:
        csv: output/data_properties/sense_checks.txt

  sense_check_ba2:
    run: r:latest analysis/data_properties/sense_check.R ba2
    needs: [data_process_ba2]
    outputs:
      moderately_sensitive:
        csv: output/data_properties/ba2_sense_checks.txt

  ## # # # # # # # # # # # # # # # # # # # 
  ## CCW Analysis - Day 0 
  ## # # # # # # # # # # # # # # # # # # # 

  ## # # # # # # # # # # # # # # # # # # # 
  ## Tables
  ## # # # # # # # # # # # # # # # # # # # 

  create_table1:
    run: r:latest analysis/table_1.R ba1
    needs: [data_process]
    outputs:
      moderately_sensitive:
        html: output/tables/table1_redacted.html

  create_table1_ba2:
    run: r:latest analysis/table_1.R ba2
    needs: [data_process_ba2]
    outputs:
      moderately_sensitive:
        html: output/tables/ba2_table1_redacted.html

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 03:33:11

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job request

Status: Failed
Backend: TPP
Requested by: Linda Nab
Branch: ccw-analysis
Force run dependencies: No
Git commit hash: ec77f8d
Requested actions:
  • generate_study_population
  • generate_study_population_ba2
  • generate_study_population_flowchart
  • generate_study_population_flowchart_ba2
  • data_process
  • data_process_ba2
  • data_process_flowchart
  • data_process_flowchart_ba2
  • data_properties
  • data_properties_ba2
  • create_flowchart
  • create_flowchart_ba2
