Skip to content

Job request: 2469

Organisation:
The London School of Hygiene & Tropical Medicine
Workspace:
post-covid-competing-risks
ID:
cop7auwjkzfhbaj7

This page shows the technical details of what happened when the authorised researcher John Tazare requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

  • highly_sensitive
    • Researchers can never directly view these outputs
    • Researchers can only request code is run against them
  • moderately_sensitive
    • Can be viewed by an approved researcher by logging into a highly secure environment
    • These are the only outputs that can be requested for public release via a controlled output review service.

Jobs

Pipeline

Show project.yaml
version: "3.0"

expectations:
  population_size: 20000

actions:
  generate_covid_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_covid
    outputs:
      highly_sensitive:
        cohort: output/input_covid.csv

  generate_covid_community_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_community
    outputs:
      highly_sensitive:
        cohort: output/input_covid_community.csv

  generate_covid_general_population_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_general_population
    outputs:
      highly_sensitive:
        cohort: output/input_general_population.csv

  generate_pneumonia_cohort:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_pneumonia
    outputs:
      highly_sensitive:
        cohort: output/input_pneumonia.csv

  matching:
    run: python:latest python analysis/match_running.py
    needs: [generate_covid_cohort, generate_covid_general_population_cohort]
    outputs:
      moderately_sensitive: 
        matching_report: output/matching_report_general_population.txt  
      highly_sensitive: 
        combined: output/matched_combined_general_population.csv

  covid_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "covid"
    needs: [generate_covid_cohort]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohort_rates_covid.dta
      moderately_sensitive:
        out_dist: output/tabfig/outcomes_in_hosp_covid.csv
        figs: output/tabfig/length_of_stay_covid.svg

  covid_community_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "covid_community"
    needs: [generate_covid_community_cohort]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohort_rates_covid_community.dta

  pneumonia_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "pneumonia"
    needs: [generate_pneumonia_cohort]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohort_rates_pneumonia.dta
      moderately_sensitive:
        out_dist: output/tabfig/outcomes_in_hosp_pneumonia.csv
        figs: output/tabfig/length_of_stay_pneumonia.svg

  gen_pop_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "matched_combined_general_population"
    needs: [matching]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohort_rates_gen_population.dta

  covid_rates:
    run: stata-mp:latest analysis/201_cr_simple_rates.do "covid"
    needs: [covid_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_covid.csv

  covid_comm_rates:
    run: stata-mp:latest analysis/201_cr_simple_rates.do "covid_community"
    needs: [covid_community_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_covid_community.csv

  pneumonia_rates:
    run: stata-mp:latest analysis/201_cr_simple_rates.do "pneumonia"
    needs: [pneumonia_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_pneumonia.csv

  gen_pop_rates:
    run: stata-mp:latest analysis/201_cr_simple_rates.do "gen_population"
    needs: [gen_pop_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_gen_population.csv

  baseline_characteristics:
    run: stata-mp:latest analysis/400_baseline_characteristics.do
    needs: [covid_rates_cohort, covid_community_rates_cohort, pneumonia_rates_cohort, gen_pop_rates_cohort]
    outputs:
      moderately_sensitive:
        tables: output/tabfig/an_descriptiveTable_*.txt

  append_cohorts:
    run: stata-mp:latest analysis/300_cr_data_management_matching.do
    needs: [covid_rates_cohort, pneumonia_rates_cohort, gen_pop_rates_cohort]
    outputs:
      moderately_sensitive:
        log: output/append_cohorts.txt
      highly_sensitive: 
        dataset: output/combined_covid_pneumonia.dta
        dataset2: output/combined_covid_gen_population.dta

  cox_models:
    run: stata-mp:latest analysis/302_cox_models.do
    needs: [append_cohorts]
    outputs:
      moderately_sensitive:
        log: output/cox_models.txt
        dataset: output/tabfig/cox_model_summary.csv


  fine_gray:
    run: stata-mp:latest analysis/302_competing_events.do
    needs: [append_cohorts]
    outputs:
      moderately_sensitive:
        log: output/competing_events.txt
        dataset: output/tabfig/fine_gray_summary.csv
        figs: output/tabfig/cumInc_*.svg

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 46:54:36

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job request

Status
Succeeded
Backend
TPP
Requested by
John Tazare
Branch
reviewer-updates
Force run dependencies
Yes
Git commit hash
e058078
Requested actions
  • run_all

Code comparison

Compare the code used in this job request