Job request: 23775

Organisation:
The London School of Hygiene & Tropical Medicine
Workspace:
covid_collateral_hf_update
ID:
zkdhmbxjiausc6ps

This page shows the technical details of what happened when the authorised researcher Emily Herrett requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level various outputs were written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
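
As a rough illustration of the rule above, the sketch below groups output patterns by sensitivity level and lists the ones that could be proposed for release. `PIPELINE_EXCERPT` is a hand-copied subset of two actions from the pipeline section below, and the helper names `outputs_by_level` and `releasable` are hypothetical; they are not part of any OpenSAFELY tool.

```python
# Hand-copied excerpt of two actions from project.yaml (an assumption made
# for illustration; the real file defines many more actions).
PIPELINE_EXCERPT = {
    "generate_table1": {
        "highly_sensitive": {
            "log1": "logs/104_cr_table1.log",
            "data1": "output/tabfig/table1_prepandemic_redacted_rounded.csv",
        },
    },
    "generate_rategraphs": {
        "highly_sensitive": {"log1": "logs/106_cr_graphs_rates.log"},
        "moderately_sensitive": {"Figures1": "output/tabfig/rates_*.svg"},
    },
}


def outputs_by_level(actions):
    """Map each sensitivity level to a list of (action, file pattern) pairs."""
    levels = {}
    for action, outputs in actions.items():
        for level, files in outputs.items():
            for pattern in files.values():
                levels.setdefault(level, []).append((action, pattern))
    return levels


def releasable(actions):
    """Return only the patterns marked moderately_sensitive."""
    grouped = outputs_by_level(actions)
    return [pattern for _, pattern in grouped.get("moderately_sensitive", [])]


for level, items in outputs_by_level(PIPELINE_EXCERPT).items():
    print(level, items)
```

In this excerpt only the SVG figures from `generate_rategraphs` would be candidates for release; the table 1 CSV files are declared highly_sensitive, so they could only be consumed by further code.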

Jobs

Pipeline

project.yaml
version: '3.0'

# Ignore this `expectations` block. It is required but not used, and will be removed in future versions.
expectations:
  population_size: 10000

actions:
  generate_dataset_prepandemic:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_prepandemic.py --output output/dataset_prepandemic.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_prepandemic.csv

  generate_dataset_pandemic:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_pandemic.py --output output/dataset_pandemic.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_pandemic.csv

  generate_dataset_postpandemic:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_postpandemic.py --output output/dataset_postpandemic.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_postpandemic.csv

  # Generate datasets for analysis: 001
  generate_analysis_datasets:
    run: stata-mp:latest analysis/001_cr_define_covariates_cohorts.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic]
    outputs:
      highly_sensitive:
        log1: logs/001_cr_define_covariates_cohorts.log
        data1: output/prepandemic.dta 
        data2: output/pandemic.dta 
        data3: output/postpandemic.dta 

  # Drug prevalence dataset: 102
  generate_drugprevalence:
    run: stata-mp:latest analysis/102_cr_drug_prevalence_contraind.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/102_cr_drug_prevalence.log
        data1: output/tabfig/prevalences_summary_*.dta
        data2: output/tabfig/prevalences_summary_redacted_rounded_overall.dta 

  # Drug prevalence dataset: 102A
  generate_drugprevalence_coms:
    run: stata-mp:latest analysis/102_A_cr_drug_prevalence_contraind_combinations.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/102_A_cr_drug_prevalence_combinations.log
        data1: output/tabfig/drugcombinations.csv

  # Cohort rates: 103
  generate_rates:
    run: stata-mp:latest analysis/103_cr_cohort_rates_repeated.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/103_cohort_rates_repeated.log
        data1: output/tabfig/rates_repeated_pandemic_redacted_rounded.dta 
        data2: output/tabfig/rates_repeated_prepandemic_redacted_rounded.dta 
        data3: output/tabfig/rates_repeated_postpandemic_redacted_rounded.dta 

  # Cohort rates: 103A
  generate_rates_A:
    run: stata-mp:latest analysis/103_A_cr_cohort_rates_repeated_diabetes.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/103_A_cohort_rates_repeated_diabetes.log
        data1: output/tabfig/rates_repeated_pandemic_redacted_rounded_diabetes.dta 
        data2: output/tabfig/rates_repeated_prepandemic_redacted_rounded_diabetes.dta 
        data3: output/tabfig/rates_repeated_postpandemic_redacted_rounded_diabetes.dta 

  # Generate table 1: 104
  generate_table1:
    run: stata-mp:latest analysis/104_cr_table1.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/104_cr_table1.log
        data1: output/tabfig/table1_prepandemic_redacted_rounded.csv 
        data2: output/tabfig/table1_pandemic_redacted_rounded.csv 
        data3: output/tabfig/table1_postpandemic_redacted_rounded.csv 

  # Drug graphs: 105
  generate_druggraphs:
    run: stata-mp:latest analysis/105_cr_drug_prevalence_graphs.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets, generate_drugprevalence]
    outputs:
      highly_sensitive:
        log1: logs/105_cr_graphs.log
      moderately_sensitive:
        Figures1: output/tabfig/*_prevalences_by_drug_*.svg 
        Figures2: output/tabfig/prevalences_*.svg 

  # Rate graphs: 106
  generate_rategraphs:
    run: stata-mp:latest analysis/106_cr_graphs_rates.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets, generate_rates]
    outputs:
      highly_sensitive:
        log1: logs/106_cr_graphs_rates.log
      moderately_sensitive:
        Figures1: output/tabfig/rates_*.svg 

  # Drug escalation graphs: 107
  generate_drugescalation:
    run: stata-mp:latest analysis/107_cr_cohorts_escalation.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/107_cr_cohorts_escalation.log
      moderately_sensitive:
        Figures1: output/tabfig/escalation_*.svg 

  # 30-day post-hospitalisation mortality graphs: 108
  generate_30d_mort:
    run: stata-mp:latest analysis/108_cr_cohorts_posthosp_mortality.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      highly_sensitive:
        log1: logs/108_cr_cohorts_posthosp_mortality.log
      moderately_sensitive:
        Figures1: output/tabfig/30d_mort_*.svg
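
The `needs` lists in the pipeline above form a dependency graph: an action can only start once every action it needs has produced its outputs. The sketch below derives one valid execution order with Python's standard `graphlib` module. `NEEDS` is a hand-copied excerpt of the pipeline (not the full file), and the ordering printed is just one of several valid ones; it is not necessarily the order the job runner would choose.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

DATASETS = [
    "generate_dataset_prepandemic",
    "generate_dataset_pandemic",
    "generate_dataset_postpandemic",
]
# Most Stata actions need the three datasets plus the analysis datasets.
COHORTS = DATASETS + ["generate_analysis_datasets"]

# action -> actions listed under its `needs` key (excerpt, copied by hand)
NEEDS = {
    **{name: [] for name in DATASETS},
    "generate_analysis_datasets": DATASETS,
    "generate_drugprevalence": COHORTS,
    "generate_rates": COHORTS,
    "generate_druggraphs": COHORTS + ["generate_drugprevalence"],
    "generate_rategraphs": COHORTS + ["generate_rates"],
}

# TopologicalSorter takes a mapping of node -> predecessors, which matches
# the shape of `needs`; static_order() yields predecessors first.
order = list(TopologicalSorter(NEEDS).static_order())

for position, action in enumerate(order, start=1):
    print(position, action)
```

Because `generate_analysis_datasets` already needs the three dataset actions, listing them again in every downstream `needs` is redundant for ordering purposes, though it does no harm.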

Timeline

  • Created:

  • Finished:

  • Runtime:

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job information

Status
Failed
Backend
TPP
Requested by
Emily Herrett
Branch
main
Force run dependencies
Yes
Git commit hash
099638e
Requested actions
  • generate_analysis_datasets
  • generate_drugprevalence
  • generate_drugprevalence_coms
  • generate_rates
  • generate_rates_A
  • generate_table1
  • run_all
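
To see which actions a request like this one touches, the requested actions can be expanded with everything upstream of them. The sketch below assumes that "Force run dependencies: Yes" means upstream actions are re-run rather than reused from earlier runs; the page does not define the flag, so treat that as an assumption. `NEEDS` is a hand-copied excerpt of the `needs` lists, `with_dependencies` is a hypothetical helper, and `run_all` is left out because it is not defined as an action in the pipeline shown.

```python
DATASETS = [
    "generate_dataset_prepandemic",
    "generate_dataset_pandemic",
    "generate_dataset_postpandemic",
]
COHORTS = DATASETS + ["generate_analysis_datasets"]

# action -> actions listed under its `needs` key (excerpt, copied by hand)
NEEDS = {
    **{name: [] for name in DATASETS},
    "generate_analysis_datasets": DATASETS,
    "generate_drugprevalence": COHORTS,
    "generate_drugprevalence_coms": COHORTS,
    "generate_rates": COHORTS,
    "generate_rates_A": COHORTS,
    "generate_table1": COHORTS,
}


def with_dependencies(requested, needs):
    """Return the requested actions plus every action upstream of them."""
    to_run = set()
    stack = list(requested)
    while stack:
        action = stack.pop()
        if action not in to_run:
            to_run.add(action)
            stack.extend(needs[action])
    return to_run


requested = ["generate_table1", "generate_rates_A"]
for action in sorted(with_dependencies(requested, NEEDS)):
    print(action)
```

Under that assumption, requesting any single Stata action also pulls in the three ehrql dataset actions and `generate_analysis_datasets`.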
