
Job request: 23812

Organisation:
The London School of Hygiene & Tropical Medicine
Workspace:
covid_collateral_hf_update
ID:
wrcss2rlnqs32fby

This page shows the technical details of what happened when the authorised researcher Emily Herrett requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level various outputs were written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
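The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the `OUTPUT_LEVELS` table is a hand-transcribed excerpt of the pipeline section, and the names `OUTPUT_LEVELS`, `ACCESS_RULES` and `access_for` are hypothetical, not part of any OpenSAFELY tool.

```python
# Hypothetical, hand-transcribed excerpt of project.yaml:
# action name -> sensitivity levels declared under its `outputs:` key.
OUTPUT_LEVELS = {
    "generate_dataset_prepandemic": ["highly_sensitive"],
    "generate_analysis_datasets": ["highly_sensitive"],
    "generate_table1": ["moderately_sensitive"],
    "generate_rates_D": ["moderately_sensitive"],
}

# Access rules as stated on this page, paraphrased.
ACCESS_RULES = {
    "highly_sensitive": "never viewed directly; code can only be run against it",
    "moderately_sensitive": "viewable in the secure environment; may be requested for release",
}


def access_for(action):
    """Map each sensitivity level an action writes to onto its access rule."""
    return {level: ACCESS_RULES[level] for level in OUTPUT_LEVELS[action]}


if __name__ == "__main__":
    for job in ("generate_analysis_datasets", "generate_table1"):
        print(job, access_for(job))
```

Looking up a job's action name this way shows, for example, that the outputs of generate_analysis_datasets are all declared highly_sensitive, while generate_table1 writes only moderately_sensitive outputs.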

Jobs

  • Action:
    generate_analysis_datasets
    Status:
    Failed
    Job identifier:
    yeggw2ho7yteajon
    Error:
    nonzero_exit: Job exited with an error
  • Action:
    generate_rates_C
    Status:
    Failed
    Job identifier:
    d2fvpju6fgyowxh4
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_rates_D
    Status:
    Failed
    Job identifier:
    pohl5xu6ygojarj4
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_table1
    Status:
    Failed
    Job identifier:
    2aiprviq6yy3cqpt
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_drugprevalence
    Status:
    Failed
    Job identifier:
    3dzng3ffducc5nf5
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_drugprevalence_coms
    Status:
    Failed
    Job identifier:
    rqewdxkzj4iiunh3
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_drugprevalence_duration
    Status:
    Failed
    Job identifier:
    ckya5zeww3tapdiz
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_rates
    Status:
    Failed
    Job identifier:
    upqlnvalymcvfq6t
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_drugescalation23
    Status:
    Failed
    Job identifier:
    6c3hiogckgqngbuj
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_drugescalation34
    Status:
    Failed
    Job identifier:
    eekhhv2t74zb2qxx
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_rates_A
    Status:
    Failed
    Job identifier:
    atjmc4ycyvp67s5s
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_rates_B
    Status:
    Failed
    Job identifier:
    4bpk4fjeznvu2cvp
    Error:
    dependency_failed: Not starting as dependency failed
  • Action:
    generate_table1_escalation
    Status:
    Failed
    Job identifier:
    d5mc54ukj63c2637
    Error:
    dependency_failed: Not starting as dependency failed
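The pattern in the list above (one nonzero_exit, every other job dependency_failed) follows from the `needs` entries in the pipeline: each of the other requested actions depends on generate_analysis_datasets. A minimal sketch of that propagation, assuming the dataset-generation actions (not part of this job list) had usable outputs; the `classify` helper and the "ok" status are hypothetical names for illustration, and `NEEDS` is a hand-transcribed excerpt of three actions.

```python
# `needs` lists transcribed by hand from project.yaml for three of the actions.
DATASETS = [
    "generate_dataset_prepandemic",
    "generate_dataset_pandemic",
    "generate_dataset_postpandemic",
]
NEEDS = {
    "generate_analysis_datasets": DATASETS + ["generate_dataset_escalation"],
    "generate_table1": DATASETS + ["generate_analysis_datasets"],
    "generate_drugescalation23": [
        "generate_dataset_escalation",
        "generate_analysis_datasets",
    ],
}


def classify(requested, failed_directly, needs):
    """Return {action: status} where a failure propagates to every dependent.

    Actions with no entry in `needs` are assumed to be fine ("ok"),
    which is an assumption of this sketch, not something the page states.
    """
    status = {}

    def resolve(action):
        if action not in status:
            if action in failed_directly:
                status[action] = "nonzero_exit"
            elif any(resolve(dep) != "ok" for dep in needs.get(action, [])):
                status[action] = "dependency_failed"
            else:
                status[action] = "ok"
        return status[action]

    return {action: resolve(action) for action in requested}


if __name__ == "__main__":
    result = classify(list(NEEDS), {"generate_analysis_datasets"}, NEEDS)
    for action, state in result.items():
        print(f"{action}: {state}")
```

Under these assumptions only generate_analysis_datasets needs investigating; the remaining failures are a consequence of it and should clear once that action succeeds.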

Pipeline

project.yaml
version: '3.0'

# Ignore this `expectation` block. It is required but not used, and will be removed in future versions.
expectations:
  population_size: 10000

actions:
  generate_dataset_prepandemic:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_prepandemic.py --output output/dataset_prepandemic.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_prepandemic.csv

  generate_dataset_pandemic:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_pandemic.py --output output/dataset_pandemic.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_pandemic.csv

  generate_dataset_postpandemic:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_postpandemic.py --output output/dataset_postpandemic.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_postpandemic.csv

  generate_dataset_escalation:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_drug_escalation.py --output output/dataset_drug_escalation.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_drug_escalation.csv

  # Generate datasets for analysis: 001
  generate_analysis_datasets:
    run: stata-mp:latest analysis/001_cr_define_covariates_cohorts.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_dataset_escalation]
    outputs:
      highly_sensitive:
        log1: logs/001_cr_define_covariates_cohorts.log
        data1: output/prepandemic.dta 
        data2: output/pandemic.dta 
        data3: output/postpandemic.dta 
        data4: output/drug_escalation.dta 

  # Drug prevalence dataset: 102
  generate_drugprevalence:
    run: stata-mp:latest analysis/102_cr_drug_prevalence_or.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/102_cr_drug_prevalence_or.log
        data1: output/tabfig/prevalences_summary_prepandemic_redacted_rounded_or.csv
        data2: output/tabfig/prevalences_summary_pandemic_redacted_rounded_or.csv 
        data3: output/tabfig/prevalences_summary_postpandemic_redacted_rounded_or.csv

  # Drug prevalence dataset: 102A
  generate_drugprevalence_coms:
    run: stata-mp:latest analysis/102_A_cr_drug_prevalence_contraind_combinations.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/102_A_cr_drug_prevalence_combinations.log
        data1: output/tabfig/combinations*.csv
        data2: output/tabfig/pillars*.csv

  # Drug prevalence dataset: 102B
  generate_drugprevalence_duration:
    run: stata-mp:latest analysis/102_B_cr_drug_prevalence_duration.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/102_cr_drug_prevalence_duration.log
        data1: output/tabfig/prevalences_summary_prepandemic_redacted_rounded_duration.csv
        data2: output/tabfig/prevalences_summary_pandemic_redacted_rounded_duration.csv 
        data3: output/tabfig/prevalences_summary_postpandemic_redacted_rounded_duration.csv


  # Cohort rates: 103
  generate_rates:
    run: stata-mp:latest analysis/103_cr_cohort_rates_repeated.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/103_cohort_rates_repeated.log
        data1: output/tabfig/rates_repeated_prepandemic_redacted_rounded.csv 
        #data2: output/tabfig/rates_repeated_pandemic_redacted_rounded.csv 
        #data3: output/tabfig/rates_repeated_postpandemic_redacted_rounded.csv 

  # Cohort rates in diabetes: 103A
  generate_rates_A:
    run: stata-mp:latest analysis/103_A_cr_cohort_rates_repeated_diabetes.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/103_A_cohort_rates_repeated_diabetes.log
        data1: output/tabfig/rates_repeated_pandemic_redacted_rounded_diabetes.csv 
        data2: output/tabfig/rates_repeated_prepandemic_redacted_rounded_diabetes.csv 
        data3: output/tabfig/rates_repeated_postpandemic_redacted_rounded_diabetes.csv 

  # Cohort rates in those without diabetes: 103B
  generate_rates_B:
    run: stata-mp:latest analysis/103_B_cr_cohort_rates_repeated_nodiabetes.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/103_B_cohort_rates_repeated_nodiabetes.log
        data1: output/tabfig/rates_repeated_pandemic_redacted_rounded_nodiabetes.csv 
        data2: output/tabfig/rates_repeated_prepandemic_redacted_rounded_nodiabetes.csv 
        data3: output/tabfig/rates_repeated_postpandemic_redacted_rounded_nodiabetes.csv 

  # Cohort rates in each overall cohort: 103C
  generate_rates_C:
    run: stata-mp:latest analysis/103_C_cr_cohort_rates_repeated_stratified.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/103_C_cohort_rates_repeated_stratified.log
        data1: output/tabfig/rates_repeated_pandemic_redacted_rounded_stratified.csv 
        data2: output/tabfig/rates_repeated_prepandemic_redacted_rounded_stratified.csv 
        data3: output/tabfig/rates_repeated_postpandemic_redacted_rounded_stratified.csv 


  # Cohort rates of mortality in each overall cohort: 103D
  generate_rates_D:
    run: stata-mp:latest analysis/103_D_cohort_rates_mortality.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/103_D_cohort_rates_mortality.log
        data1: output/tabfig/mort_rates_pandemic_redacted_rounded.csv 
        data2: output/tabfig/mort_rates_prepandemic_redacted_rounded.csv 
        data3: output/tabfig/mort_rates_postpandemic_redacted_rounded.csv 

  # Generate table 1: 104
  generate_table1:
    run: stata-mp:latest analysis/104_cr_table1.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/104_cr_table1.log
        data1: output/tabfig/table1_prepandemic_redacted_rounded.csv 
        data2: output/tabfig/table1_pandemic_redacted_rounded.csv 
        data3: output/tabfig/table1_postpandemic_redacted_rounded.csv 

  # Generate table 1: 104B
  generate_table1_escalation:
    run: stata-mp:latest analysis/104_B_cr_table1_escalation.do
    needs: [generate_dataset_prepandemic, generate_dataset_pandemic, generate_dataset_postpandemic, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/104_B_cr_table1_escalation.log
        data1: output/tabfig/table1_drug_escalation_redacted_rounded.csv 


  # Drug graphs: 107
  generate_drugescalation23:
    run: stata-mp:latest analysis/107_cr_cohorts_escalation_2_3.do
    needs: [generate_dataset_escalation, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/107_cr_cohorts_escalation.log
        dataset1: output/tabfig/escalation_rates_2_3_prepandemic_redacted_rounded.csv
        dataset2: output/tabfig/escalation_rates_2_3_pandemic_redacted_rounded.csv
        dataset3: output/tabfig/escalation_rates_2_3_postpandemic_redacted_rounded.csv
        dataset4: output/tabfig/escalation_2_3_km.csv
        Figures1: output/tabfig/escalation_2_3_*.svg 

  # Drug graphs: 108
  generate_drugescalation34:
    run: stata-mp:latest analysis/108_cr_cohorts_escalation_3_4.do
    needs: [generate_dataset_escalation, generate_analysis_datasets]
    outputs:
      moderately_sensitive:
        log1: logs/107_cr_cohorts_escalation_3_4.log
        dataset1: output/tabfig/escalation_rates_3_4_prepandemic_redacted_rounded.csv
        dataset2: output/tabfig/escalation_rates_3_4_pandemic_redacted_rounded.csv
        dataset3: output/tabfig/escalation_rates_3_4_postpandemic_redacted_rounded.csv
        dataset4: output/tabfig/escalation_3_4_km.csv
        Figures1: output/tabfig/escalation_3_4_*.svg 

# TIME SERIES
  generate_dataset_timeseries:
    run: ehrql:v1 generate-dataset analysis/dataset_timeseries.py --output output/dataset_timeseries.csv
    outputs:
      highly_sensitive:
        dataset: output/dataset_timeseries.csv

# Measures 
  measures:
    run: ehrql:v1 generate-measures analysis/measures.py 
      --output output/measures/measures.csv
      --
      --start-date "2018-01-01"
      --intervals 64
    outputs:
      highly_sensitive:
        measure_csv: output/measures/measures.csv

# Time series do file
  run_timeseries:
    run: stata-mp:latest analysis/109_time_series.do 
    needs: [generate_dataset_timeseries, measures]
    outputs:
      moderately_sensitive:
        log1: logs/time_series.log
        dataset: output/tabfig/measures_redacted_rounded.csv
        Figures1: output/tabfig/time_series_*.svg
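The `needs` keys in the pipeline above define the order in which actions can run: an action starts only after everything it needs has finished. As a rough illustration for the three time-series actions, the standard-library `graphlib` module can derive one valid order. The `NEEDS` dictionary is transcribed by hand from the YAML, and this is a sketch of the ordering idea, not how the OpenSAFELY job runner is implemented.

```python
from graphlib import TopologicalSorter

# Action -> actions it needs, transcribed from the time-series part of project.yaml.
NEEDS = {
    "generate_dataset_timeseries": [],
    "measures": [],
    "run_timeseries": ["generate_dataset_timeseries", "measures"],
}

# static_order() yields each action only after all of its dependencies.
order = list(TopologicalSorter(NEEDS).static_order())

if __name__ == "__main__":
    print(order)
```

The two ehrql actions have no dependencies and may appear in either order; run_timeseries always comes last.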

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 00:06:12

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job information

Status
Failed
Backend
TPP
Requested by
Emily Herrett
Branch
main
Force run dependencies
No
Git commit hash
226a2d9
Requested actions
  • generate_analysis_datasets
  • generate_drugprevalence
  • generate_drugprevalence_coms
  • generate_drugprevalence_duration
  • generate_rates
  • generate_rates_A
  • generate_rates_B
  • generate_rates_C
  • generate_rates_D
  • generate_table1
  • generate_table1_escalation
  • generate_drugescalation23
  • generate_drugescalation34
