Job request: 24841

Organisation:: University of Bristol
Workspace:: metformin-covid-main
ID:: cgjvfnqoqfdsps7s

This page shows the technical details of what happened when the authorised researcher Alain Amstutz requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request code is run against them
moderately_sensitive
- Can be viewed by an approved researcher by logging into a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.

Jobs

Action:

data_process

Status:

Status: Succeeded

Job identifier:

i6mklnffxwqbukh3
Action:

km_primary

Status:

Status: Succeeded

Job identifier:

uujjhqlypwybdorv
Action:

table1

Status:

Status: Succeeded

Job identifier:

sdyfz5obsx63sddy
Action:

cox_primary_scripted

Status:

Status: Succeeded

Job identifier:

knw7smgub6kj6av6
Action:

cox_primary_RA

Status:

Status: Succeeded

Job identifier:

jy2xfgxfyiwtj526
Action:

cox_covid_event_RA

Status:

Status: Succeeded

Job identifier:

s2p3lzief6ldnlzj
Action:

cox_longvirfat_RA

Status:

Status: Succeeded

Job identifier:

gmax2rfzpt6wmvbd
Action:

ps

Status:

Status: Succeeded

Job identifier:

amsmtnp5qs7bcjne
Action:

baseline_tables

Status:

Status: Succeeded

Job identifier:

ajfvp43jf4gkdd3f

Pipeline

Show project.yaml

version: '3.0'

# Ignore this`expectation` block. It is required but not used, and will be removed in future versions.
expectations:
  population_size: 1000

actions:
  study_dates:
    run: r:latest analysis/metadates.R
    outputs:
      highly_sensitive:
        study_dates_json: output/study_dates.json

  generate_dataset_dm_algo:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_dm_algo.py --output output/dataset_dm_algo.arrow
    needs: 
    - study_dates
    outputs:
      highly_sensitive:
        dataset: output/dataset_dm_algo.arrow

  diabetes_algo:
    run: diabetes-algo:v0.0.6
    config:
     df_input: dataset_dm_algo.arrow
     remove_helper: TRUE
     birth_date: qa_num_birth_year
     ethnicity_cat: cov_cat_ethnicity
     t1dm_date: elig_date_t1dm
     tmp_t1dm_ctv3_date: tmp_elig_date_t1dm_ctv3
     tmp_t1dm_count_num: tmp_elig_count_t1dm
     t2dm_date: elig_date_t2dm
     tmp_t2dm_ctv3_date: tmp_elig_date_t2dm_ctv3
     tmp_t2dm_count_num: tmp_elig_count_t2dm
     otherdm_date: elig_date_otherdm
     tmp_otherdm_count_num: tmp_elig_count_otherdm
     gestationaldm_date: elig_date_gestationaldm
     tmp_poccdm_date: tmp_elig_date_poccdm
     tmp_poccdm_ctv3_count_num: tmp_elig_count_poccdm_ctv3
     tmp_max_hba1c_mmol_mol_num: tmp_elig_num_max_hba1c_mmol_mol
     tmp_max_hba1c_date: tmp_elig_date_max_hba1c
     tmp_insulin_dmd_date: tmp_elig_date_insulin_snomed
     tmp_antidiabetic_drugs_dmd_date: tmp_elig_date_antidiabetic_drugs_snomed
     tmp_nonmetform_drugs_dmd_date: tmp_elig_date_nonmetform_drugs_snomed
     tmp_diabetes_medication_date: tmp_elig_date_diabetes_medication
     tmp_first_diabetes_diag_date: tmp_elig_date_first_diabetes_diag
     df_output: data_processed.csv.gz
    needs:
    - generate_dataset_dm_algo
    outputs:
      highly_sensitive:
        csv.gz: output/data_processed.csv.gz

  generate_dataset:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_t2dm.py --output output/dataset.arrow
    needs: 
    - study_dates
    - generate_dataset_dm_algo
    - diabetes_algo
    outputs:
      highly_sensitive:
        dataset: output/dataset.arrow

  data_process:
    run: r:latest analysis/data_process.R
    needs:
    - generate_dataset
    outputs:
      highly_sensitive:
        dataset1: output/data/data_processed.arrow
        dataset2: output/data/data_processed_full.arrow
        dataset3: output/data/data_processed_death_ltfu.arrow
      moderately_sensitive:
        flowchart_tbls: output/data_description/*.csv

  treat_strat_feasibility:
    run: r:latest analysis/treat_strat_feasibility.R
    needs:
    - data_process
    outputs:
      highly_sensitive:
        dataset_plots: output/data/data_plots.feather
      moderately_sensitive:
        desc_tbl1: output/data_description/n_exp_out_midpoint6.csv
        desc_tbl2: output/data_description/n_exp_out.csv
  
  table1:
    run: r:latest analysis/table1.R
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        data_tbl1: output/data_description/table1_main_midpoint6.csv
        data_tbl2: output/data_description/table1_death_ltfu1_midpoint6.csv
        data_tbl3: output/data_description/table1_death_ltfu2_midpoint6.csv

  km_primary:
    run: kaplan-meier-function:v0.0.17 
      --df_input=output/data/data_processed.arrow
      --dir_output=output/te/km
      --exposure=exp_bin_treat
      --origin_date=landmark_date
      --event_date=out_date_severecovid_afterlandmark
      --censor_date=cox_date_severecovid
      --max_fup=730
      --plot=TRUE
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        km_treat_estimates: output/te/km/*.csv
        png: output/te/km/*.png

  cox_primary_scripted:
    run: r:latest analysis/cox_analysis.R
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        csv: output/te/cox_scripted/*.csv
        plot: output/te/cox_scripted/*.png

  ps:
    run: r:latest analysis/ps.R
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        csv: output/ps/*.csv
        plots: output/ps/*.png

  cox_primary_RA:
    run: cox-ipw:v0.0.37 --df_input=data/data_processed.arrow
      --ipw=FALSE --exposure=cox_date_metfin_start_within6m --outcome=out_date_severecovid_afterlandmark --strata=strat_cat_region
      --covariate_sex=cov_cat_sex --covariate_age=cov_num_age --covariate_other=cov_cat_ethnicity;cov_cat_deprivation_5;cov_cat_rural_urban;cov_bin_healthcare_worker;cov_num_consrate;cov_cat_smoking_status;cov_bin_obesity;cov_cat_hba1c_mmol_mol;cov_cat_tc_hdl_ratio;cov_bin_ami;cov_bin_all_stroke;cov_bin_other_arterial_embolism;cov_bin_vte;cov_bin_hf;cov_bin_angina;cov_bin_dementia;cov_bin_cancer;cov_bin_hypertension;cov_bin_depression;cov_bin_copd;cov_bin_liver_disease;cov_bin_chronic_kidney_disease;cov_bin_pcos;cov_bin_prediabetes;cov_bin_diabetescomp;cov_num_period_month
      --cox_start=landmark_date --cox_stop=cox_date_severecovid --study_start="2018-08-01"
      --study_stop="2022-04-01" --cut_points=730
      --age_spline=TRUE --save_analysis_ready=FALSE --df_output=results_cox_primary_ra.csv
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        model_output: output/results_cox_primary_ra.csv

  cox_covid_event_RA:
    run: cox-ipw:v0.0.37 --df_input=data/data_processed.arrow
      --ipw=FALSE --exposure=cox_date_metfin_start_within6m --outcome=out_date_covid_afterlandmark --strata=strat_cat_region
      --covariate_sex=cov_cat_sex --covariate_age=cov_num_age --covariate_other=cov_cat_ethnicity;cov_cat_deprivation_5;cov_cat_rural_urban;cov_bin_healthcare_worker;cov_num_consrate;cov_cat_smoking_status;cov_bin_obesity;cov_cat_hba1c_mmol_mol;cov_cat_tc_hdl_ratio;cov_bin_ami;cov_bin_all_stroke;cov_bin_other_arterial_embolism;cov_bin_vte;cov_bin_hf;cov_bin_angina;cov_bin_dementia;cov_bin_cancer;cov_bin_hypertension;cov_bin_depression;cov_bin_copd;cov_bin_liver_disease;cov_bin_chronic_kidney_disease;cov_bin_pcos;cov_bin_prediabetes;cov_bin_diabetescomp;cov_num_period_month
      --cox_start=landmark_date --cox_stop=cox_date_covid --study_start="2018-08-01"
      --study_stop="2022-04-01" --cut_points=730
      --age_spline=TRUE --save_analysis_ready=FALSE --df_output=results_cox_covid_event_ra.csv
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        model_output: output/results_cox_covid_event_ra.csv

  cox_longvirfat_RA:
    run: cox-ipw:v0.0.37 --df_input=data/data_processed.arrow
      --ipw=FALSE --exposure=cox_date_metfin_start_within6m --outcome=out_date_longcovid_virfat_afterlandmark --strata=strat_cat_region
      --covariate_sex=cov_cat_sex --covariate_age=cov_num_age --covariate_other=cov_cat_ethnicity;cov_cat_deprivation_5;cov_cat_rural_urban;cov_bin_healthcare_worker;cov_num_consrate;cov_cat_smoking_status;cov_bin_obesity;cov_cat_hba1c_mmol_mol;cov_cat_tc_hdl_ratio;cov_bin_ami;cov_bin_all_stroke;cov_bin_other_arterial_embolism;cov_bin_vte;cov_bin_hf;cov_bin_angina;cov_bin_dementia;cov_bin_cancer;cov_bin_hypertension;cov_bin_depression;cov_bin_copd;cov_bin_liver_disease;cov_bin_chronic_kidney_disease;cov_bin_pcos;cov_bin_prediabetes;cov_bin_diabetescomp;cov_num_period_month
      --cox_start=landmark_date --cox_stop=cox_date_longcovid_virfat --study_start="2018-08-01"
      --study_stop="2022-04-01" --cut_points=730 --total_event_threshold=20
      --age_spline=TRUE --save_analysis_ready=FALSE --df_output=results_cox_longvirfat_ra.csv
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        model_output: output/results_cox_longvirfat_ra.csv

  baseline_tables:
    run: r:latest analysis/baseline_tables.R
    needs:
    - table1
    outputs:
      moderately_sensitive:
        tbl1: output/data_description/tbl_csv_main.csv
        tbl2: output/data_description/tbl_csv_death_ltfu1.csv
        tbl3: output/data_description/tbl_csv_death_ltfu2.csv

Timeline

Created: 7 months, 1 week ago 21 Jul 2025 20:12:24 UTC
Started: 7 months, 1 week ago 21 Jul 2025 20:14:01 UTC
Finished: 7 months, 1 week ago 21 Jul 2025 20:30:24 UTC
Runtime: 00:20:18

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job request

Status: Succeeded
Backend: TPP
Workspace: metformin-covid-main
Requested by: Alain Amstutz
Branch: main
Force run dependencies: No
Git commit hash: 823a8dd
Requested actions: data_process

table1

km_primary

cox_primary_scripted

ps

cox_primary_RA

cox_covid_event_RA

cox_longvirfat_RA

baseline_tables

Code comparison

Compare the code used in this job request