Job request: 24413

Organisation:
University of Bristol
Workspace:
metformin-covid-main
ID:
dzjp2boeja3yjj2q

This page shows the technical details of what happened when the authorised researcher Alain Amstutz requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer the security level at which each output was written. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
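
The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the dictionary is a hand-copied subset of the `outputs` blocks from the pipeline on this page, and the names `PIPELINE_OUTPUTS`, `JOBS` and `outputs_by_level` are hypothetical, not part of any OpenSAFELY tool.

```python
# Hypothetical, hand-copied subset of the `outputs` blocks in project.yaml.
PIPELINE_OUTPUTS = {
    "diabetes_algo": {
        "highly_sensitive": {"csv.gz": "output/data_processed.csv.gz"},
    },
    "generate_dataset": {
        "highly_sensitive": {"dataset": "output/dataset.arrow"},
    },
    "data_process": {
        "highly_sensitive": {"dataset": "output/data/data_processed.arrow"},
        "moderately_sensitive": {"csv": "output/data_description/*.csv"},
    },
    "table1": {
        "moderately_sensitive": {
            "table1": "output/data_description/table1_midpoint6.csv"
        },
    },
}

# The actions listed under "Jobs" for this job request.
JOBS = ["diabetes_algo", "generate_dataset", "data_process", "table1"]


def outputs_by_level(jobs, pipeline_outputs):
    """Group (action, path) pairs by the security level they were written to."""
    result = {}
    for action in jobs:
        for level, named_paths in pipeline_outputs.get(action, {}).items():
            result.setdefault(level, []).extend(
                (action, path) for path in named_paths.values()
            )
    return result


levels = outputs_by_level(JOBS, PIPELINE_OUTPUTS)

# Only moderately_sensitive outputs can be put forward for output review.
reviewable = [path for _, path in levels.get("moderately_sensitive", [])]

for level, entries in levels.items():
    for action, path in entries:
        print(f"{level:22} {action:18} {path}")
```

Under these assumptions, `reviewable` holds the two `output/data_description/` patterns, while the three highly_sensitive files stay out of direct view.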

Jobs

  • Action:
    diabetes_algo
    Status:
    Succeeded
    Job identifier:
    xek4c3k5jtsg5v7m
  • Action:
    generate_dataset
    Status:
    Succeeded
    Job identifier:
    5c5w5uino6qrbqrv
  • Action:
    data_process
    Status:
    Succeeded
    Job identifier:
    ovfta554y66cxzin
  • Action:
    table1
    Status:
    Succeeded
    Job identifier:
    ndthmdkjrzkprfnl

Pipeline

project.yaml
version: '3.0'

# Ignore this `expectation` block. It is required but not used, and will be removed in future versions.
expectations:
  population_size: 1000

actions:
  study_dates:
    run: r:latest analysis/metadates.R
    outputs:
      highly_sensitive:
        study_dates_json: output/study_dates.json

  generate_dataset_dm_algo:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_dm_algo.py --output output/dataset_dm_algo.arrow
    needs: 
    - study_dates
    outputs:
      highly_sensitive:
        dataset: output/dataset_dm_algo.arrow

  diabetes_algo:
    run: diabetes-algo:v0.0.4
    config:
     df_input: dataset_dm_algo.arrow
     remove_helper: TRUE
     birth_date: qa_num_birth_year
     ethnicity_cat: cov_cat_ethnicity
     t1dm_date: elig_date_t1dm
     tmp_t1dm_ctv3_date: tmp_elig_date_t1dm_ctv3
     tmp_t1dm_count_num: tmp_elig_count_t1dm
     t2dm_date: elig_date_t2dm
     tmp_t2dm_ctv3_date: tmp_elig_date_t2dm_ctv3
     tmp_t2dm_count_num: tmp_elig_count_t2dm
     otherdm_date: elig_date_otherdm
     tmp_otherdm_count_num: tmp_elig_count_otherdm
     gestationaldm_date: elig_date_gestationaldm
     tmp_poccdm_date: tmp_elig_date_poccdm
     tmp_poccdm_ctv3_count_num: tmp_elig_count_poccdm_ctv3
     tmp_max_hba1c_mmol_mol_num: tmp_elig_num_max_hba1c_mmol_mol
     tmp_max_hba1c_date: tmp_elig_date_max_hba1c
     tmp_insulin_dmd_date: tmp_elig_date_insulin_snomed
     tmp_antidiabetic_drugs_dmd_date: tmp_elig_date_antidiabetic_drugs_snomed
     tmp_nonmetform_drugs_dmd_date: tmp_elig_date_nonmetform_drugs_snomed
     tmp_diabetes_medication_date: tmp_elig_date_diabetes_medication
     tmp_first_diabetes_diag_date: tmp_elig_date_first_diabetes_diag
     df_output: data_processed.csv.gz
    needs:
    - generate_dataset_dm_algo
    outputs:
      highly_sensitive:
        csv.gz: output/data_processed.csv.gz

  generate_dataset:
    run: ehrql:v1 generate-dataset analysis/dataset_definition_t2dm.py --output output/dataset.arrow
    needs: 
    - study_dates
    - generate_dataset_dm_algo
    - diabetes_algo
    outputs:
      highly_sensitive:
        dataset: output/dataset.arrow

  data_process:
    run: r:latest analysis/data_process.R
    needs:
    - generate_dataset
    outputs:
      highly_sensitive:
        dataset: output/data/data_processed.arrow
        #dataset_plots: output/data/data_plots.feather
      moderately_sensitive:
        csv: output/data_description/*.csv

  table1:
    run: r:latest analysis/table1.R
    needs:
    - data_process
    outputs:
      moderately_sensitive:
        table1: output/data_description/table1_midpoint6.csv

  #ps:
  #  run: r:latest analysis/ps.R
  #  needs:
  #  - data_process
  #  outputs:
  #    moderately_sensitive:
  #      csv: output/ps/*.csv
  #      plots: output/ps/*.png


  # km_estimates_metfin_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/metfin
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_metfin_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/metfin/*.csv
  #       plot: output/metfin/*.png

  # km_estimates_metfin_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/metfin_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_metfin_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/metfin_mono/*.csv
  #       plot: output/metfin_mono/*.png

  # km_estimates_dpp4_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/dpp4_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_dpp4_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/dpp4_mono/*.csv
  #       plot: output/dpp4_mono/*.png

  # km_estimates_tzd_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/tzd_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_tzd_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/tzd_mono/*.csv
  #       plot: output/tzd_mono/*.png

  # km_estimates_sglt2_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/sglt2_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_sglt2_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/sglt2_mono/*.csv
  #       plot: output/sglt2_mono/*.png

  # km_estimates_sulfo_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/sulfo_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_sulfo_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/sulfo_mono/*.csv
  #       plot: output/sulfo_mono/*.png

  # km_estimates_glp1_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/glp1_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_glp1_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/glp1_mono/*.csv
  #       plot: output/glp1_mono/*.png

  # km_estimates_megli_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/megli_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_megli_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/megli_mono/*.csv
  #       plot: output/megli_mono/*.png

  # km_estimates_agi_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/agi_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_agi_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/agi_mono/*.csv
  #       plot: output/agi_mono/*.png

  # km_estimates_insulin_mono_RA:
  #   run: kaplan-meier-function:v0.0.8
  #     --df_input=output/data/data_plots.feather
  #     --dir_output=output/insulin_mono
  #     --origin_date=elig_date_t2dm
  #     --event_date=exp_date_insulin_mono_anytime
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_estimates: output/insulin_mono/*.csv
  #       plot: output/insulin_mono/*.png

  # plot_km_estimates:
  #   run: r:latest analysis/km_plot.R 
  #   needs:
  #   - km_estimates_metfin_RA
  #   - km_estimates_metfin_mono_RA
  #   - km_estimates_dpp4_mono_RA
  #   - km_estimates_tzd_mono_RA
  #   - km_estimates_sglt2_mono_RA
  #   - km_estimates_sulfo_mono_RA
  #   - km_estimates_glp1_mono_RA
  #   - km_estimates_megli_mono_RA
  #   - km_estimates_agi_mono_RA
  #   - km_estimates_insulin_mono_RA
  #   outputs:
  #     moderately_sensitive:
  #       plot: output/data_description/*.png

  # km_treat:
  #   run: kaplan-meier-function:v0.0.8 
  #     --df_input=output/data/data_processed.arrow
  #     --dir_output=output/treat
  #     --exposure=exp_bin_treat
  #     --origin_date=elig_date_t2dm
  #     --event_date=out_date_severecovid
  #     --censor_date=qa_date_of_death
  #     --max_fup=548
  #     --plot=TRUE
  #   needs:
  #   - data_process
  #   outputs:
  #     moderately_sensitive:
  #       km_treat_estimates: output/treat/*.csv
  #       png: output/treat/*.png

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 06:48:17

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job information

Status
Succeeded
Backend
TPP
Requested by
Alain Amstutz
Branch
main
Force run dependencies
No
Git commit hash
f9f3aa3
Requested actions
  • generate_dataset
  • data_process
  • table1
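
The requested actions depend on others through the `needs` entries in the pipeline. A minimal sketch of how those entries expand into a full, ordered set of upstream actions follows. The `NEEDS` mapping is copied by hand from the uncommented actions in project.yaml, and `required_actions` is a hypothetical helper. The sketch does not model which dependencies the backend decides to skip or rerun, so its result can be longer than the list of jobs that actually ran.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hand-copied `needs` lists for the active (uncommented) actions.
NEEDS = {
    "study_dates": [],
    "generate_dataset_dm_algo": ["study_dates"],
    "diabetes_algo": ["generate_dataset_dm_algo"],
    "generate_dataset": [
        "study_dates",
        "generate_dataset_dm_algo",
        "diabetes_algo",
    ],
    "data_process": ["generate_dataset"],
    "table1": ["data_process"],
}


def required_actions(requested, needs):
    """Return the requested actions plus all upstream dependencies,
    ordered so that every action comes after the actions it needs."""
    seen = set()
    stack = list(requested)
    while stack:
        action = stack.pop()
        if action not in seen:
            seen.add(action)
            stack.extend(needs[action])
    # TopologicalSorter takes a mapping of node -> predecessors.
    graph = {action: needs[action] for action in seen}
    return list(TopologicalSorter(graph).static_order())


order = required_actions(["generate_dataset", "data_process", "table1"], NEEDS)
print(order)
```

For this pipeline the dependency chain is linear, so there is only one valid order, starting at `study_dates` and ending at `table1`.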
