Job request: 7254


This page shows the technical details of what happened when authorised researcher Miriam Samuel requested one or more actions to be run against real patient data in the BMI and HbA1c project, within a secure environment.

By cross-referencing the Requested actions listed under Config with the Pipeline section below, you can infer which security level each output was written to. Outputs marked as highly_sensitive can never be viewed directly by a researcher; researchers can only request that code be run against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher who logs into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
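As a minimal sketch of that cross-referencing, assume the pipeline has already been loaded into a plain Python dictionary. The `pipeline` literal below copies two actions from the project.yaml on this page; the helper name `sensitivity_levels` is hypothetical and is not part of any OpenSAFELY tool.

```python
# Two actions copied from the project.yaml shown on this page, as a plain dict.
# (Assumption: the YAML has been parsed into this shape by some loader.)
pipeline = {
    "generate_BMI_trajectories_data3": {
        "outputs": {
            "highly_sensitive": {
                "cohort1": "output/data/BMI_trajectories_final.feather",
            },
        },
    },
    "complete_bmi_trajectory_data": {
        "outputs": {
            "moderately_sensitive": {
                "table1": "output/data/complete_bmi_trajectories_proportions.csv",
            },
        },
    },
}


def sensitivity_levels(actions, requested):
    """Map each output path of the requested actions to its sensitivity level."""
    levels = {}
    for name in requested:
        for level, files in actions[name]["outputs"].items():
            for path in files.values():
                levels[path] = level
    return levels


if __name__ == "__main__":
    result = sensitivity_levels(pipeline, ["complete_bmi_trajectory_data"])
    for path, level in result.items():
        print(f"{level}: {path}")
```

For the single action requested here, the lookup returns one moderately_sensitive CSV, so its output is the kind that an approved researcher could view and request for release.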

Jobs

ID Status Action
q2e2fdqmbpkwnghu succeeded complete_bmi_trajectory_data

Pipeline

project.yaml
version: '3.0'

expectations:
  population_size: 50000

actions:

### 1.  Extract the cohort

  ## Use the Measure function to allow data for different time periods to be extracted in the same study population: --index-date-range "yyyy-mm-dd" --output-format feather
  generate_study_population_1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2015-03-01" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2015-03-01.feather  

  generate_study_population_2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2016-03-01" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2016-03-01.feather  

  generate_study_population_3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2017-03-01" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2017-03-01.feather  
        
  generate_study_population_4:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2018-03-01" --output-format feather 
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2018-03-01.feather  
        
        
  generate_study_population_5:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2019-03-01" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2019-03-01.feather        
        
  generate_study_population_6:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2020-03-01" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2020-03-01.feather  
   
  generate_study_population_7:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_all --output-dir=output/data --index-date-range "2021-03-01" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_all_2021-03-01.feather  
 
 

  generate_study_population_ethnicity:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ethnicity --output-dir=output/data --output-format=feather 
    outputs:
      highly_sensitive:
        cohort: output/data/input_ethnicity.feather
        
      
        
  join_ethnicity:
    run: python:latest python analysis/join_ethnicity.py --output-dir=output/data --output-format feather
    needs: [generate_study_population_1, generate_study_population_2, generate_study_population_3, generate_study_population_4, generate_study_population_5, generate_study_population_6, generate_study_population_7, generate_study_population_ethnicity]
    outputs:
      highly_sensitive:
        cohort: output/data/input*.feather

  generate_study_population_dm1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2015-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2015-03-01.csv

  generate_study_population_dm2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2016-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2016-03-01.csv

  generate_study_population_dm3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2017-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2017-03-01.csv

  generate_study_population_dm4:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2018-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2018-03-01.csv

  generate_study_population_dm5:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2019-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2019-03-01.csv

  generate_study_population_dm6:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2020-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2020-03-01.csv

  generate_study_population_dm7:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dm_meds --output-dir=output/data --index-date-range "2021-03-01" 
    outputs:
      highly_sensitive:
        cohort: output/data/input_dm_meds_2021-03-01.csv

  join_meds:
    run: r:latest analysis/add_medication.R --output-dir=output/data --output-format feather
    needs: [join_ethnicity, generate_study_population_dm1, generate_study_population_dm2, generate_study_population_dm3, generate_study_population_dm4, generate_study_population_dm5, generate_study_population_dm6, generate_study_population_dm7]
    outputs:
      highly_sensitive:
        cohort1: output/data/complete_meds_2015.feather
        cohort2: output/data/complete_meds_2016.feather
        cohort3: output/data/complete_meds_2017.feather
        cohort4: output/data/complete_meds_2018.feather
        cohort5: output/data/complete_meds_2019.feather
        cohort6: output/data/complete_meds_2020.feather
        cohort7: output/data/complete_meds_2021.feather

### Develop yearly BMI data sets to de-duplicate data and calculate median BMI each year. Feather format is used to reduce data size.

  generate_BMI_2015_data:
    run: r:latest analysis/BMI_2015.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2015.feather
        cohort2: output/data/BMI_complete_long_2015.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2015.csv


  generate_BMI_2016_data:
    run: r:latest analysis/BMI_2016.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2016.feather
        cohort2: output/data/BMI_complete_long_2016.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2016.csv

  
  generate_BMI_2017_data:
    run: r:latest analysis/BMI_2017.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2017.feather
        cohort2: output/data/BMI_complete_long_2017.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2017.csv
        
   
   
   
  generate_BMI_2019_data:
    run: r:latest analysis/BMI_2019.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2019.feather
        cohort2: output/data/BMI_complete_long_2019.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2019.csv


  generate_BMI_2021_data:
    run: r:latest analysis/BMI_2021.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2021.feather
        cohort2: output/data/BMI_complete_long_2021.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2021.csv


  generate_BMI_2018_data:
    run: r:latest analysis/BMI_2018.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2018.feather
        cohort2: output/data/BMI_complete_long_2018.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2018.csv


  generate_BMI_2020_data:
    run: r:latest analysis/BMI_2020.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median_2020.feather
        cohort2: output/data/BMI_complete_long_2020.feather
      moderately_sensitive:
        table1:  output/data/BMI_data_checks_2020.csv


# Append yearly data sets to produce a complete data set for analysis of change in trends

  generate_complete_median_BMI_data:
    run: r:latest analysis/BMI_median_combine_datasets.R --output-dir=output/data --output-format feather
    needs: [generate_BMI_2015_data, generate_BMI_2016_data, generate_BMI_2017_data, generate_BMI_2018_data, generate_BMI_2019_data, generate_BMI_2020_data, generate_BMI_2021_data ]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_complete_median.feather


  generate_complete_long_BMI_data:
    run: r:latest analysis/BMI_complete_long.R --output-dir=output/data --output-format feather
    needs: [generate_BMI_2015_data, generate_BMI_2016_data, generate_BMI_2017_data, generate_BMI_2018_data, generate_BMI_2019_data, generate_BMI_2020_data, generate_BMI_2021_data ]
    outputs:
      highly_sensitive:
        cohort1: output/data/all_bmi_long.feather

  generate_BMI_trajectories_data1:
    run: r:latest analysis/BMI_change_1.R --output-dir=output/data --output-format feather
    needs: [generate_complete_long_BMI_data]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_demog.feather     
        cohort2: output/data/BMI_all_long_sample.feather 

  generate_BMI_trajectories_data2:
    run: r:latest analysis/BMI_change_2.R --output-dir=output/data --output-format feather
    needs: [generate_BMI_trajectories_data1]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_trajectories.feather     

  generate_BMI_trajectories_data3:
    run: r:latest analysis/BMI_change_3.R --output-dir=output/data --output-format feather
    needs: [generate_BMI_trajectories_data2, generate_BMI_trajectories_data1]
    outputs:
      highly_sensitive:
        cohort1: output/data/BMI_trajectories_final.feather 



##################################################################
### >> HAD_BMI ANALYSIS
######################################################################
# who had_bmi measured: 
  generate_had_bmi_proportion_2019:
    run: r:latest analysis/had_bmi_proportions_2019.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_had_bmi_2019.csv
  
  generate_had_bmi_proportion_2020:
    run: r:latest analysis/had_bmi_proportions_2020.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_had_bmi_2020.csv
  
  generate_had_bmi_proportion_2021:
    run: r:latest analysis/had_bmi_proportions_2021.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_had_bmi_2021.csv
 
#########################################################
## PROPORTION OBESE ANALYSIS

  generate_obese_proportion_2019:
    run: r:latest analysis/obese_proportions_2019.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_obese_2019.csv


  generate_obese_proportion_2020:
    run: r:latest analysis/obese_proportions_2020.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_obese_2020.csv

  generate_obese_proportion_2021:
    run: r:latest analysis/obese_proportions_2021.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_obese_2021.csv






 ######################################################################################
 # >> Median BMI Analysis
 
 #######################################################################################

  
  
  # generate summary stats of median BMI by exposures
  generate_median_summary_stats:
    run: r:latest analysis/BMI_median_summary_stats.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/median_bmi_summary_table_demographic.csv
        table2: output/data/median_bmi_summary_table_covariates.csv
        
        
        
  ## tables of median BMI and range
  generate_median_bmi_range_2019:
    run: r:latest analysis/median_range_2019.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/median_range_bmi_2019.csv

  generate_median_bmi_range_2020:
    run: r:latest analysis/median_range_2020.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/median_range_bmi_2020.csv


  generate_median_bmi_range_2021:
    run: r:latest analysis/median_range_2021.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/median_range_bmi_2021.csv

  complete_bmi_trajectory_data:
    run: r:latest analysis/proportion_complete_BMI_trajectory.R --output-dir=output/data 
    needs: [generate_BMI_trajectories_data3]
    outputs:
      moderately_sensitive:
        table1: output/data/complete_bmi_trajectories_proportions.csv


###########################  
## DWMP Eligible



  generate_DWMP_proportion:
    run: r:latest analysis/DWMP_proportion.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_DWMP_eligible_2019.csv
        table2: output/data/proportion_DWMP_eligible_2020.csv
        table3: output/data/proportion_DWMP_eligible_2021.csv

  generate_DWMP_hypertension_proportion:
    run: r:latest analysis/DWMP_proportion_hypertension.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_DWMP_hypertension_2019.csv
        table2: output/data/proportion_DWMP_hypertension_2020.csv
        table3: output/data/proportion_DWMP_hypertension_2021.csv

  generate_DWMP_t2dm_proportion:
    run: r:latest analysis/DWMP_proportion_t2dm.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_DWMP_T2DM_2019.csv
        table2: output/data/proportion_DWMP_T2DM_2020.csv
        table3: output/data/proportion_DWMP_T2DM_2021.csv


  generate_DWMP_t1dm_proportion:
    run: r:latest analysis/DWMP_proportion_t1dm.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_DWMP_T1DM_2019.csv
        table2: output/data/proportion_DWMP_T1DM_2020.csv
        table3: output/data/proportion_DWMP_T1DM_2021.csv


##################################################################
### >> HAD_SBP ANALYSIS
######################################################################
# who had_sbp measured:
  generate_had_sbp_proportion_2019:
    run: r:latest analysis/had_sbp_proportions_2019.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_had_sbp_2019.csv


  generate_had_sbp_proportion_2020:
    run: r:latest analysis/had_sbp_proportions_2020.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_had_sbp_2020.csv


  generate_had_sbp_proportion_2021:
    run: r:latest analysis/had_sbp_proportions_2021.R --output-dir=output/data 
    needs: [generate_complete_median_BMI_data]
    outputs:
      moderately_sensitive:
        table1: output/data/proportion_had_sbp_2021.csv


############################################################
## HbA1c
##############################################################

  generate_hba1c_2018_2019_data:
    run: r:latest analysis/hba1c_2018_2019.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/hba1c_2019_summary.feather
        cohort2: output/data/hba1c_2018_summary.feather
        cohort3: output/data/hba1c_2019_long.feather
        cohort4: output/data/hba1c_2018_long.feather
        cohort5: output/data/precovid_hba1c_control.feather
       
  generate_hba1c_2020_2021_data:
    run: r:latest analysis/hba1c_2020_2021.R --output-dir=output/data --output-format feather
    needs: [join_meds]
    outputs:
      highly_sensitive:
        cohort1: output/data/hba1c_2020_summary.feather
        cohort2: output/data/hba1c_2021_summary.feather
        cohort3: output/data/hba1c_2021_long.feather
        cohort4: output/data/hba1c_2020_long.feather




  T2DM_hba1c_proportions_2019:
    run: r:latest analysis/hba1c_had_proportion_2019_t2dm.R --output-dir=output/data --output-format feather
    needs: [generate_hba1c_2018_2019_data]
    outputs:
      moderately_sensitive:
        table1: output/data/T2DM_proportion_hba1c_2019.csv

  T2DM_hba1c_proportions_2020:
    run: r:latest analysis/hba1c_had_proportion_2020_t2dm.R --output-dir=output/data --output-format feather
    needs: [generate_hba1c_2018_2019_data, generate_hba1c_2020_2021_data]
    outputs:
      moderately_sensitive:
        table1: output/data/T2DM_proportion_hba1c_2020.csv

  T2DM_hba1c_proportions_2021:
    run: r:latest analysis/hba1c_had_proportion_2021_t2dm.R --output-dir=output/data --output-format feather
    needs: [generate_hba1c_2018_2019_data, generate_hba1c_2020_2021_data]
    outputs:
      moderately_sensitive:
        table1: output/data/T2DM_proportion_hba1c_2021.csv


  check_hba1c:
    run: r:latest analysis/check_hba1c.R --output-dir=output/data --output-format feather
    needs: [generate_study_population_1]
    outputs:
      moderately_sensitive:
        table1: output/data/check_hba1c.csv
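The `needs` entries in the pipeline above form a dependency graph. The sketch below transcribes only the edges that lie upstream of the requested action, `complete_bmi_trajectory_data`, and walks them to list every action it depends on. The function name `upstream` and the dictionary layout are illustrative assumptions, not how the job runner itself resolves dependencies.

```python
YEARS = range(2015, 2022)

# `needs` edges transcribed from the project.yaml on this page. Actions with
# no `needs` key (the cohort extractions) are simply absent from the dict.
needs = {
    "complete_bmi_trajectory_data": ["generate_BMI_trajectories_data3"],
    "generate_BMI_trajectories_data3": [
        "generate_BMI_trajectories_data2",
        "generate_BMI_trajectories_data1",
    ],
    "generate_BMI_trajectories_data2": ["generate_BMI_trajectories_data1"],
    "generate_BMI_trajectories_data1": ["generate_complete_long_BMI_data"],
    "generate_complete_long_BMI_data": [f"generate_BMI_{y}_data" for y in YEARS],
    "join_meds": ["join_ethnicity"]
    + [f"generate_study_population_dm{i}" for i in range(1, 8)],
    "join_ethnicity": [f"generate_study_population_{i}" for i in range(1, 8)]
    + ["generate_study_population_ethnicity"],
}
for y in YEARS:
    needs[f"generate_BMI_{y}_data"] = ["join_meds"]


def upstream(action, graph):
    """Return every action reachable from `action` through `needs` edges."""
    seen = set()
    stack = [action]
    while stack:
        for dep in graph.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


if __name__ == "__main__":
    deps = upstream("complete_bmi_trajectory_data", needs)
    for name in sorted(deps):
        print(name)
```

The Jobs table on this page lists only the requested action itself, which is consistent with the upstream outputs already being available and with Force run dependencies being False.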

State

State is inferred from the related Jobs.

Status: Succeeded

Timings

Timings set to UTC timezone.

  • Created:
  • Started:
  • Finished:
  • Runtime: 04:28:45

Config

  • Backend: TPP
  • Workspace: bmi_and_hba1c
  • Branch: main
  • Creator: Miriam-S-git
  • Force run dependencies: False
  • Git Commit Hash: de2cde8
  • Requested actions:
    • complete_bmi_trajectory_data