Job request: 22354
- Organisation:
 - The London School of Hygiene & Tropical Medicine
 - Workspace:
 - openprompt-hrqol
 - ID:
 - gjzzlmrjq476ybep
 
This page shows the technical details of what happened when the authorised researcher Oliver Carlile requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.
The output security levels are:
- 
                highly_sensitive
                
- Researchers can never directly view these outputs
 - Researchers can only request code is run against them
 
 - 
                moderately_sensitive
                
- Can be viewed by an approved researcher by logging into a highly secure environment
 - These are the only outputs that can be requested for public release via a controlled output review service.
 
 
Jobs
- 
                
- Job identifier:
 - 
                    
                    
wq24u655em27xhye - Error:
 - cancelled_by_user: Cancelled by user
 
 
Pipeline
Show project.yaml
version: '3.0'
expectations:
 population_size: 10000
actions:
  create_dummy_data: 
    run: >
      ehrql:v0
        create-dummy-tables 
        analysis/model_questions/dataset_definition.py output/dummydata 
        -- 
        --day=0
    outputs: 
      highly_sensitive:
        openprompt_dummy: output/dummydata/open_prompt.csv
  edit_dummy_data:
    run: > 
      r:latest
        analysis/dummy_data_editing/edit_automatic_dummy_data.R
    needs: [create_dummy_data]
    outputs: 
      highly_sensitive: 
        openprompt_dummy_edited: output/dummydata/dummy_edited/open_prompt.csv
  generate_openprompt_survey1: 
    run: >
      ehrql:v0
        generate-dataset 
        analysis/model_questions/dataset_definition.py 
        --output output/openprompt_survey1.csv
        --dummy-tables output/dummydata/dummy_edited
        --
        --day=0
        --window=5
    needs: [edit_dummy_data]
    outputs:
      highly_sensitive:
        openprompt_survey1: output/openprompt_survey1.csv
  generate_openprompt_survey2: 
    run: >
      ehrql:v0
        generate-dataset 
        analysis/model_questions/dataset_definition.py 
        --output output/openprompt_survey2.csv
        --dummy-tables output/dummydata/dummy_edited
        --
        --day=30
        --window=5
    needs: [edit_dummy_data]
    outputs:
      highly_sensitive:
        openprompt_survey2: output/openprompt_survey2.csv
  generate_openprompt_survey3: 
    run: >
      ehrql:v0
        generate-dataset 
        analysis/model_questions/dataset_definition.py 
        --output output/openprompt_survey3.csv
        --dummy-tables output/dummydata/dummy_edited
        --
        --day=60
        --window=5
    needs: [edit_dummy_data]
    outputs:
      highly_sensitive:
        openprompt_survey3: output/openprompt_survey3.csv
  generate_openprompt_survey4: 
    run: >
      ehrql:v0
        generate-dataset 
        analysis/model_questions/dataset_definition.py 
        --output output/openprompt_survey4.csv
        --dummy-tables output/dummydata/dummy_edited
        --
        --day=90
        --window=5
    needs: [edit_dummy_data]
    outputs:
      highly_sensitive:
        openprompt_survey4: output/openprompt_survey4.csv
  combine_openprompt:
    run: >
      r:latest analysis/001_datacombine.R
    needs: [generate_openprompt_survey1, generate_openprompt_survey2, generate_openprompt_survey3, generate_openprompt_survey4]
    outputs: 
      highly_sensitive: 
        openprompt_combined: output/openprompt_raw.csv.gz
        openprompt_combined_stata: output/op_stata.dta
      moderately_sensitive:
        openprompt_raw_skim: output/data_properties/op_raw_skim.txt
        openprompt_raw_tab: output/data_properties/op_raw_tabulate.txt
        openprompt_mapped_skim: output/data_properties/op_mapped_skim.txt
        openprompt_mapped_tab: output/data_properties/op_mapped_tabulate.txt
        check_days_after_baseline: output/data_properties/sample_day_lags.pdf
        indexdates: output/data_properties/index_dates.pdf
        table1: output/tab1_baseline_description.html
        raw_summ_base_s: output/data_properties/op_baseline_skim.txt
        raw_summ_base_t: output/data_properties/op_baseline_tabulate.txt
        raw_summ_survey1_s: output/data_properties/op_survey1_skim.txt
        raw_summ_survey1_t: output/data_properties/op_survey1_tabulate.txt
        raw_summ_survey2_s: output/data_properties/op_survey2_skim.txt
        raw_summ_survey2_t: output/data_properties/op_survey2_tabulate.txt
        raw_summ_survey3_s: output/data_properties/op_survey3_skim.txt
        raw_summ_survey3_t: output/data_properties/op_survey3_tabulate.txt
        raw_summ_survey4_s: output/data_properties/op_survey4_skim.txt
        raw_summ_survey4_t: output/data_properties/op_survey4_tabulate.txt
        survey_date_consistency: output/data_properties/survey_date_consistency.csv
        survey_date_consistency_summary: output/data_properties/survey_date_consistency_summary.csv
  generate_openprompt_dataset:
    run: >
      stata-mp:latest analysis/op_combined.do
    needs: [combine_openprompt]
    outputs:
      highly_sensitive:
        data: output/openprompt_dataset.dta
        log: output/open-prompt-combine.log
  extract_linked_tpp_info: 
    run: >
      ehrql:v0
        generate-dataset 
        analysis/add_tpp_data.py 
        --output output/openprompt_linked_tpp.csv.gz
    outputs:
      highly_sensitive:
        openprompt_survey1: output/openprompt_linked_tpp.csv.gz
  import_linked_tpp: 
    run:   
      r:latest analysis/002_import_linked_tpp.R
    needs: [extract_linked_tpp_info]
    outputs: 
      moderately_sensitive:
        openprompt_tpp_skim: output/data_properties/op_tpp_skim.txt
        openprompt_tpp_tab: output/data_properties/op_tpp_tabulate.txt
  combine_linked_tpp:
    run: > 
      stata-mp:latest analysis/op_tpp_linked.do
    needs: [generate_openprompt_dataset, extract_linked_tpp_info]
    outputs: 
      highly_sensitive:
        data: output/op_tpp_linked.dta
        logs: output/linked-tpp.log
  gen_baseline_tables:
    run: >
      stata-mp:latest analysis/op_table1.do
    needs: [combine_linked_tpp]
    outputs:
      moderately_sensitive:
        demographic_data: output/tables/table1_demographic.csv
        rounded_demograpics: output/tables/table1_demographic_rounded.csv
        questionnaire_data: output/tables/table1_questions.csv
        rounded_questions: output/tables/table1_questions_rounded.csv
        long_covid_dx: output/tables/long-covid-dx.csv
        rounded_dx: output/tables/long-covid-dx-rounded.csv
        utility_score: output/figures/baseline_EQ5D_utility.svg
        disutility_score: output/figures/baseline_EQ5D_disutility.svg
        question_responses: output/figures/baseline_EQ5D_responses.svg
        question_percents: output/figures/baseline_EQ5D_percentage.svg
        vas_ncovids: output/figures/VAS_by_covids.svg
        vas_nvaccines: output/figures/VAS_by_vaccines.svg
        facit_fscore: output/figures/facit_baseline.svg
        log_tables: output/op-baseline-table1.log
  twopart_models:
    run: >
      stata-mp:latest analysis/op_modelling.do
    needs: [combine_linked_tpp]
    outputs: 
      moderately_sensitive:
        demographic_indicators: output/figures/mixed_odds_ratio.svg
        demographic_coefficients: output/figures/mixed_coefs.svg
        socioeconomic_indicators: output/figures/socio_odds.svg
        socioeconomic_coefficients: output/figures/socio_coefs.svg
        baseline_regression: output/tables/twopart-model.csv
        baseline_regression_missing: output/tables/twopart-model-missing.csv
        longitudinal_models: output/tables/longit-model.csv
        attrition_models: output/tables/selective_attrition.csv
        log_regression: output/models.log
  mixed_models:
    run: >
      stata-mp:latest analysis/op_mixed_models.do
    needs: [combine_linked_tpp]
    outputs:
      moderately_sensitive:
        mixed_linear: output/tables/mixed-models.csv
        proms_mixed: output/tables/mixed-proms.csv
        proms_odds: output/figures/proms_odds.svg
        proms_demographic: output/figures/demo_odds.svg
        demographics_adjust_coefs: output/figures/demo_proms_coefs.svg
        proms_adjust_ceofs: output/figures/proms_coefs.svg
        glm_log: output/mixed-glm.log
  estimate_qalys:
    run: >
      stata-mp:latest analysis/op_qalys.do
    needs: [combine_linked_tpp]
    outputs:
      moderately_sensitive:
        utility_scores: output/tables/utility-scores.csv
        EQ5D_by_longcovd: output/figures/EQ5D_longcovid.svg
        EQ5D_by_surveys: output/figures/EQ5D_surveys.svg
        utility_by_survey: output/figures/utility_survey_response.svg
        selective_attrition: output/figures/EQ5D_surveys_att.svg
        QALY_by_age: output/figures/QALM_losses_age.svg
        qaly_log: output/qaly-estimates.log
  imputed_models:
    run: >
      stata-mp:latest analysis/op_imputed.do
    needs: [combine_linked_tpp]
    outputs:
      moderately_sensitive:
        mi_log: output/mi_models.log
        mi_socio_odds: output/figures/socio_miodds.svg
        mi_melogit: output/figures/mi_melogit.svg
        mixed_mi_demographic: output/figures/mixed_micoefs.svg
        mixed_mi_socio: output/figures/socio_micoefs.svg
        mi_models: output/tables/mi-model.csv
        mi_prom_odds: output/figures/demo_miodds.svg
        mi_prom_coefs: output/figures/proms_micoefs.svg
        mi_prom_models: output/tables/mi-proms.csv
Timeline
- 
  
    
  
  
Created:
 - 
  
    
  
  
Started:
 - 
  
    
  
  
Finished:
 - 
  
  
Runtime: 00:41:09
 
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job request
- Status
 - 
            Failed
 - Backend
 - TPP
 - Workspace
 - openprompt-hrqol
 - Requested by
 - Oliver Carlile
 - Branch
 - main
 - Force run dependencies
 - No
 - Git commit hash
 - 7eb97e1
 - Requested actions
 - 
            
- 
                  
imputed_models 
 - 
                  
 
Code comparison
Compare the code used in this job request