Job request: 22293
- Organisation:
- The London School of Hygiene & Tropical Medicine
- Workspace:
- openprompt-hrqol
- ID:
- 7g7g3mxuej7dtyiw
This page shows the technical details of what happened when the authorised researcher Oliver Carlile requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.
The output security levels are:
-
highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request code is run against them
-
moderately_sensitive
- Can be viewed by an approved researcher by logging into a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.
Jobs
-
- Job identifier:
-
4djrvavxv4xdwabr
Pipeline
Show project.yaml
version: '3.0'
expectations:
population_size: 10000
actions:
create_dummy_data:
run: >
ehrql:v0
create-dummy-tables
analysis/model_questions/dataset_definition.py output/dummydata
--
--day=0
outputs:
highly_sensitive:
openprompt_dummy: output/dummydata/open_prompt.csv
edit_dummy_data:
run: >
r:latest
analysis/dummy_data_editing/edit_automatic_dummy_data.R
needs: [create_dummy_data]
outputs:
highly_sensitive:
openprompt_dummy_edited: output/dummydata/dummy_edited/open_prompt.csv
generate_openprompt_survey1:
run: >
ehrql:v0
generate-dataset
analysis/model_questions/dataset_definition.py
--output output/openprompt_survey1.csv
--dummy-tables output/dummydata/dummy_edited
--
--day=0
--window=5
needs: [edit_dummy_data]
outputs:
highly_sensitive:
openprompt_survey1: output/openprompt_survey1.csv
generate_openprompt_survey2:
run: >
ehrql:v0
generate-dataset
analysis/model_questions/dataset_definition.py
--output output/openprompt_survey2.csv
--dummy-tables output/dummydata/dummy_edited
--
--day=30
--window=5
needs: [edit_dummy_data]
outputs:
highly_sensitive:
openprompt_survey2: output/openprompt_survey2.csv
generate_openprompt_survey3:
run: >
ehrql:v0
generate-dataset
analysis/model_questions/dataset_definition.py
--output output/openprompt_survey3.csv
--dummy-tables output/dummydata/dummy_edited
--
--day=60
--window=5
needs: [edit_dummy_data]
outputs:
highly_sensitive:
openprompt_survey3: output/openprompt_survey3.csv
generate_openprompt_survey4:
run: >
ehrql:v0
generate-dataset
analysis/model_questions/dataset_definition.py
--output output/openprompt_survey4.csv
--dummy-tables output/dummydata/dummy_edited
--
--day=90
--window=5
needs: [edit_dummy_data]
outputs:
highly_sensitive:
openprompt_survey4: output/openprompt_survey4.csv
combine_openprompt:
run: >
r:latest analysis/001_datacombine.R
needs: [generate_openprompt_survey1, generate_openprompt_survey2, generate_openprompt_survey3, generate_openprompt_survey4]
outputs:
highly_sensitive:
openprompt_combined: output/openprompt_raw.csv.gz
openprompt_combined_stata: output/op_stata.dta
moderately_sensitive:
openprompt_raw_skim: output/data_properties/op_raw_skim.txt
openprompt_raw_tab: output/data_properties/op_raw_tabulate.txt
openprompt_mapped_skim: output/data_properties/op_mapped_skim.txt
openprompt_mapped_tab: output/data_properties/op_mapped_tabulate.txt
check_days_after_baseline: output/data_properties/sample_day_lags.pdf
indexdates: output/data_properties/index_dates.pdf
table1: output/tab1_baseline_description.html
raw_summ_base_s: output/data_properties/op_baseline_skim.txt
raw_summ_base_t: output/data_properties/op_baseline_tabulate.txt
raw_summ_survey1_s: output/data_properties/op_survey1_skim.txt
raw_summ_survey1_t: output/data_properties/op_survey1_tabulate.txt
raw_summ_survey2_s: output/data_properties/op_survey2_skim.txt
raw_summ_survey2_t: output/data_properties/op_survey2_tabulate.txt
raw_summ_survey3_s: output/data_properties/op_survey3_skim.txt
raw_summ_survey3_t: output/data_properties/op_survey3_tabulate.txt
raw_summ_survey4_s: output/data_properties/op_survey4_skim.txt
raw_summ_survey4_t: output/data_properties/op_survey4_tabulate.txt
survey_date_consistency: output/data_properties/survey_date_consistency.csv
survey_date_consistency_summary: output/data_properties/survey_date_consistency_summary.csv
generate_openprompt_dataset:
run: >
stata-mp:latest analysis/op_combined.do
needs: [combine_openprompt]
outputs:
highly_sensitive:
data: output/openprompt_dataset.dta
log: output/open-prompt-combine.log
extract_linked_tpp_info:
run: >
ehrql:v0
generate-dataset
analysis/add_tpp_data.py
--output output/openprompt_linked_tpp.csv.gz
outputs:
highly_sensitive:
openprompt_survey1: output/openprompt_linked_tpp.csv.gz
import_linked_tpp:
run:
r:latest analysis/002_import_linked_tpp.R
needs: [extract_linked_tpp_info]
outputs:
moderately_sensitive:
openprompt_tpp_skim: output/data_properties/op_tpp_skim.txt
openprompt_tpp_tab: output/data_properties/op_tpp_tabulate.txt
combine_linked_tpp:
run: >
stata-mp:latest analysis/op_tpp_linked.do
needs: [generate_openprompt_dataset, extract_linked_tpp_info]
outputs:
highly_sensitive:
data: output/op_tpp_linked.dta
logs: output/linked-tpp.log
gen_baseline_tables:
run: >
stata-mp:latest analysis/op_table1.do
needs: [combine_linked_tpp]
outputs:
moderately_sensitive:
demographic_data: output/tables/table1_demographic.csv
rounded_demograpics: output/tables/table1_demographic_rounded.csv
questionnaire_data: output/tables/table1_questions.csv
rounded_questions: output/tables/table1_questions_rounded.csv
long_covid_dx: output/tables/long-covid-dx.csv
rounded_dx: output/tables/long-covid-dx-rounded.csv
utility_score: output/figures/baseline_EQ5D_utility.svg
disutility_score: output/figures/baseline_EQ5D_disutility.svg
question_responses: output/figures/baseline_EQ5D_responses.svg
question_percents: output/figures/baseline_EQ5D_percentage.svg
vas_ncovids: output/figures/VAS_by_covids.svg
vas_nvaccines: output/figures/VAS_by_vaccines.svg
facit_fscore: output/figures/facit_baseline.svg
log_tables: output/op-baseline-table1.log
twopart_models:
run: >
stata-mp:latest analysis/op_modelling.do
needs: [combine_linked_tpp]
outputs:
moderately_sensitive:
demographic_indicators: output/figures/mixed_odds_ratio.svg
demographic_coefficients: output/figures/mixed_coefs.svg
socioeconomic_indicators: output/figures/socio_odds.svg
socioeconomic_coefficients: output/figures/socio_coefs.svg
baseline_regression: output/tables/twopart-model.csv
baseline_regression_missing: output/tables/twopart-model-missing.csv
longitudinal_models: output/tables/longit-model.csv
attrition_models: output/tables/selective_attrition.csv
log_regression: output/models.log
mixed_models:
run: >
stata-mp:latest analysis/op_mixed_models.do
needs: [combine_linked_tpp]
outputs:
moderately_sensitive:
mixed_linear: output/tables/mixed-models.csv
proms_mixed: output/tables/mixed-proms.csv
proms_odds: output/figures/proms_odds.svg
proms_demographic: output/figures/demo_odds.svg
demographics_adjust_coefs: output/figures/demo_proms_coefs.svg
proms_adjust_ceofs: output/figures/proms_coefs.svg
glm_log: output/mixed-glm.log
estimate_qalys:
run: >
stata-mp:latest analysis/op_qalys.do
needs: [combine_linked_tpp]
outputs:
moderately_sensitive:
utility_scores: output/tables/utility-scores.csv
EQ5D_by_longcovd: output/figures/EQ5D_longcovid.svg
EQ5D_by_surveys: output/figures/EQ5D_surveys.svg
utility_by_survey: output/figures/utility_survey_response.svg
selective_attrition: output/figures/EQ5D_surveys_att.svg
QALY_by_age: output/figures/QALM_losses_age.svg
qaly_log: output/qaly-estimates.log
imputed_models:
run: >
stata-mp:latest analysis/op_imputed.do
needs: [combine_linked_tpp]
outputs:
moderately_sensitive:
mi_log: output/mi_models.log
Timeline
-
Created:
-
Started:
-
Finished:
-
Runtime: 01:44:24
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job request
- Status
-
Succeeded
- Backend
- TPP
- Workspace
- openprompt-hrqol
- Requested by
- Oliver Carlile
- Branch
- main
- Force run dependencies
- No
- Git commit hash
- 67b38e4
- Requested actions
-
-
imputed_models
-
Code comparison
Compare the code used in this job request