Job request: 19447
- Organisation: The London School of Hygiene & Tropical Medicine
- Workspace: healthcare_utilisation_openprompt
- ID: 7xpicbxdnm3o7hnd
This page shows the technical details of what happened when the authorised researcher Liang-Yu Lin requested one or more actions to be run against real patient data in the project, within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
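The cross-referencing described above can be sketched in a few lines of Python. This is an illustrative sketch only: the `OUTPUTS` dictionary is a hand-copied subset of the pipeline section below, and the helper names `security_level` and `reviewable_outputs` are hypothetical, not part of any OpenSAFELY tool.

```python
# Hand-copied subset of the `outputs` blocks from the pipeline section below.
# Structure: action name -> security level -> output name -> file path.
OUTPUTS = {
    "test_matching": {
        "highly_sensitive": {
            "matched_cases": "output/matched_cases_stp.csv",
            "matched_matches": "output/matched_matches_stp.csv",
            "matched_all": "output/matched_combined_stp.csv",
        },
        "moderately_sensitive": {
            "matching_report": "output/matching_report_stp.txt",
        },
    },
    "report_03_model_02_clustered_analysis": {
        "moderately_sensitive": {
            "gee_outputs": "output/st03_model_02_gee_models.csv",
        },
    },
}


def security_level(path, outputs):
    """Return the security level a file path is declared under, or None."""
    for levels in outputs.values():
        for level, files in levels.items():
            if path in files.values():
                return level
    return None


def reviewable_outputs(action, outputs):
    """List the paths of an action that an approved researcher may view.

    Only moderately_sensitive outputs qualify; highly_sensitive outputs
    can never be viewed directly.
    """
    return sorted(outputs[action].get("moderately_sensitive", {}).values())


LEVEL = security_level("output/matched_cases_stp.csv", OUTPUTS)
REVIEWABLE = reviewable_outputs("report_03_model_02_clustered_analysis", OUTPUTS)
```

Here `LEVEL` is `"highly_sensitive"`, so the matched cases file can only be used as input to further code, while `REVIEWABLE` contains the single GEE results file that could be put forward for output review.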
Jobs
- Job identifier: 4lrw5guzlsjs65x7
Pipeline
project.yaml
version: '3.0'

expectations:
  population_size: 5000

actions:
  generate_long_covid_exposure_dataset:
    run:
      databuilder:v0 generate-dataset
      analysis/dataset_definition_unmatched_exp_lc.py
      --output output/dataset_exp_lc_unmatched.csv
    outputs:
      highly_sensitive:
        cohort: output/dataset_exp_lc_unmatched.csv

  generate_list_gp_use_long_covid_dx:
    run:
      databuilder:v0 generate-dataset
      analysis/dataset_definition_lc_gp_list.py
      --output output/dataset_lc_gp_list.csv
    outputs:
      highly_sensitive:
        cohort: output/dataset_lc_gp_list.csv

  generate_dataset_comparator_exclude_gp_no_long_covid:
    needs: [generate_list_gp_use_long_covid_dx]
    run:
      databuilder:v0 generate-dataset
      analysis/dataset_definition_unmatched_comparator.py
      --output output/dataset_comparator_unmatched.csv
    outputs:
      highly_sensitive:
        cohort: output/dataset_comparator_unmatched.csv

  test_matching:
    run:
      python:latest python analysis/match_test.py
    needs: [generate_dataset_comparator_exclude_gp_no_long_covid, generate_long_covid_exposure_dataset]
    outputs:
      highly_sensitive:
        matched_cases: output/matched_cases_stp.csv
        matched_matches: output/matched_matches_stp.csv
        matched_all: output/matched_combined_stp.csv
      moderately_sensitive:
        matching_report: output/matching_report_stp.txt

  import_matched_exposure:
    run: >
      databuilder:v0
      generate-dataset analysis/dataset_definition_matched_cases.py
      --output output/matched_cases_with_ehr.csv
    needs: [test_matching]
    outputs:
      highly_sensitive:
        cohort: output/matched_cases_with_ehr.csv

  import_matched_exposure_drug_cost:
    run: >
      databuilder:v0
      generate-dataset analysis/dataset_definition_matched_cases_drug_costs.py
      --output output/matched_cases_with_drug_costs.csv
    needs: [test_matching]
    outputs:
      highly_sensitive:
        cohort: output/matched_cases_with_drug_costs.csv

  import_matched_controls:
    run: >
      databuilder:v0
      generate-dataset analysis/dataset_definition_matched_control.py
      --output output/matched_control_with_ehr.csv
    needs: [test_matching]
    outputs:
      highly_sensitive:
        cohort: output/matched_control_with_ehr.csv

  import_matched_controls_drug_costs:
    run: >
      databuilder:v0
      generate-dataset analysis/dataset_definition_matched_control_drug_costs.py
      --output output/matched_control_with_drug_costs.csv
    needs: [test_matching]
    outputs:
      highly_sensitive:
        cohort: output/matched_control_with_drug_costs.csv

  # Historical comparison:
  generate_historical_exp_data:
    run:
      databuilder:v0 generate-dataset analysis/dataset_definition_hx_unmatched_exp_lc.py
      --output output/hx_unmatched_exp.csv
    outputs:
      highly_sensitive:
        hx_cohort: output/hx_unmatched_exp.csv

  generate_historical_comp_data_exclude_gp_no_long_covid:
    needs: [generate_list_gp_use_long_covid_dx]
    run:
      databuilder:v0 generate-dataset analysis/dataset_definition_hx_unmatched_com_no_lc.py
      --output output/hx_dataset_comp_unmatched.csv
    outputs:
      highly_sensitive:
        hx_cohort: output/hx_dataset_comp_unmatched.csv

  historical_matching:
    run:
      python:latest python analysis/match_historical.py
    needs: [generate_historical_exp_data, generate_historical_comp_data_exclude_gp_no_long_covid]
    outputs:
      highly_sensitive:
        matched_cases: output/matched_cases_historical.csv
        matched_matches: output/matched_matches_historical.csv
        matched_all: output/matched_combined_historical.csv
      moderately_sensitive:
        matching_report: output/matching_report_historical.txt

  import_matched_historical_exposure:
    run: >
      databuilder:v0
      generate-dataset analysis/dataset_definition_hx_matched_exp_lc.py
      --output output/hx_matched_cases_with_ehr.csv
    needs: [historical_matching]
    outputs:
      highly_sensitive:
        cohort: output/hx_matched_cases_with_ehr.csv

  import_matched_historical_controls:
    run: >
      databuilder:v0
      generate-dataset analysis/dataset_definition_hx_matched_comp.py
      --output output/hx_matched_control_with_ehr.csv
    needs: [historical_matching]
    outputs:
      highly_sensitive:
        cohort: output/hx_matched_control_with_ehr.csv

  # Reporting:
  report01_matched_datasets:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st01_report_matched.R
    outputs:
      moderately_sensitive:
        matched_table: output/st01_matched_numbers_table.csv
        explore_vax_fig: output/st1_exporing_vax_index_date.png
        missing_table: output/missing_distribution_table.csv
        missing_pattern: output/missing_pattern_current.png

  report01_5_explore_outcomes:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st01_5_report_data_exploration.R
    outputs:
      moderately_sensitive:
        zero_percent_fig: output/st1_5_explore_zero_percentage.png
        monthly_oucome_tb: output/st1_5_monthly_outcome_distribution.csv
        crude_rate_tb: output/st1_5_crude_monthly_rate.csv
        visit_explore: output/st1_5_cat_visits_summary.csv

  report02_hx_matched_datasets:
    needs: [import_matched_historical_exposure, import_matched_historical_controls]
    run:
      r:latest analysis/st02_report_matched_historical.R
    outputs:
      moderately_sensitive:
        matched_table: output/st02_hx_matched_numbers_table.csv

  report_03_model_01_poisson:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_model_01_poisson.R
    outputs:
      moderately_sensitive:
        model_01_poisson: output/st03_model_01_poisson_and_nb.csv

  report_03_model_02_clustered_analysis:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_model_02_cluster_analysis.R
    outputs:
      moderately_sensitive:
        # model_compares_poisson: output/st03_model_02_rm_models.csv
        gee_outputs: output/st03_model_02_gee_models.csv

  report_03_hurdle_model:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_hurdle_model_rate_ratio.R
    outputs:
      moderately_sensitive:
        model_selection: output/sup_st03_0_model_comparison.csv
        hurdle_all: output/st_03_result_monthly_visit_hurdle.csv

  report_03_1_hurdle_model_predict:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_hurdle_model_predict.R
    outputs:
      moderately_sensitive:
        hurdle_all: output/st_03_result_cumulative_visit_hurdle.csv
        hurdle_gp: output/st_03_gp_result_cumulative_visit_hurdle.csv
        hurdle_hos: output/st_03_hos_result_cumulative_visit_hurdle.csv
        hurdle_ae: output/st_03_ae_result_cumulative_visit_hurdle.csv

  report_04_hurdle_model_plot:
    needs: [report_03_1_hurdle_model_predict]
    run:
      r:latest analysis/st04_plot_hurdle_visit.R
    outputs:
      moderately_sensitive:
        crude: output/st_04_crude_healthcare_visit.png
        partial: output/st_04_partial_adj_healthcare_visit.png
        full: output/st_04_full_adj_healthcare_visit.png

  report_03_2_hurdle_model_sugroup_predict:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_sub_hurdle_model_predict.R
    outputs:
      moderately_sensitive:
        subgroup_hos_visit: "output/st_03_subgroup_result_hos_cumulative_visit_hurdle.csv"

  report_03_3_twopart_model_predict:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_two_part_model.R
    outputs:
      moderately_sensitive:
        total_costs: "output/st_04_result_cumulative_cost_full_2pm.csv"

  report_03_4_twopart_model_subgroup_predict:
    needs: [import_matched_exposure, import_matched_controls]
    run:
      r:latest analysis/st03_twopart_sub_model.R
    outputs:
      moderately_sensitive:
        sub_costs: "output/st_04_result_sub_cumulative_cost_2pm.csv"

  report_04_plot_twopart_costs:
    needs: [report_03_3_twopart_model_predict]
    run:
      r:latest analysis/st04_fig_plot_twopm_cost.R
    outputs:
      moderately_sensitive:
        plot_total_costs: "output/st_fig_04_cumulative_costs.png"

  report_05_historical_did_model:
    needs: [import_matched_historical_exposure, import_matched_historical_controls]
    run:
      r:latest analysis/st05_did_model.R
    outputs:
      highly_sensitive:
        fitted_did: "output/predicted_did_counts.csv.gz"
      moderately_sensitive:
        model_dispersion: "output/sup_st01_model_compare.csv"
        stats_output: "output/st05_did_stats.csv"
        summarised_did_predicted: "output/st05_summarised_did_predicted_results.csv"
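
The `needs` entries in the pipeline above form a dependency graph. As a rough illustration, the Python sketch below walks that graph for the action requested in this job, `report_03_model_02_clustered_analysis`, and lists everything it depends on, dependencies first. The `NEEDS` dictionary is a hand-copied subset of the pipeline and `execution_order` is a hypothetical helper; the sketch does not model how the job runner actually schedules work or decides which dependencies to re-run.

```python
# Subset of the `needs` relationships copied from the pipeline above.
# Actions without a `needs` key are given an empty list.
NEEDS = {
    "generate_long_covid_exposure_dataset": [],
    "generate_list_gp_use_long_covid_dx": [],
    "generate_dataset_comparator_exclude_gp_no_long_covid": [
        "generate_list_gp_use_long_covid_dx",
    ],
    "test_matching": [
        "generate_dataset_comparator_exclude_gp_no_long_covid",
        "generate_long_covid_exposure_dataset",
    ],
    "import_matched_exposure": ["test_matching"],
    "import_matched_controls": ["test_matching"],
    "report_03_model_02_clustered_analysis": [
        "import_matched_exposure",
        "import_matched_controls",
    ],
}


def execution_order(action, needs):
    """Return `action` and its transitive dependencies, dependencies first.

    Uses a depth-first traversal; each action appears exactly once.
    """
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in needs[name]:
            visit(dep)
        order.append(name)  # appended only after all of its dependencies

    visit(action)
    return order


ORDER = execution_order("report_03_model_02_clustered_analysis", NEEDS)
for step in ORDER:
    print(step)
```

The requested action comes last in `ORDER`, preceded by the six upstream actions whose outputs it relies on, directly or indirectly.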
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:09:39
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job information
- Status: Failed
- Backend: TPP
- Workspace: healthcare_utilisation_openprompt
- Requested by: Liang-Yu Lin
- Branch: main
- Force run dependencies: No
- Git commit hash: 0293ae9
- Requested actions:
  - report_03_model_02_clustered_analysis