Job request: 4256
- Organisation: Bennett Institute
- Workspace: long-covid-sick-notes
- ID: r6fpiajsqsswcmf7
This page shows the technical details of what happened when the authorised researcher Robin Park requested one or more actions to be run against real patient data in the project, within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level various outputs were written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
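The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration, not part of the project or of any OpenSAFELY tooling: the `ACTIONS` dictionary is a hand-copied excerpt of three actions from the pipeline on this page, and the helper names `output_levels` and `releasable` are hypothetical.

```python
# Hand-copied excerpt of the "actions" mapping from the pipeline on this page.
# Only three actions are included to keep the sketch short.
ACTIONS = {
    "generate_study_population_covid_2020": {
        "outputs": {
            "highly_sensitive": {"cohort": "output/cohorts/input_covid_2020.csv"},
        },
    },
    "append_cohorts": {
        "outputs": {
            "moderately_sensitive": {"log": "output/cohorts/append_cohorts.txt"},
            "highly_sensitive": {
                "dataset": "output/cohorts/combined_covid_pneumonia.dta",
                "dataset2": "output/cohorts/combined_covid_general_2019.dta",
                "dataset3": "output/cohorts/combined_covid_general_2020.dta",
            },
        },
    },
    "rates_over_time": {
        "outputs": {
            "moderately_sensitive": {"notebook": "output/rates_over_time.html"},
        },
    },
}


def output_levels(actions, action_name):
    """Map each output path of one action to its declared security level."""
    outputs = actions[action_name].get("outputs", {})
    return {
        path: level
        for level, named_paths in outputs.items()
        for path in named_paths.values()
    }


def releasable(actions, action_name):
    """List the outputs that could be put forward for output review.

    Per the text above, only moderately_sensitive outputs qualify.
    """
    levels = output_levels(actions, action_name)
    return sorted(
        path for path, level in levels.items() if level == "moderately_sensitive"
    )
```

For the action requested on this page, `releasable(ACTIONS, "rates_over_time")` yields the single notebook path, while the same call for a cohort-extraction action yields an empty list because its only output is highly_sensitive.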
Jobs
- Job identifier: mosvwxmf2sxaub6b
Pipeline
project.yaml
version: '3.0'
expectations:
  population_size: 10000
actions:
  generate_study_population_covid_2020:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_covid_2020 --output-dir=output/cohorts
    outputs:
      highly_sensitive:
        cohort: output/cohorts/input_covid_2020.csv
  generate_study_population_general_2019:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_general_2019 --output-dir=output/cohorts
    outputs:
      highly_sensitive:
        cohort: output/cohorts/input_general_2019.csv
  generate_study_population_general_2020:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_general_2020 --output-dir=output/cohorts
    outputs:
      highly_sensitive:
        cohort: output/cohorts/input_general_2020.csv
  generate_study_population_pneumonia_2019:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_pneumonia_2019 --output-dir=output/cohorts
    outputs:
      highly_sensitive:
        cohort: output/cohorts/input_pneumonia_2019.csv
  reconcile_sick_note_spells_covid_2020:
    run: python:latest python analysis/reconcile_sick_note_spells.py "_covid_2020"
    needs: [generate_study_population_covid_2020]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/cohorts/input_covid_2020_with_duration.csv
  reconcile_sick_note_spells_general_2019:
    run: python:latest python analysis/reconcile_sick_note_spells.py "_general_2019"
    needs: [generate_study_population_general_2019]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/cohorts/input_general_2019_with_duration.csv
  reconcile_sick_note_spells_general_2020:
    run: python:latest python analysis/reconcile_sick_note_spells.py "_general_2020"
    needs: [generate_study_population_general_2020]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/cohorts/input_general_2020_with_duration.csv
  reconcile_sick_note_spells_pneumonia_2019:
    run: python:latest python analysis/reconcile_sick_note_spells.py "_pneumonia_2019"
    needs: [generate_study_population_pneumonia_2019]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/cohorts/input_pneumonia_2019_with_duration.csv
  # matching_2019:
  #   run: python:latest python analysis/match_running.py "input_general_2019" "_2019" "2019-02-01" --output-dir=output/cohorts
  #   needs: [generate_study_population_covid_2020, generate_study_population_general_2019]
  #   outputs:
  #     moderately_sensitive:
  #       matching_report: output/cohorts/matching_report_2019.txt
  #     highly_sensitive:
  #       matched_cohort: output/cohorts/matched_matches_2019.csv
  # matching_2020:
  #   run: python:latest python analysis/match_running.py "input_general_2020" "_2020" "2020-02-01" --output-dir=output/cohorts
  #   needs: [generate_study_population_covid_2020, generate_study_population_general_2020]
  #   outputs:
  #     moderately_sensitive:
  #       matching_report: output/cohorts/matching_report_2020.txt
  #     highly_sensitive:
  #       matched_cohort: output/cohorts/matched_matches_2020.csv
  covid_2020_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "covid_2020" --output-dir=output/cohorts
    needs: [generate_study_population_covid_2020]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohorts/cohort_rates_covid_2020.dta
  general_2019_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "general_2019" --output-dir=output/cohorts
    needs: [generate_study_population_general_2019]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohorts/cohort_rates_general_2019.dta
  general_2020_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "general_2020" --output-dir=output/cohorts
    needs: [generate_study_population_general_2020]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohorts/cohort_rates_general_2020.dta
  pneumonia_2019_rates_cohort:
    run: stata-mp:latest analysis/000_cr_define_covariates_simple_rates.do "pneumonia_2019" --output-dir=output/cohorts
    needs: [generate_study_population_pneumonia_2019]
    outputs:
      highly_sensitive:
        analysis_dataset: output/cohorts/cohort_rates_pneumonia_2019.dta
  covid_2020_rates:
    run: stata-mp:latest analysis/100_cr_simple_rates.do "covid_2020" --output-dir=output/tabfig
    needs: [covid_2020_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_covid_2020.csv
  general_2019_rates:
    run: stata-mp:latest analysis/100_cr_simple_rates.do "general_2019" --output-dir=output/tabfig
    needs: [general_2019_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_general_2019.csv
  general_2020_rates:
    run: stata-mp:latest analysis/100_cr_simple_rates.do "general_2020" --output-dir=output/tabfig
    needs: [general_2020_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_general_2020.csv
  pneumonia_2019_rates:
    run: stata-mp:latest analysis/100_cr_simple_rates.do "pneumonia_2019" --output-dir=output/tabfig
    needs: [pneumonia_2019_rates_cohort]
    outputs:
      moderately_sensitive:
        rates: output/tabfig/rates_summary_pneumonia_2019.csv
  append_cohorts:
    run: stata-mp:latest analysis/200_cr_data_management_matching.do --output-dir=output/cohorts
    needs: [covid_2020_rates_cohort, pneumonia_2019_rates_cohort, general_2020_rates_cohort, general_2019_rates_cohort]
    outputs:
      moderately_sensitive:
        log: output/cohorts/append_cohorts.txt
      highly_sensitive:
        dataset: output/cohorts/combined_covid_pneumonia.dta
        dataset2: output/cohorts/combined_covid_general_2019.dta
        dataset3: output/cohorts/combined_covid_general_2020.dta
  cox_models:
    run: stata-mp:latest analysis/201_cox_models.do
    needs: [append_cohorts]
    outputs:
      moderately_sensitive:
        log: output/cohorts/cox_models.txt
        dataset: output/tabfig/cox_model_summary.csv
  describe_duration:
    run: jupyter:latest jupyter nbconvert /workspace/notebooks/describe_duration.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [reconcile_sick_note_spells_covid_2020, reconcile_sick_note_spells_general_2019, reconcile_sick_note_spells_general_2020, reconcile_sick_note_spells_pneumonia_2019]
    outputs:
      moderately_sensitive:
        notebook: output/describe_duration.html
        table: output/tabfig/med_iqr_overall.csv
        table2: output/tabfig/med_iqr_age_group.csv
        table3: output/tabfig/med_iqr_sex.csv
        table4: output/tabfig/med_iqr_ethnicity.csv
        table5: output/tabfig/med_iqr_imd.csv
        table6: output/tabfig/med_iqr_region.csv
  rates_over_time:
    run: jupyter:latest jupyter nbconvert /workspace/notebooks/rates_over_time.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [generate_study_population_covid_2020, generate_study_population_general_2019, generate_study_population_general_2020, generate_study_population_pneumonia_2019]
    outputs:
      moderately_sensitive:
        notebook: output/rates_over_time.html
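The needs entries in the pipeline above form a dependency graph between actions. As a rough illustration of how that graph can be read, the following Python sketch walks the needs lists to produce an order in which an action and everything it depends on could be run. It is not a description of how the job runner actually schedules work: `NEEDS` is a hand-copied subset of the pipeline, and `run_order` is a hypothetical helper.

```python
# Hand-copied subset of the "needs" relationships from the pipeline above.
# Actions without a "needs" entry depend on nothing.
NEEDS = {
    "generate_study_population_covid_2020": [],
    "generate_study_population_general_2019": [],
    "generate_study_population_general_2020": [],
    "generate_study_population_pneumonia_2019": [],
    "covid_2020_rates_cohort": ["generate_study_population_covid_2020"],
    "covid_2020_rates": ["covid_2020_rates_cohort"],
    "rates_over_time": [
        "generate_study_population_covid_2020",
        "generate_study_population_general_2019",
        "generate_study_population_general_2020",
        "generate_study_population_pneumonia_2019",
    ],
}


def run_order(needs, target):
    """Return target and its transitive dependencies, dependencies first.

    Uses a depth-first walk; assumes the graph has no cycles.
    """
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dependency in needs.get(name, []):
            visit(dependency)
        # An action is appended only after everything it needs.
        order.append(name)

    visit(target)
    return order
```

Calling `run_order(NEEDS, "rates_over_time")` lists the four cohort-extraction actions followed by `rates_over_time` itself. Whether those dependencies are actually re-run for a given request is a separate question that this sketch does not model.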
Timeline
- Created:
- Finished:
- Runtime:
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job information
- Status: Succeeded
- Backend: TPP
- Workspace: long-covid-sick-notes
- Requested by: Robin Park
- Branch: master
- Force run dependencies: No
- Git commit hash: 09eb943
- Requested actions: rates_over_time