Job request: 10114
- Organisation: Bennett Institute
- Workspace: antidepressant-prescribing-lda
- ID: jkk3kerk6nhenldq
This page shows the technical details of what happened when the authorised researcher Christine Cunningham requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs
  - Researchers can only request that code be run against them
- moderately_sensitive
  - Can be viewed by an approved researcher by logging into a highly secure environment
  - These are the only outputs that can be requested for public release via a controlled output review service
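The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: `ACTIONS` is a hand-copied subset of the `actions` mapping from the project.yaml on this page (written as a plain dict so no YAML parser is needed), `output_levels` is a hypothetical helper, and the action name is the one listed under "Requested actions".

```python
# Hand-copied excerpt of the `actions` mapping from the project.yaml on this
# page. Only two actions are included, purely for illustration.
ACTIONS = {
    "generate_study_population_dep003": {
        "outputs": {
            "highly_sensitive": {"cohort": "output/qof/input_dep003_*.csv.gz"},
        },
    },
    "generate_measures_dep003": {
        "outputs": {
            "moderately_sensitive": {
                "measure_csv": "output/qof/joined/measure_dep003_*_rate.csv",
            },
        },
    },
}


def output_levels(actions, action_name):
    """Return {security_level: [output patterns]} for one action (hypothetical helper)."""
    outputs = actions[action_name]["outputs"]
    return {level: sorted(files.values()) for level, files in outputs.items()}


print(output_levels(ACTIONS, "generate_measures_dep003"))
# {'moderately_sensitive': ['output/qof/joined/measure_dep003_*_rate.csv']}
```

Read this way, the requested action writes only moderately_sensitive outputs, while the cohort it ultimately depends on is highly_sensitive.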
Jobs
- Job identifier: fokwps4opvheqoxu
Pipeline
project.yaml
version: '3.0'

expectations:
  population_size: 10000

actions:

  ####################
  # Cohort Generation
  ####################

  # Since this runs on everyone, we can reuse for both studies
  generate_study_population_ethnicity:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ethnicity --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_ethnicity.csv.gz

  # Generate depression register by month
  generate_study_population_register:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_register --index-date-range "2019-03-01 to 2022-04-01 by month" --output-format=csv.gz --output-dir output/qof
    outputs:
      highly_sensitive:
        cohort: output/qof/input_register_*.csv.gz

  # Generate dep003 by month
  generate_study_population_dep003:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dep003 --index-date-range "2019-03-01 to 2022-04-01 by month" --output-format=csv.gz --output-dir output/qof
    outputs:
      highly_sensitive:
        cohort: output/qof/input_dep003_*.csv.gz

  # Generate prescription variables by month
  generate_study_population_lda:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_lda --index-date-range "2019-03-01 to 2022-04-01 by month" --output-format=csv.gz --output-dir output/lda
    outputs:
      highly_sensitive:
        cohort: output/lda/input_lda_*.csv.gz

  # Generate dataset report
  generate_dataset_report:
    run: >
      dataset-report:v0.0.19
        --input-files output/qof/input_*.csv.gz
        --output-dir output/qof/
    needs: [generate_study_population_register, generate_study_population_dep003]
    outputs:
      moderately_sensitive:
        dataset_report: output/qof/input_*.html

  ####################
  # Join ethnicity to all generated input files
  # Efficiency fix https://github.com/opensafely/research-template
  # BUT BEWARE STALE DATA
  ###################
  join_cohorts_qof:
    run: >
      cohort-joiner:v0.0.18
        --lhs output/qof/input_*.csv.gz
        --rhs output/input_ethnicity.csv.gz
        --output-dir output/qof/joined
    needs: [generate_study_population_ethnicity, generate_study_population_register, generate_study_population_dep003]
    outputs:
      highly_sensitive:
        cohort: output/qof/joined/input_*.csv.gz

  join_cohorts_lda:
    run: >
      cohort-joiner:v0.0.18
        --lhs output/lda/input_*.csv.gz
        --rhs output/input_ethnicity.csv.gz
        --output-dir output/lda/joined
    needs: [generate_study_population_ethnicity, generate_study_population_lda]
    outputs:
      highly_sensitive:
        cohort: output/lda/joined/input_*.csv.gz

  ####################
  # Python testing
  ####################
  test_input:
    run: >
      python:latest python analysis/test_input.py
        --input-files output/qof/joined/input_*.csv.gz
        --output-dir output/qof/joined/python
    needs: [join_cohorts_qof]
    outputs:
      moderately_sensitive:
        cohort: output/qof/joined/python/test_*.*

  ####################
  # Measures
  ####################

  # Output the summary values by date
  generate_measures_register:
    run: cohortextractor:latest generate_measures --study-definition study_definition_register --output-dir=output/qof/joined
    needs: [join_cohorts_qof]
    outputs:
      moderately_sensitive:
        # Only output the single summary file
        measure_csv: output/qof/joined/measure_register_*_rate.csv

  join_measures_register:
    run: python:latest python analysis/join_and_round.py
      --input-list output/qof/joined/measure_register_total_rate.csv
      --input-list output/qof/joined/measure_register_age_band_rate.csv
      --input-list output/qof/joined/measure_register_carehome_rate.csv
      --input-list output/qof/joined/measure_register_ethnicity_rate.csv
      --input-list output/qof/joined/measure_register_imd_rate.csv
      --input-list output/qof/joined/measure_register_learning_disability_rate.csv
      --input-list output/qof/joined/measure_register_region_rate.csv
      --input-list output/qof/joined/measure_register_sex_rate.csv
      --output-dir output/qof/joined/summary
      --output-name "measure_register.csv"
    needs: [generate_measures_register]
    outputs:
      moderately_sensitive:
        # Only output the single summary file
        measure_csv: output/qof/joined/summary/measure_register.csv

  generate_measures_dep003:
    run: cohortextractor:latest generate_measures --study-definition study_definition_dep003 --output-dir=output/qof/joined
    needs: [join_cohorts_qof]
    outputs:
      moderately_sensitive:
        # Only output the single summary file
        measure_csv: output/qof/joined/measure_dep003_*_rate.csv

  generate_measures_lda:
    run: cohortextractor:latest generate_measures --study-definition study_definition_lda --output-dir=output/lda/joined
    needs: [join_cohorts_lda]
    outputs:
      moderately_sensitive:
        # Only output the single summary file
        measure_csv: output/lda/joined/measure_*_rate.csv

  #############################
  # Plotting
  #############################
  generate_qof_deciles_charts:
    run: >
      deciles-charts:v0.0.15
        --input-files output/qof/joined/measure_*_practice_rate.csv
        --output-dir output/qof/joined
    config:
      show_outer_percentiles: false
      tables:
        output: true
      charts:
        output: true
    needs: [generate_measures_register, generate_measures_dep003]
    outputs:
      moderately_sensitive:
        cohort: output/qof/joined/deciles_*_*.*

  generate_qof_groups:
    run: >
      python:latest python analysis/group_charts.py
        --input-files output/qof/joined/measure_*.csv
        --output-dir output/qof/joined
        --date-lines "2019-03-31" "2020-03-31" "2021-03-31"
        --scale "percentage"
    needs: [generate_measures_register, generate_measures_dep003]
    outputs:
      moderately_sensitive:
        cohort: output/qof/joined/group_chart_*.png

  generate_lda_groups:
    run: >
      python:latest python analysis/group_charts.py
        --input-files output/lda/joined/measure_*.csv
        --output-dir output/lda/joined
        --date-lines "2020-03-16" "2020-12-02"
        --scale "rate"
    needs: [generate_measures_lda]
    outputs:
      moderately_sensitive:
        cohort: output/lda/joined/group_chart_*.png

  generate_table1:
    run: >
      python:latest python analysis/table1.py
        --input-dir output/lda/joined
        --output-dir output/lda/joined
        --measure-attribute "antidepressant_any"
    needs: [generate_measures_lda]
    outputs:
      moderately_sensitive:
        cohort: output/lda/joined/table1.csv

  #############################
  # Display
  #############################
  generate_report:
    run: >
      python:latest python analysis/report.py
        --input-dir output/qof/joined
        --output-dir output/qof/joined
        --resource-dir analysis/resources
    needs: [generate_qof_deciles_charts, generate_qof_groups]
    outputs:
      moderately_sensitive:
        cohort: output/qof/joined/report.html
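
Several of the cohort actions above pass `--index-date-range "2019-03-01 to 2022-04-01 by month"` and declare a wildcard output such as `output/qof/input_dep003_*.csv.gz`. As a rough sketch of what that range implies, the snippet below enumerates the monthly index dates. It assumes both endpoints are included and that each extract file is named after its index date; neither detail is stated on this page, and `month_starts` is a hypothetical helper, not part of cohortextractor.

```python
from datetime import date


def month_starts(start, end):
    """Yield first-of-month dates from start's month to end's month, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        # Advance one calendar month, rolling over into the next year.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


dates = list(month_starts(date(2019, 3, 1), date(2022, 4, 1)))
print(len(dates))  # 38
print(dates[0].isoformat(), dates[-1].isoformat())  # 2019-03-01 2022-04-01

# Assumed naming: the wildcard in input_dep003_*.csv.gz stands for the index date.
print(f"output/qof/input_dep003_{dates[0].isoformat()}.csv.gz")
```

Under these assumptions each monthly study definition would produce 38 extract files, which is why the outputs are declared with a `*` pattern rather than a single path.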
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:23:23
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job request
- Status: Succeeded
- Backend: TPP
- Workspace: antidepressant-prescribing-lda
- Requested by: Christine Cunningham
- Branch: main
- Force run dependencies: No
- Git commit hash: 36fe17f
- Requested actions:
  - generate_measures_dep003
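
To see what the requested action depends on, the `needs` entries in the pipeline can be walked upstream. The sketch below is illustrative only: `NEEDS` is hand-copied from the project.yaml on this page, `run_order` is a hypothetical helper, and it does not reproduce the job runner's scheduling logic. With "Force run dependencies" set to No and a single job listed, the upstream actions shown here were presumably not re-run for this request.

```python
# `needs` edges copied from the project.yaml on this page (relevant subset).
NEEDS = {
    "generate_study_population_ethnicity": [],
    "generate_study_population_register": [],
    "generate_study_population_dep003": [],
    "join_cohorts_qof": [
        "generate_study_population_ethnicity",
        "generate_study_population_register",
        "generate_study_population_dep003",
    ],
    "generate_measures_dep003": ["join_cohorts_qof"],
}


def run_order(needs, target):
    """Return target plus all upstream actions, dependencies first (depth-first)."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in needs[name]:
            visit(dep)
        # Append only after every dependency has been appended.
        order.append(name)

    visit(target)
    return order


for action in run_order(NEEDS, "generate_measures_dep003"):
    print(action)
```

The last action printed is the requested one; everything before it is an upstream action whose outputs it relies on.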