Job request: 7960
- Organisation: Bennett Institute
- Workspace: hepatitis_in_children
- ID: 5nndqek4a7l6jdmz
This page shows the technical details of what happened when the authorised researcher Louis Fisher requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs
  - Researchers can only request that code be run against them
- moderately_sensitive
  - Can be viewed by an approved researcher by logging into a highly secure environment
  - These are the only outputs that can be requested for public release via a controlled output review service.
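The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the `ACTIONS` dict is a hand-copied excerpt of two actions from the pipeline on this page, and the helper name `output_levels` is hypothetical, not part of any OpenSAFELY tool.

```python
# Hand-copied excerpt of the pipeline, kept as a plain dict so the sketch
# needs no YAML parser. Only two of the actions are included.
ACTIONS = {
    "get_age_months": {
        "outputs": {
            "highly_sensitive": {
                "cohorts_monthly": "output/monthly/joined/input_2*.csv.gz",
                "cohorts_weekly": "output/weekly/joined/input_weekly*.csv.gz",
            }
        }
    },
    "mean_values_by_age": {
        "outputs": {
            "moderately_sensitive": {
                "monthly": "output/monthly/joined/mean_test_value_*_by_age.csv",
            }
        }
    },
}


def output_levels(actions, action_name):
    """Map each security level of one action to its sorted output patterns."""
    outputs = actions[action_name]["outputs"]
    return {level: sorted(named.values()) for level, named in outputs.items()}


# The action requested in this job request writes only moderately_sensitive outputs.
print(output_levels(ACTIONS, "mean_values_by_age"))
```

Looking up `get_age_months` the same way returns a single `highly_sensitive` entry, so its outputs cannot be viewed directly.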
Jobs
- Job identifier: tzh6g55fqw6nskvp
Pipeline
project.yaml:
version: '3.0'
expectations:
  population_size: 1000
actions:
  generate_study_population_monthly_1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2017-04-01 to 2018-04-01 by month" --output-dir=output/monthly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/monthly/input_*.csv.gz
  generate_study_population_monthly_2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2018-05-01 to 2019-04-01 by month" --output-dir=output/monthly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/monthly/input*.csv.gz
  generate_study_population_monthly_3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2019-05-01 to 2020-04-01 by month" --output-dir=output/monthly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/monthly/inpu*.csv.gz
  generate_study_population_monthly_4:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2020-05-01 to 2021-04-01 by month" --output-dir=output/monthly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/monthly/inp*.csv.gz
  generate_study_population_monthly_5:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2021-05-01 to 2022-03-01 by month" --output-dir=output/monthly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/monthly/in*.csv.gz
  generate_study_population_weekly_1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_weekly --index-date-range "2021-04-01 to 2021-09-02 by week" --output-dir=output/weekly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/weekly/input_*.csv.gz
  generate_study_population_weekly_2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_weekly --index-date-range "2021-09-09 to 2022-04-14 by week" --output-dir=output/weekly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/weekly/input*.csv.gz
  generate_study_population_weekly_3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_weekly --index-date-range "2022-04-21 to 2022-05-05 by week" --output-dir=output/weekly --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/weekly/inpu*.csv.gz
  generate_study_population_dob:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_dob --output-dir=output --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_dob.csv.gz
  join_cohorts_monthly:
    run: >
      cohort-joiner:v0.0.9
      --lhs output/monthly/input_20*.csv.gz
      --rhs output/input_dob.csv.gz
      --output-dir output/monthly/joined
    needs: [
      generate_study_population_monthly_1,
      generate_study_population_monthly_2,
      generate_study_population_monthly_3,
      generate_study_population_monthly_4,
      generate_study_population_monthly_5,
      generate_study_population_dob]
    outputs:
      highly_sensitive:
        cohort: output/monthly/joined/input_20*.csv.gz
  join_cohorts_weekly:
    run: >
      cohort-joiner:v0.0.9
      --lhs output/weekly/input_weekly_20*.csv.gz
      --rhs output/input_dob.csv.gz
      --output-dir output/weekly/joined
    needs: [
      generate_study_population_weekly_1,
      generate_study_population_weekly_2,
      generate_study_population_dob]
    outputs:
      highly_sensitive:
        cohort: output/weekly/joined/input_weekly_20*.csv.gz
  join_cohorts_weekly_2:
    run: >
      cohort-joiner:v0.0.9
      --lhs output/weekly/input_weekly_2022-04*.csv.gz
      --rhs output/input_dob.csv.gz
      --output-dir output/weekly/joined
    needs: [
      generate_study_population_weekly_3,
      generate_study_population_dob]
    outputs:
      highly_sensitive:
        cohort: output/weekly/joined/input_weekly_2022*.csv.gz
  get_age_months:
    run: >
      python:latest python analysis/get_age_months.py
    needs: [join_cohorts_monthly, join_cohorts_weekly, generate_study_population_dob]
    outputs:
      highly_sensitive:
        cohorts_monthly: output/monthly/joined/input_2*.csv.gz
        cohorts_weekly: output/weekly/joined/input_weekly*.csv.gz
  mean_values_by_age:
    run: >
      python:latest python analysis/mean_values.py
    needs: [get_age_months]
    outputs:
      moderately_sensitive:
        monthly: output/monthly/joined/mean_test_value_*_by_age.csv
  generate_measures:
    run: cohortextractor:latest generate_measures
      --study-definition study_definition
      --output-dir=output/monthly/joined
    needs: [
      get_age_months
    ]
    outputs:
      moderately_sensitive:
        measure_csv_monthly: output/monthly/joined/measure_*_rate.csv
  generate_measures_weekly:
    run: cohortextractor:latest generate_measures
      --skip-existing
      --study-definition study_definition_weekly
      --output-dir=output/weekly/joined
    needs: [get_age_months]
    outputs:
      moderately_sensitive:
        measure_csv: output/weekly/joined/measure_*_rate.csv
  generate_plots:
    run: python:latest python analysis/plots.py
    needs: [generate_measures, generate_measures_weekly, mean_values_by_age]
    outputs:
      moderately_sensitive:
        counts: output/*/joined/plot_*.png
        deciles_charts: output/*/joined/deciles_chart_*.png
        measure_csv: output/*/joined/redacted/measure_*_rate.csv
        num_practices: output/*/joined/practice_count*.json
  generate_notebook:
    run: jupyter:latest jupyter nbconvert /workspace/analysis/report.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [generate_plots]
    outputs:
      moderately_sensitive:
        notebook: output/report.html
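The monthly extraction actions each take an `--index-date-range ... by month` option and declare a glob pattern for their outputs. The sketch below expands the first range into index dates and checks the resulting file names against the declared patterns. It rests on an assumption: that each extract is written as `input_<index date>.csv.gz`, which is inferred from the `--lhs output/monthly/input_20*.csv.gz` pattern rather than stated on this page. The helper names `monthly_index_dates` and `expected_files` are hypothetical.

```python
from datetime import date
from fnmatch import fnmatch


def monthly_index_dates(start, end):
    """Yield the first day of each month from start to end, inclusive.

    Assumes both dates fall on the first of a month, as they do in the
    ranges used by the pipeline.
    """
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def expected_files(start, end, prefix="input_"):
    """Build the file names assumed for each index date (naming is an assumption)."""
    return [f"{prefix}{d.isoformat()}.csv.gz" for d in monthly_index_dates(start, end)]


# Range used by generate_study_population_monthly_1.
files = expected_files(date(2017, 4, 1), date(2018, 4, 1))

# The five output patterns declared by the monthly actions, plus the --lhs pattern.
patterns = ["input_*.csv.gz", "input*.csv.gz", "inpu*.csv.gz",
            "inp*.csv.gz", "in*.csv.gz", "input_20*.csv.gz"]
for pattern in patterns:
    print(pattern, all(fnmatch(name, pattern) for name in files))
```

Under that naming assumption, every declared pattern matches every monthly file, so the five patterns differ only in their text, not in which files they select.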
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:39:58
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job request
- Status: Succeeded
- Backend: TPP
- Workspace: hepatitis_in_children
- Requested by: Louis Fisher
- Branch: main
- Force run dependencies: No
- Git commit hash: 9bd9966
- Requested actions: mean_values_by_age
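The requested action sits at the end of a chain of `needs` entries in the pipeline. The following sketch walks that chain to list every action upstream of `mean_values_by_age`. The `NEEDS` dict is copied by hand from the `needs` lists in the pipeline, and the function name `upstream` is hypothetical. Only one job is listed on this page, which may mean the upstream outputs already existed when the request was made; that reading is an assumption, not something the page states.

```python
# `needs` lists copied from the pipeline; actions without a `needs` key
# are simply absent here and treated as having no dependencies.
NEEDS = {
    "mean_values_by_age": ["get_age_months"],
    "get_age_months": [
        "join_cohorts_monthly",
        "join_cohorts_weekly",
        "generate_study_population_dob",
    ],
    "join_cohorts_monthly": [
        "generate_study_population_monthly_1",
        "generate_study_population_monthly_2",
        "generate_study_population_monthly_3",
        "generate_study_population_monthly_4",
        "generate_study_population_monthly_5",
        "generate_study_population_dob",
    ],
    "join_cohorts_weekly": [
        "generate_study_population_weekly_1",
        "generate_study_population_weekly_2",
        "generate_study_population_dob",
    ],
    "join_cohorts_weekly_2": [
        "generate_study_population_weekly_3",
        "generate_study_population_dob",
    ],
}


def upstream(action, needs):
    """Return the sorted names of all actions that `action` depends on, transitively."""
    seen = set()
    stack = list(needs.get(action, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(needs.get(current, []))
    return sorted(seen)


for name in upstream("mean_values_by_age", NEEDS):
    print(name)
```

In this graph, `join_cohorts_weekly_2` and `generate_study_population_weekly_3` are not upstream of `mean_values_by_age`, because `get_age_months` does not list `join_cohorts_weekly_2` in its `needs`.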