Job request: 5846
- Organisation: Bennett Institute
- Workspace: pincer-measures-emis
- ID: ivv367fj7hsvsg4p
This page shows the technical details of what happened when the authorised researcher Lisa Hopcroft requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs
  - Researchers can only request that code be run against them
- moderately_sensitive
  - Can be viewed by an approved researcher by logging into a highly secure environment
  - These are the only outputs that can be requested for public release via a controlled output review service.
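The cross-referencing described above can be sketched in Python. This is a minimal illustration, not part of the job server: the dictionary is a hand-transcribed fragment of three actions from the project.yaml on this page, and the names PIPELINE_FRAGMENT and output_levels are hypothetical.

```python
# Hand-transcribed fragment of the project.yaml on this page (assumption:
# only three actions are copied here; the real file defines many more).
PIPELINE_FRAGMENT = {
    "filter_population": {
        "outputs": {
            "highly_sensitive": {"cohort": "output/input_filtered_*.feather"},
        },
    },
    "generate_plots": {
        "outputs": {
            "moderately_sensitive": {
                "counts": "output/figures/plot_*.jpeg",
                "combined": "output/figures/combined_plot_*.png",
                "medians": "output/medians.json",
            },
        },
    },
    "generate_notebook": {
        "outputs": {
            "moderately_sensitive": {"notebook": "output/report.html"},
        },
    },
}


def output_levels(actions, action_name):
    """Return {output path pattern: security level} for one action."""
    levels = {}
    for level, named_outputs in actions[action_name]["outputs"].items():
        for pattern in named_outputs.values():
            levels[pattern] = level
    return levels


if __name__ == "__main__":
    # generate_notebook is the action requested in this job request.
    for pattern, level in output_levels(PIPELINE_FRAGMENT, "generate_notebook").items():
        print(f"{pattern}: {level}")
```

For the requested action generate_notebook, the lookup reports output/report.html as moderately_sensitive, which matches the pipeline definition.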
Jobs
- Job identifier: 6wshygls4s4a6bka
Pipeline
project.yaml
version: "3.0"
expectations:
  population_size: 5000
actions:
  generate_study_population_1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2019-09-01 to 2020-05-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/input_*.feather
  generate_study_population_2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2020-06-01 to 2021-02-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/input*.feather
  generate_study_population_3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2021-03-01 to 2021-09-01 by month" --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/inpu*.csv.gz
  generate_study_definition_demographics_1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_demographics --index-date-range "2019-09-01 to 2020-05-01 by month" --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_demographics_*.csv.gz
  generate_study_definition_demographics_2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_demographics --index-date-range "2020-06-01 to 2021-02-01 by month" --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_demographics*.csv.gz
  generate_study_definition_demographics_3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_demographics --index-date-range "2021-03-01 to 2021-09-01 by month" --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_demographic*.csv.gz
  generate_study_population_region:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_region --index-date-range "2019-09-01 to 2021-09-01 by month" --output-format csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_region*.csv.gz
  generate_study_population_ethnicity:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ethnicity --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/input_ethnicity.feather
  generate_demographics:
    run: python:latest python analysis/demographics_summary.py
    needs:
      [
        generate_study_population_1,
        generate_study_population_2,
        generate_study_population_3,
        generate_study_population_ethnicity,
        generate_study_definition_demographics_1,
        generate_study_definition_demographics_2,
        generate_study_definition_demographics_3,
        generate_study_population_region
      ]
    outputs:
      moderately_sensitive:
        demographics: output/demographics_summary*.csv
  # join_ethnicity_region:
  #   run: python:latest python analysis/join_ethnicity_region.py
  #   needs:
  #     [
  #       generate_study_population_1,
  #       generate_study_population_2,
  #       generate_study_population_3,
  #       generate_study_population_ethnicity,
  #     ]
  #   outputs:
  #     highly_sensitive:
  #       cohort: output/inp*.feather
  filter_population:
    run: python:latest python analysis/filter_population.py
    needs: [generate_study_population_1, generate_study_population_2, generate_study_population_3]
    outputs:
      highly_sensitive:
        cohort: output/input_filtered_*.feather
  calculate_numerators:
    run: python:latest python analysis/calculate_numerators.py
    needs: [filter_population]
    outputs:
      highly_sensitive:
        cohort: output/indicator_e_f_*.feather
  # calculate_composite_indicators:
  #   run: python:latest python analysis/composite_indicators.py
  #   needs: [calculate_numerators, filter_population]
  #   outputs:
  #     moderately_sensitive:
  #       counts: output/*_composite_measure.csv
  generate_measures:
    run: cohortextractor:latest generate_measures --study-definition study_definition --output-dir=output
    needs: [filter_population]
    outputs:
      moderately_sensitive:
        measure_csv: output/measure_*_rate.csv
  generate_measures_additional:
    run: python:latest python analysis/calculate_measures.py
    needs: [calculate_numerators, filter_population]
    outputs:
      moderately_sensitive:
        measure_csv: output/measure*_rate.csv
  generate_measures_region:
    run: cohortextractor:latest generate_measures --study-definition study_definition_region --output-dir=output
    needs: [generate_study_population_region]
    outputs:
      moderately_sensitive:
        measure_csv: output/measure_msoa_rate.csv
        measure_csv_region: output/measure_region_rate.csv
  generate_region_counts:
    run: python:latest python analysis/check_region.py
    needs:
      [
        generate_study_population_region
      ]
    outputs:
      moderately_sensitive:
        region_count: output/combined_count.csv
  generate_summary_counts:
    run: python:latest python analysis/summary_statistics.py
    needs:
      [
        filter_population,
        generate_measures,
        generate_measures_additional,
        calculate_numerators,
      ]
    outputs:
      moderately_sensitive:
        patient_count: output/patient_count_*.json
        practice_count: output/practice_count_*.json
        summary: output/indicator_summary_statistics_*.json
  produce_stripped_measures:
    run: python:latest python analysis/stripped_measures.py
    needs:
      [
        generate_measures,
        generate_measures_additional
      ]
    outputs:
      moderately_sensitive:
        measures: output/measure_stripped_*.csv
  generate_plots:
    run: python:latest python analysis/plot_measures.py
    needs:
      [
        generate_measures,
        generate_measures_additional
      ]
    outputs:
      moderately_sensitive:
        counts: output/figures/plot_*.jpeg
        combined: output/figures/combined_plot_*.png
        medians: output/medians.json
  generate_notebook:
    run: jupyter:latest jupyter nbconvert /workspace/analysis/report.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [generate_plots, generate_summary_counts]
    outputs:
      moderately_sensitive:
        notebook: output/report.html
  # generate_dem_notebook:
  #   run: jupyter:latest jupyter nbconvert /workspace/analysis/demographic_report.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
  #   needs: [generate_plots]
  #   outputs:
  #     moderately_sensitive:
  #       notebook: output/demographic_report.html
  # plot_Q1_comparisons:
  #   run: r:latest analysis/generate_demographic_slope_plot.R
  #   needs: [generate_plots]
  #   outputs:
  #     moderately_sensitive:
  #       plots: output/figures/SLOPE_*.png
  # run_tests:
  #   run: python:latest python -m pytest --junit-xml=output/pytest.xml --verbose
  #   outputs:
  #     moderately_sensitive:
  #       log: output/pytest.xml
  # test_population:
  #   run: python:latest python analysis/test_population.py
  #   needs: [filter_population]
  #   outputs:
  #     moderately_sensitive:
  #       counts: output/population_counts.csv
  #       count: output/patient_count_check.json
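Several of the feather output patterns in the pipeline above differ only slightly (input_*, input*, input_filtered_*), so one file name can match the pattern of more than one action. The sketch below illustrates this with Python's fnmatch module. It rests on assumptions: the example file names are hypothetical, fnmatch matching is used purely for illustration and may differ from how the backend resolves patterns, and the names OUTPUT_PATTERNS and matching_actions are invented for this example.

```python
from fnmatch import fnmatchcase

# Output patterns of four actions, copied from the project.yaml on this page.
OUTPUT_PATTERNS = {
    "generate_study_population_1": "output/input_*.feather",
    "generate_study_population_2": "output/input*.feather",
    "generate_study_population_ethnicity": "output/input_ethnicity.feather",
    "filter_population": "output/input_filtered_*.feather",
}


def matching_actions(path, patterns=OUTPUT_PATTERNS):
    """Return the sorted names of actions whose output pattern matches path."""
    return sorted(
        name for name, pattern in patterns.items() if fnmatchcase(path, pattern)
    )


if __name__ == "__main__":
    # Hypothetical file names, chosen only to exercise the patterns.
    for path in (
        "output/input_2019-09-01.feather",
        "output/input_ethnicity.feather",
        "output/input_filtered_2019-09-01.feather",
    ):
        print(path, "->", matching_actions(path))
```

Under these assumptions, a file name alone does not always identify the action that wrote it, so it is safer to read the security level from the action definition than to infer it from a pattern match.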
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:00:22
These timestamps are generated and stored using the UTC timezone on the EMIS backend.
Job request
- Status: Succeeded
- Backend: EMIS
- Workspace: pincer-measures-emis
- Requested by: Lisa Hopcroft
- Branch: emis
- Force run dependencies: No
- Git commit hash: 2c893b1
- Requested actions:
  - generate_notebook
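The requested action, generate_notebook, sits at the end of a chain of needs entries in the pipeline. The sketch below walks those entries depth-first to list every action it depends on, in an order where each action comes after its dependencies. It is an illustration only: the NEEDS dictionary is transcribed by hand from the project.yaml on this page, run_order is a hypothetical helper, and the sketch does not describe how the job runner schedules work or which actions actually ran for this request.

```python
# The needs entries relevant to generate_notebook, transcribed from the
# project.yaml on this page. Actions with no needs map to an empty list.
NEEDS = {
    "generate_notebook": ["generate_plots", "generate_summary_counts"],
    "generate_plots": ["generate_measures", "generate_measures_additional"],
    "generate_summary_counts": [
        "filter_population",
        "generate_measures",
        "generate_measures_additional",
        "calculate_numerators",
    ],
    "generate_measures": ["filter_population"],
    "generate_measures_additional": ["calculate_numerators", "filter_population"],
    "calculate_numerators": ["filter_population"],
    "filter_population": [
        "generate_study_population_1",
        "generate_study_population_2",
        "generate_study_population_3",
    ],
    "generate_study_population_1": [],
    "generate_study_population_2": [],
    "generate_study_population_3": [],
}


def run_order(needs, target):
    """Return target and its transitive dependencies, dependencies first."""
    order = []
    seen = set()

    def visit(action):
        if action in seen:
            return
        seen.add(action)
        for dependency in needs.get(action, []):
            visit(dependency)
        # Appended only after all of its dependencies (post-order).
        order.append(action)

    visit(target)
    return order


if __name__ == "__main__":
    for step, action in enumerate(run_order(NEEDS, "generate_notebook"), start=1):
        print(step, action)
```

Because "Force run dependencies" is set to No for this request, the list is best read as the set of actions whose outputs generate_notebook relies on, not as a record of what was executed.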