Job request: 20169

Organisation: Bennett Institute
Workspace: pincer-measures
ID: xd3xebtlpkw65cfv

This page shows the technical details of what happened when the authorised researcher Louis Fisher requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request that code be run against them
moderately_sensitive
- Can be viewed by an approved researcher by logging into a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.
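The cross-referencing described above can be sketched in Python. This is a minimal illustration, not part of the job server: the `PIPELINE` dictionary is a hand-copied excerpt of two actions from the project.yaml in the Pipeline section, and the helper name `output_levels` is hypothetical.

```python
# Hand-copied excerpt of the pipeline; only the fields needed here.
# (Assumption: a real script would load the full project.yaml instead.)
PIPELINE = {
    "actions": {
        "filter_population": {
            "outputs": {
                "highly_sensitive": {"cohort": "output/input_filtered_*.feather"},
            },
        },
        "measures_ehrql_simple": {
            "outputs": {
                "moderately_sensitive": {"measure_csv": "output/measures_simple.csv"},
            },
        },
    },
}


def output_levels(pipeline, action):
    """Map each output path pattern of `action` to its security level."""
    outputs = pipeline["actions"][action]["outputs"]
    return {
        pattern: level
        for level, named_patterns in outputs.items()
        for pattern in named_patterns.values()
    }


levels = output_levels(PIPELINE, "measures_ehrql_simple")
print(levels)
```

For the single job listed on this page, the lookup shows that `output/measures_simple.csv` was written at the `moderately_sensitive` level.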

Jobs

Action: measures_ehrql_simple
Status: Succeeded
Job identifier: pl2znueflzhv2vnl

Pipeline

project.yaml

version: "3.0"

expectations:
  population_size: 5000

actions:
  generate_study_population_1:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2019-09-01 to 2020-05-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/input_*.feather

  generate_study_population_2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2020-06-01 to 2021-02-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/input*.feather

  generate_study_population_3:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2021-03-01 to 2021-09-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/inpu*.feather

  generate_study_population_4:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2021-10-01 to 2022-02-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/in*.feather
        
  generate_study_population_5:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --index-date-range "2022-03-01 to 2023-05-01 by month" --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/i*.feather

  generate_study_population_ethnicity:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ethnicity --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/input_ethnicity.feather

  join_ethnicity_region:
    run: python:latest python analysis/join_ethnicity_region.py
    needs:
      [
        generate_study_population_1,
        generate_study_population_2,
        generate_study_population_3,
        generate_study_population_4,
        generate_study_population_5,
        generate_study_population_ethnicity,
      ]
    outputs:
      highly_sensitive:
        cohort: output/inp*.feather

  filter_population:
    run: python:latest python analysis/filter_population.py
    needs: [join_ethnicity_region]
    outputs:
      highly_sensitive:
        cohort: output/input_filtered_*.feather

  calculate_numerators:
    run: python:latest python analysis/calculate_numerators.py
    needs: [filter_population]
    outputs:
      highly_sensitive:
        cohort: output/indicator_e_f_*.feather

  calculate_composite_indicators:
    run: python:latest python analysis/composite_indicators.py
    needs: [calculate_numerators, filter_population]
    outputs:
      moderately_sensitive:
        counts: output/*_composite_measure.csv

  generate_measures:
    run: cohortextractor:latest generate_measures --study-definition study_definition --output-dir=output
    needs: [filter_population]
    outputs:
      moderately_sensitive:
        measure_csv: output/measure_*_rate.csv

  generate_measures_demographics:
    run: python:latest python analysis/calculate_measures.py
    needs: [calculate_numerators, filter_population]
    outputs:
      moderately_sensitive:
        counts: output/indicator_measure_*.csv
        measure_csv: output/measure*_rate.csv
        demographics: output/demographics_summary_*.csv
  
  produce_stripped_measures:
    run: python:latest python analysis/stripped_measures.py
    needs:
      [
        generate_measures,
        generate_measures_demographics
      ]
    outputs:
      moderately_sensitive:
        measures: output/measure_stripped_*.csv

  produce_stripped_measures_ehrql:
    run: python:latest python analysis/ehrQL/stripped_measures.py
    needs:
      [
        measures_ehrql
      ]
    outputs:
      moderately_sensitive:
        measures_stripped: output/measure_stripped_*_ehrql.csv
        measures: output/measure_*_ehrql.csv

  generate_summary_counts:
    run: python:latest python analysis/summary_statistics.py
    needs:
      [
        filter_population,
        generate_measures,
        generate_measures_demographics,
        calculate_numerators,
      ]
    outputs:
      moderately_sensitive:
        patient_count: output/patient_count_*.json
        practice_count: output/practice_count_*.json
        summary: output/indicator_summary_statistics_*.json

  generate_plots:
    run: python:latest python analysis/plot_measures.py
    needs:
      [
        produce_stripped_measures,
      ]
    outputs:
      moderately_sensitive:
        counts: output/figures/plot_*.jpeg
        medians: output/medians.json
  
  generate_plots_ehrql:
    run: python:latest python analysis/ehrQL/plot_measures.py
    needs:
      [
        produce_stripped_measures_ehrql,
      ]
    outputs:
      moderately_sensitive:
        counts: output/figures_ehrql/plot*.jpeg

  generate_plots_alternative:
    run: python:latest python analysis/plot_measures_alternative.py
    needs:
      [
        generate_measures,
        generate_measures_demographics,
      ]
    outputs:
      moderately_sensitive:
        counts: output/figures/plot_*_alternative.jpeg

  generate_notebook:
    run: jupyter:latest jupyter nbconvert /workspace/analysis/report.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [generate_plots, generate_summary_counts]
    outputs:
      moderately_sensitive:
        notebook: output/report.html

  generate_notebook_updating:
    run: jupyter:latest jupyter nbconvert /workspace/analysis/report_updating.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [generate_plots, generate_summary_counts]
    outputs:
      moderately_sensitive:
        notebook: output/report_updating.html

  generate_notebook_updating_ehrql:
    run: jupyter:latest jupyter nbconvert /workspace/analysis/ehrQL/report_updating_ehrql.ipynb --execute --to html --template basic --output-dir=/workspace/output --ExecutePreprocessor.timeout=86400 --no-input
    needs: [generate_plots_ehrql]
    outputs:
      moderately_sensitive:
        notebook: output/report_updating_ehrql.html

  run_tests:
    run: python:latest python -m pytest --junit-xml=output/pytest.xml --verbose
    outputs:
      moderately_sensitive:
        log: output/pytest.xml
  
  non_zero_count:
    run: python:latest python analysis/non_zero.py
    needs: [produce_stripped_measures]
    outputs:
      moderately_sensitive:
        counts: output/non_zero*.csv
  
  numerator_distribution:
    run: python:latest python analysis/event_distribution.py
    needs: [generate_measures, calculate_composite_indicators]
    outputs:
      moderately_sensitive:
        counts: output/numerator_*_distribution*
  
  measures_ehrql:
    run: ehrql:v0 generate-measures analysis/ehrQL/measure_definition.py --output output/measures.csv
    outputs:
      moderately_sensitive:
        measure_csv: output/measures.csv

  measures_ehrql_simple:
    run: ehrql:v0 generate-measures analysis/ehrQL/measure_definition_simple.py --output output/measures_simple.csv
    outputs:
      moderately_sensitive:
        measure_csv: output/measures_simple.csv
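The `needs` lists in the pipeline form a dependency graph between actions. The sketch below illustrates that relation only; it does not reproduce how the job runner actually schedules work, which this page does not show. The `NEEDS` dictionary is a hand-copied subset of the ehrQL actions, and `run_order` is a hypothetical helper built on the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hand-copied `needs` lists for a few actions from the pipeline.
# Actions without a `needs` key are given an empty list.
NEEDS = {
    "measures_ehrql": [],
    "measures_ehrql_simple": [],
    "produce_stripped_measures_ehrql": ["measures_ehrql"],
    "generate_plots_ehrql": ["produce_stripped_measures_ehrql"],
    "generate_notebook_updating_ehrql": ["generate_plots_ehrql"],
}


def run_order(needs, target):
    """Return `target` and its transitive dependencies, dependencies first."""
    required = {}
    stack = [target]
    while stack:
        action = stack.pop()
        if action not in required:
            required[action] = needs[action]
            stack.extend(needs[action])
    # TopologicalSorter takes {node: predecessors} and yields predecessors first.
    return list(TopologicalSorter(required).static_order())


simple_order = run_order(NEEDS, "measures_ehrql_simple")
notebook_order = run_order(NEEDS, "generate_notebook_updating_ehrql")
print(simple_order)
print(notebook_order)
```

Because `measures_ehrql_simple` declares no `needs`, its order contains only itself, which is consistent with the single job listed for this request.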

Timeline

Created: 31 Aug 2023 11:27:20 UTC
Started: 31 Aug 2023 12:10:20 UTC
Finished: 31 Aug 2023 13:13:53 UTC
Runtime: 01:03:33

These timestamps are generated and stored using the UTC timezone on the TPP backend.
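As a quick check, the runtime can be recomputed from the Started and Finished timestamps. The sketch below assumes an English locale for the month abbreviation; the helper name `parse_utc` and the `waited` value (time between creation and start) are illustrative additions, not fields reported on this page.

```python
from datetime import datetime, timezone

# Matches timestamps such as "31 Aug 2023 11:27:20 UTC".
# Note: %b depends on the locale; "Aug" parses under the default C/English locale.
FORMAT = "%d %b %Y %H:%M:%S UTC"


def parse_utc(text):
    """Parse a timeline timestamp and mark it as timezone-aware UTC."""
    return datetime.strptime(text, FORMAT).replace(tzinfo=timezone.utc)


created = parse_utc("31 Aug 2023 11:27:20 UTC")
started = parse_utc("31 Aug 2023 12:10:20 UTC")
finished = parse_utc("31 Aug 2023 13:13:53 UTC")

runtime = finished - started  # time spent running
waited = started - created    # time between request creation and start

print(runtime)  # 1:03:33
print(waited)   # 0:43:00
```

The computed runtime agrees with the reported value of 01:03:33.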

Job request

Status: Succeeded
Backend: TPP
Workspace: pincer-measures
Requested by: Louis Fisher
Branch: main
Force run dependencies: No
Git commit hash: 74e7001
Requested actions: measures_ehrql_simple
