Job request: 13686

Organisation: Bennett Institute
Workspace: appointments-short-data-report
ID: zbuoizjfrbc7yzeo

This page shows the technical details of what happened when the authorised researcher Iain Dillingham requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer the security level at which each job's outputs were written.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request that code be run against them
moderately_sensitive
- Approved researchers can view these outputs by logging into a highly secure environment
- These are the only outputs that can be requested for public release, via a controlled output review service
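
The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the names `PIPELINE_OUTPUT_LEVELS`, `JOB_ACTIONS`, and `levels_for_jobs` are hypothetical, and the mapping is transcribed by hand from the `outputs:` keys in the pipeline section rather than parsed from project.yaml.

```python
# Output security levels per action, transcribed by hand from the
# `outputs:` keys in the pipeline section (hypothetical variable names).
PIPELINE_OUTPUT_LEVELS = {
    "query_distinct_values": ["highly_sensitive"],
    "reindex_distinct_values": ["highly_sensitive"],
    "generate_prop_distinct_values_by_organisation_id_measure": ["highly_sensitive"],
    "generate_distinct_values_deciles_charts": ["moderately_sensitive"],
}

# Actions listed under "Jobs" for this job request.
JOB_ACTIONS = [
    "query_distinct_values",
    "reindex_distinct_values",
    "generate_prop_distinct_values_by_organisation_id_measure",
    "generate_distinct_values_deciles_charts",
]


def levels_for_jobs(job_actions, pipeline_levels):
    """Map each job's action name to the security levels of its outputs."""
    return {action: pipeline_levels[action] for action in job_actions}


for action, levels in levels_for_jobs(JOB_ACTIONS, PIPELINE_OUTPUT_LEVELS).items():
    print(f"{action}: {', '.join(levels)}")
```

Under these assumptions, only the final job (generate_distinct_values_deciles_charts) wrote outputs at the moderately_sensitive level; the three upstream jobs wrote highly_sensitive outputs.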

Jobs

Action: query_distinct_values
Status: Succeeded
Job identifier: 5brr53e5nj4enuts

Action: reindex_distinct_values
Status: Succeeded
Job identifier: khuej2pbw5v7u57h

Action: generate_prop_distinct_values_by_organisation_id_measure
Status: Succeeded
Job identifier: z3bx2agzfo3s4zs7

Action: generate_distinct_values_deciles_charts
Status: Succeeded
Job identifier: i2igxj777mtte7qk

Pipeline

project.yaml

version: "3.0"

expectations:
  population_size: 1000

actions:
  query_distinct_values:
    run: >
      sqlrunner:latest
        --output output/distinct_values/rows.csv
        analysis/distinct_values/query.sql
    outputs:
      highly_sensitive:
        rows: output/distinct_values/rows.csv

  reindex_distinct_values:
    needs: [query_distinct_values]
    run: >
      python:latest python -m analysis.actions.reindex
        --output output/distinct_values/reindexed_rows.csv
        --date-column-name booked_date
        output/distinct_values/rows.csv
        --group-by-column-names Organisation_ID
    outputs:
      highly_sensitive:
        rows: output/distinct_values/reindexed_rows.csv

  generate_prop_distinct_values_by_organisation_id_measure:
    needs: [reindex_distinct_values]
    run: >
      python:latest python -m analysis.distinct_values.generate_measure
    outputs:
      highly_sensitive:
        measure: output/distinct_values/measure_prop_distinct_values_by_organisation_id.csv

  generate_distinct_values_deciles_charts:
    needs: [generate_prop_distinct_values_by_organisation_id_measure]
    run: >
      deciles-charts:v0.0.33
        --input-files output/distinct_values/measure_*.csv
        --output-dir output/distinct_values
    config:
      show_outer_percentiles: true
    outputs:
      moderately_sensitive:
        deciles_charts: output/distinct_values/deciles_chart_*.png
        deciles_tables: output/distinct_values/deciles_table_*.csv

  query_status:
    run: >
      sqlrunner:latest
        --output output/status/rows.csv
        analysis/status/query.sql
    outputs:
      highly_sensitive:
        rows: output/status/rows.csv

  round_status:
    needs: [query_status]
    run: >
      python:latest python -m analysis.actions.round
        --output output/status/results.csv
        output/status/rows.csv
        --column-names num_values
    outputs:
      moderately_sensitive:
        results: output/status/results.csv

  query_date_range:
    run: >
      sqlrunner:latest
        --output output/date_range/rows.csv
        analysis/date_range/query.sql
    outputs:
      highly_sensitive:
        rows: output/date_range/rows.csv

  copy_date_range:
    needs: [query_date_range]
    run: >
      python:latest python -m analysis.actions.copy
        --output output/date_range/results.csv
        output/date_range/rows.csv
    outputs:
      moderately_sensitive:
        results: output/date_range/results.csv

  query_num_rows_by_month:
    run: >
      sqlrunner:latest
        --output output/num_rows_by_month/rows.csv
        analysis/num_rows_by_month/query.sql
    outputs:
      highly_sensitive:
        rows: output/num_rows_by_month/rows.csv

  round_num_rows_by_month:
    needs: [query_num_rows_by_month]
    run: >
      python:latest python -m analysis.actions.round
        --output output/num_rows_by_month/results.csv
        output/num_rows_by_month/rows.csv
        --column-names num_rows
    outputs:
      moderately_sensitive:
        results: output/num_rows_by_month/results.csv

  query_lead_time:
    run: >
      sqlrunner:latest
        --output output/lead_time/rows.csv
        analysis/lead_time/query.sql
    outputs:
      highly_sensitive:
        rows: output/lead_time/rows.csv

  round_lead_time:
    needs: [query_lead_time]
    run: >
      python:latest python -m analysis.actions.round
        --output output/lead_time/results.csv
        output/lead_time/rows.csv
        --column-names frequency
    outputs:
      moderately_sensitive:
        results: output/lead_time/results.csv

  make_html_reports:
    # --execute
    #   execute notebooks before converting them to HTML reports
    # --no-input
    #   exclude input cells and output prompts from HTML reports
    # --to=html
    #   convert notebooks to HTML reports (not e.g. to PDF reports)
    # --template basic
    #   use the basic (unstyled) template for HTML reports
    # --output-dir=output/reports
    #   write HTML reports to the `output/reports` directory
    # --ExecutePreprocessor.timeout=-1
    #   disable the timeout (the time to wait, in seconds, for output from executions)
    run: >
      python:latest jupyter nbconvert
        --execute
        --no-input
        --to=html
        --template basic
        --output-dir=output/reports
        --ExecutePreprocessor.timeout=-1
        analysis/reports/*.ipynb
    needs:
      - generate_distinct_values_deciles_charts
      - round_status
      - copy_date_range
      - round_num_rows_by_month
      - round_lead_time
    outputs:
      moderately_sensitive:
        reports: output/reports/*.html

Timeline

Created: 02 Dec 2022 14:36:55 UTC
Started: 02 Dec 2022 14:37:55 UTC
Finished: 02 Dec 2022 14:50:24 UTC
Runtime: 00:10:44

These timestamps are generated and stored using the UTC timezone on the TPP backend.
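
As a quick check on the timeline, the intervals between the timestamps can be computed directly. The sketch below uses only the Python standard library; the helper name `parse_utc` is hypothetical. Note that the gap between Started and Finished is longer than the reported Runtime of 00:10:44; this page does not state how Runtime is calculated, so the two figures need not agree.

```python
from datetime import datetime, timezone

# Format of the timestamps shown in the timeline, e.g. "02 Dec 2022 14:36:55".
TIMESTAMP_FORMAT = "%d %b %Y %H:%M:%S"


def parse_utc(text):
    """Parse a timeline timestamp and mark it as UTC (hypothetical helper)."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


created = parse_utc("02 Dec 2022 14:36:55")
started = parse_utc("02 Dec 2022 14:37:55")
finished = parse_utc("02 Dec 2022 14:50:24")

queue_delay = started - created   # time between creation and start
wall_clock = finished - started   # time between start and finish

print(f"Queue delay: {queue_delay}")
print(f"Started to finished: {wall_clock}")
```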

Job request

Status: Succeeded
Backend: TPP
Workspace: appointments-short-data-report
Requested by: Iain Dillingham
Branch: main
Force run dependencies: Yes
Git commit hash: c4b869d
Requested actions: generate_distinct_values_deciles_charts
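
Only one action was requested, yet four jobs appear in the jobs list. A plausible reading, assumed here rather than confirmed by this page, is that "Force run dependencies: Yes" causes every upstream action reachable through `needs` to be run as well. The sketch below illustrates that reading; `NEEDS` and `actions_to_run` are hypothetical names, and the mapping is transcribed by hand from the pipeline. It is not the scheduler's actual logic.

```python
# `needs` relationships for the requested chain, transcribed by hand
# from the pipeline section (hypothetical variable name).
NEEDS = {
    "query_distinct_values": [],
    "reindex_distinct_values": ["query_distinct_values"],
    "generate_prop_distinct_values_by_organisation_id_measure": [
        "reindex_distinct_values"
    ],
    "generate_distinct_values_deciles_charts": [
        "generate_prop_distinct_values_by_organisation_id_measure"
    ],
}


def actions_to_run(requested, needs):
    """Return the requested actions plus all upstream dependencies.

    Dependencies come before the actions that need them.
    Assumes the dependency graph has no cycles.
    """
    ordered = []

    def visit(action):
        if action in ordered:
            return
        for dependency in needs.get(action, []):
            visit(dependency)
        ordered.append(action)

    for action in requested:
        visit(action)
    return ordered


for action in actions_to_run(["generate_distinct_values_deciles_charts"], NEEDS):
    print(action)
```

Under this assumption, the result matches the four actions in the jobs list, in the same order.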
