Job request: 20434

Organisation:
University of Bristol
Workspace:
vaccine-counts
ID:
tgtjkyz7g52xjko7

This page shows the technical details of what happened when the authorised researcher Ed Parker requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to.

The output security levels are:

  • highly_sensitive
    • Researchers can never directly view these outputs
    • Researchers can only request that code be run against them
  • moderately_sensitive
    • Can be viewed by an approved researcher by logging into a highly secure environment
    • These are the only outputs that can be requested for public release via a controlled output review service.

Jobs

Pipeline

project.yaml
version: '3.0'

expectations:
  population_size: 1000

actions:

  extract_fixed:
    run: ehrql:v0 generate-dataset analysis/dataset_definition_fixed.py --output output/extracts/extract_fixed.arrow
    outputs:
      highly_sensitive:
        cohort: output/extracts/extract_fixed.arrow

  extract_varying:
    run: ehrql:v0 generate-dataset analysis/dataset_definition_varying.py --output output/extracts/extract_varying.arrow
    outputs:
      highly_sensitive:
        cohort: output/extracts/extract_varying.arrow


  process:
    run: r:latest analysis/process.R
    needs: [extract_fixed, extract_varying]
    outputs:
      highly_sensitive:
        rds: output/process/*.rds

  report:
    run: r:latest analysis/report.R
    needs: [process]
    outputs:
      moderately_sensitive:
        csv: output/report/*.csv
        png: output/report/*.png

  # Additional actions for snapshot
  extract_snapshot:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition_snapshot
        --skip-existing
        --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_snapshot.csv.gz

  process_snapshot:
    run: r:latest analysis/snapshot_process.R
    needs: [extract_snapshot]
    outputs:
      highly_sensitive:
        rds: output/snapshot/processed_snapshot.rds

  report_snapshot:
    run: r:latest analysis/snapshot_report.R
    needs: [process_snapshot]
    outputs:
      moderately_sensitive:
        csv: output/snapshot_report/snapshot*.csv

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 09:29:34

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job request

Status
Succeeded
Backend
TPP
Workspace
vaccine-counts
Requested by
Ed Parker
Branch
main
Force run dependencies
No
Git commit hash
80e828a
Requested actions
  • extract_snapshot
  • process_snapshot
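
The cross-referencing described at the top of this page can be done by hand, but a minimal Python sketch makes it concrete. It looks up each requested action in a table of outputs and reports the security level each output was written to. The `PIPELINE_OUTPUTS` dictionary is transcribed by hand from the project.yaml on this page, and `output_levels` is a hypothetical helper, not part of any OpenSAFELY tool.

```python
# Outputs per action, transcribed by hand from the project.yaml on this page.
# Structure: action name -> security level -> {output name: path pattern}.
PIPELINE_OUTPUTS = {
    "extract_fixed": {
        "highly_sensitive": {"cohort": "output/extracts/extract_fixed.arrow"},
    },
    "extract_varying": {
        "highly_sensitive": {"cohort": "output/extracts/extract_varying.arrow"},
    },
    "process": {
        "highly_sensitive": {"rds": "output/process/*.rds"},
    },
    "report": {
        "moderately_sensitive": {
            "csv": "output/report/*.csv",
            "png": "output/report/*.png",
        },
    },
    "extract_snapshot": {
        "highly_sensitive": {"cohort": "output/input_snapshot.csv.gz"},
    },
    "process_snapshot": {
        "highly_sensitive": {"rds": "output/snapshot/processed_snapshot.rds"},
    },
    "report_snapshot": {
        "moderately_sensitive": {"csv": "output/snapshot_report/snapshot*.csv"},
    },
}


def output_levels(requested_actions, pipeline=PIPELINE_OUTPUTS):
    """Return {action: [(security_level, path), ...]} for the requested actions."""
    result = {}
    for action in requested_actions:
        result[action] = [
            (level, path)
            for level, outputs in pipeline[action].items()
            for path in outputs.values()
        ]
    return result


# The two actions requested in this job request.
REQUESTED = ["extract_snapshot", "process_snapshot"]
LEVELS = output_levels(REQUESTED)

if __name__ == "__main__":
    for action, outputs in LEVELS.items():
        for level, path in outputs:
            print(f"{action}: {path} -> {level}")
```

For this job request, both requested actions write only highly_sensitive outputs, so nothing it produced can be viewed directly by a researcher or requested for release.
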
