Job request: 24408

Organisation: University of Bristol
Workspace: post-covid-cvd-v1
ID: y76gliw6xonqmjn5

This page shows the technical details of what happened when the authorised researcher Venexia Walker requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request that code be run against them
moderately_sensitive
- Can be viewed by an approved researcher by logging into a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.
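
The cross-referencing described above can be sketched in Python. This is a minimal illustration, not part of the job server: the `pipeline` dictionary is a hand-copied excerpt of the `outputs` of one action from the pipeline section of this page, and the helper name `outputs_by_level` is hypothetical.

```python
# Hand-copied excerpt of the `outputs` of one action (illustrative only;
# the full list is in the pipeline section of this page).
pipeline = {
    "clean_data_prevax": {
        "moderately_sensitive": {
            "describe_raw": "output/describe/prevax_raw.txt",
            "flow": "output/dataset_clean/flow_prevax.csv",
        },
        "highly_sensitive": {
            "venn": "output/dataset_clean/venn_prevax.rds",
            "cohort_clean": "output/dataset_clean/input_prevax_clean.rds",
        },
    },
}


def outputs_by_level(pipeline, action):
    """Return {security_level: sorted list of paths} for one action (hypothetical helper)."""
    return {level: sorted(files.values()) for level, files in pipeline[action].items()}


# Print each security level followed by the files written at that level.
for level, paths in outputs_by_level(pipeline, "clean_data_prevax").items():
    print(level)
    for path in paths:
        print("  " + path)
```

Looking up each job's action name in this way shows which of its files are reviewable (`moderately_sensitive`) and which can only be consumed by later actions (`highly_sensitive`).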

Jobs

Action: clean_data_prevax
Status: Succeeded
Job identifier: gu4oec3ja5gnxru3

Action: clean_data_vax
Status: Succeeded
Job identifier: xpqqjklfhcvjbs7s

Action: clean_data_unvax
Status: Succeeded
Job identifier: zwckd2v75vkrb5wl

Pipeline

project.yaml:

version: '3.0'

expectations:

  population_size: 5000

actions:

  ## # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
  ## DO NOT EDIT project.yaml DIRECTLY 
  ## This file is created by create_project_actions.R 
  ## Edit and run create_project_actions.R to update the project.yaml 
  ## # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
  ## Define study dates 

  study_dates:
    run: r:latest analysis/study_dates.R
    outputs:
      highly_sensitive:
        study_dates_json: output/study_dates.json

  ## Generate dates for all cohorts 

  generate_dates:
    run: ehrql:v1 generate-dataset analysis/dataset_definition/dataset_definition_dates.py
      --output output/dataset_definition/index_dates.csv.gz
    needs:
    - study_dates
    outputs:
      highly_sensitive:
        dataset: output/dataset_definition/index_dates.csv.gz

  ## Generate cohort - prevax 

  generate_cohort_prevax:
    run: ehrql:v1 generate-dataset analysis/dataset_definition/dataset_definition_prevax.py
      --output output/dataset_definition/input_prevax.csv.gz
    needs:
    - generate_dates
    outputs:
      highly_sensitive:
        cohort: output/dataset_definition/input_prevax.csv.gz

  ## Generate cohort - vax 

  generate_cohort_vax:
    run: ehrql:v1 generate-dataset analysis/dataset_definition/dataset_definition_vax.py
      --output output/dataset_definition/input_vax.csv.gz
    needs:
    - generate_dates
    outputs:
      highly_sensitive:
        cohort: output/dataset_definition/input_vax.csv.gz

  ## Generate cohort - unvax 

  generate_cohort_unvax:
    run: ehrql:v1 generate-dataset analysis/dataset_definition/dataset_definition_unvax.py
      --output output/dataset_definition/input_unvax.csv.gz
    needs:
    - generate_dates
    outputs:
      highly_sensitive:
        cohort: output/dataset_definition/input_unvax.csv.gz

  ## Clean data - prevax, with describe = TRUE 

  clean_data_prevax:
    run: r:latest analysis/dataset_clean/dataset_clean.R prevax TRUE
    needs:
    - study_dates
    - generate_cohort_prevax
    outputs:
      moderately_sensitive:
        describe_raw: output/describe/prevax_raw.txt
        describe_venn: output/describe/prevax_venn.txt
        describe_preprocessed: output/describe/prevax_preprocessed.txt
        flow: output/dataset_clean/flow_prevax.csv
        flow_midpoint6: output/dataset_clean/flow_prevax_midpoint6.csv
      highly_sensitive:
        venn: output/dataset_clean/venn_prevax.rds
        cohort_clean: output/dataset_clean/input_prevax_clean.rds

  ## Clean data - vax, with describe = TRUE 

  clean_data_vax:
    run: r:latest analysis/dataset_clean/dataset_clean.R vax TRUE
    needs:
    - study_dates
    - generate_cohort_vax
    outputs:
      moderately_sensitive:
        describe_raw: output/describe/vax_raw.txt
        describe_venn: output/describe/vax_venn.txt
        describe_preprocessed: output/describe/vax_preprocessed.txt
        flow: output/dataset_clean/flow_vax.csv
        flow_midpoint6: output/dataset_clean/flow_vax_midpoint6.csv
      highly_sensitive:
        venn: output/dataset_clean/venn_vax.rds
        cohort_clean: output/dataset_clean/input_vax_clean.rds

  ## Clean data - unvax, with describe = TRUE 

  clean_data_unvax:
    run: r:latest analysis/dataset_clean/dataset_clean.R unvax TRUE
    needs:
    - study_dates
    - generate_cohort_unvax
    outputs:
      moderately_sensitive:
        describe_raw: output/describe/unvax_raw.txt
        describe_venn: output/describe/unvax_venn.txt
        describe_preprocessed: output/describe/unvax_preprocessed.txt
        flow: output/dataset_clean/flow_unvax.csv
        flow_midpoint6: output/dataset_clean/flow_unvax_midpoint6.csv
      highly_sensitive:
        venn: output/dataset_clean/venn_unvax.rds
        cohort_clean: output/dataset_clean/input_unvax_clean.rds
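
The `needs` entries above define a dependency graph between actions. As a rough sketch, the following Python derives one valid execution order for a requested action from that graph. The `needs` dictionary is copied by hand from the pipeline, the helper names `upstream` and `run_order` are hypothetical, and the code does not model how the job runner decides which dependencies actually need rerunning.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# `needs` relationships copied from the pipeline: action -> direct dependencies.
needs = {
    "study_dates": [],
    "generate_dates": ["study_dates"],
    "generate_cohort_prevax": ["generate_dates"],
    "generate_cohort_vax": ["generate_dates"],
    "generate_cohort_unvax": ["generate_dates"],
    "clean_data_prevax": ["study_dates", "generate_cohort_prevax"],
    "clean_data_vax": ["study_dates", "generate_cohort_vax"],
    "clean_data_unvax": ["study_dates", "generate_cohort_unvax"],
}


def upstream(action, needs):
    """Collect every action that `action` depends on, directly or transitively."""
    seen = set()
    stack = list(needs[action])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(needs[dep])
    return seen


def run_order(action, needs):
    """Return one valid execution order for `action` and its upstream actions."""
    wanted = upstream(action, needs) | {action}
    graph = {name: list(needs[name]) for name in wanted}
    return list(TopologicalSorter(graph).static_order())


print(run_order("clean_data_prevax", needs))
# ['study_dates', 'generate_dates', 'generate_cohort_prevax', 'clean_data_prevax']
```

Each `clean_data_*` action sits at the end of a four-step chain, so its cleaned cohort depends on the study dates, the index dates, and the matching cohort extract.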

Timeline

Created: 28 Mar 2025 17:25:31 UTC
Started: 28 Mar 2025 17:26:28 UTC
Finished: 28 Mar 2025 20:31:40 UTC
Runtime: 07:47:49

These timestamps are generated and stored using the UTC timezone on the TPP backend.
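
As a worked check, the wall-clock interval between the Started and Finished timestamps can be computed with a short Python sketch. The helper name `parse_utc` is hypothetical, and parsing the month abbreviation with `%b` assumes an English locale.

```python
from datetime import datetime, timezone

FMT = "%d %b %Y %H:%M:%S"  # e.g. "28 Mar 2025 17:26:28"


def parse_utc(text):
    """Parse a timestamp in the page's format and mark it as UTC (hypothetical helper)."""
    return datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)


started = parse_utc("28 Mar 2025 17:26:28")
finished = parse_utc("28 Mar 2025 20:31:40")
elapsed = finished - started
print(elapsed)  # 3:05:12
```

The reported Runtime of 07:47:49 is longer than this interval. The page does not state how Runtime is calculated, so any interpretation of the difference (for example, a total across the individual jobs) would be an assumption.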

Job request

Status: Succeeded
Backend: TPP
Workspace: post-covid-cvd-v1
Requested by: Venexia Walker
Branch: main
Force run dependencies: No
Git commit hash: be30579
Requested actions: clean_data_prevax, clean_data_vax, clean_data_unvax
