Job request: 16702

Organisation: Bennett Institute
Workspace: vax-fourth-dose-rd-baseline
ID: ajhegn2wfny2zurd

This page shows the technical details of what happened when the authorised researcher Andrea Schaffer requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request that code be run against them
moderately_sensitive
- Can be viewed by an approved researcher by logging into a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.
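
The cross-referencing described above can be sketched in Python. This is a minimal illustration, not part of the job server: the `PIPELINE_OUTPUTS` dictionary is hand-copied from the `outputs` entries of three actions in the project.yaml shown in the pipeline section, and the helper name `output_levels` is hypothetical.

```python
# Hand-copied from the `outputs` blocks of project.yaml (a subset of actions).
PIPELINE_OUTPUTS = {
    "data_process_baseline": {
        "highly_sensitive": {"cohort": "output/cohort/cohort_*.csv"},
        "moderately_sensitive": {"descriptive": "output/descriptive/total_*.csv"},
    },
    "data_process_outcomes": {
        "highly_sensitive": {"outcomes": "output/cohort/outcomes*.csv"},
    },
    "covid_outcomes": {
        "moderately_sensitive": {
            "measure_csv": "output/covid_outcomes/plot_*.csv",
            "plot": "output/covid_outcomes/plot_*.png",
        },
    },
}


def output_levels(action):
    """Return (security_level, output_name, path_pattern) tuples for one action."""
    return [
        (level, name, pattern)
        for level, outputs in PIPELINE_OUTPUTS[action].items()
        for name, pattern in outputs.items()
    ]


for level, name, pattern in output_levels("covid_outcomes"):
    print(f"{level}: {name} -> {pattern}")
```

For the single job listed on this page, `covid_outcomes`, both declared outputs fall under moderately_sensitive.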

Jobs

Action: covid_outcomes
Status: Succeeded
Job identifier: 4g2lrutb74g65yay

Pipeline

project.yaml

######################################

# This script defines the project pipeline - it specifies the execution order for all the code in this
# repo using a series of actions.

######################################


version: '3.0'

expectations:
  population_size: 100000

actions:

# Generate study population and extract baseline characteristics at Sep 3, 2022
  generate_study_pop_baseline:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_baseline
      --output-dir=feather 
      --output-format=feather
    outputs:
      highly_sensitive:
        cohort: output/input_baseline.feather
      
# Data cleaning, defining exclusions, saving final study pop
  data_process_baseline:
    run: r:latest analysis/processing/data_process_baseline.R
    needs: [generate_study_pop_baseline]
    outputs:
      highly_sensitive:
        cohort: output/cohort/cohort_*.csv
      moderately_sensitive:
        descriptive: output/descriptive/total_*.csv

 # Extract outcomes pre-campaign (index date = Sep 3)
  outcomes_sep:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_outcomes_1
      --index-date-range "2022-09-03" 
      --output-dir=feather 
      --output-format=feather
    needs: [data_process_baseline]
    outputs:
      highly_sensitive:
        cohort: output/index/input_*.feather

 # Extract outcomes mid-campaign (index date = Oct 15)
  outcomes_oct:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_outcomes_1
      --index-date-range "2022-10-15" 
      --output-dir=feather 
      --output-format=feather
    needs: [data_process_baseline]
    outputs:
      highly_sensitive:
        cohort: output/index/input*.feather

 # Extract outcomes during-campaign (index date = Nov 26)
  outcomes_nov:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_outcomes_2
      --index-date-range "2022-11-26 to 2023-01-31 by week" 
      --output-dir=feather 
      --output-format=feather
    needs: [data_process_baseline]
    outputs:
      highly_sensitive:
        cohort: output/index/inpu*.feather

# Data cleaning of outcome data
  data_process_outcomes:
    run: r:latest analysis/processing/data_process_outcomes.R
    needs: [outcomes_sep, outcomes_oct, outcomes_nov]
    outputs:
      highly_sensitive:
        outcomes: output/cohort/outcomes*.csv

# Plots of COVID booster uptake by age
  booster_uptake:
   run: r:latest analysis/descriptive/cumulative_vax_byage.R
   needs: [data_process_baseline]
   outputs:
      moderately_sensitive:
        rates_csv: output/cumulative_rates/final_*.csv 
        plot: output/cumulative_rates/plot_*.png

# Outcome plots #
  covid_outcomes:
   run: r:latest analysis/descriptive/plot_outcomes_byage.R
   needs: [data_process_outcomes]
   outputs:
      moderately_sensitive:
        measure_csv: output/covid_outcomes/plot_*.csv
        plot: output/covid_outcomes/plot_*.png

# Discontinuity of demographics
  # demographics:
  #  run: r:latest analysis/descriptive/demographics_byage.R
  #  needs: [generate_study_pop_baseline, data_process_baseline]
  #  outputs:
  #     moderately_sensitive:
  #       measures_csv: output/descriptive/demographics_*.csv


# Flu vaccine uptake #
  # flu_vax:
  #  run: r:latest analysis/descriptive/flu_vax_byage.R
  #  needs: [generate_study_pop]
  #  outputs:
  #     moderately_sensitive:
  #       measure_csv: output/cumulative_rates/flu_*.csv
  #       plot: output/cumulative_rates/plot_flu_vax_byage.png

# ITT analysis #
  # itt_analysis:
  #  run: r:latest analysis/statistical_analysis/itt_analysis.R
  #  needs: [data_process_outcomes]
  #  outputs:
  #     moderately_sensitive:
  #       measure_csv: output/covid_outcomes/outcome_*.csv

# Outcome plots #
  # covid_outcomes:
  #  run: r:latest analysis/descriptive/covid_outcomes_byage.R
  #  needs: [generate_study_pop]
  #  outputs:
  #     moderately_sensitive:
  #       measure_csv: output/covid_outcomes/covid_*.csv
  #       plot: output/covid_outcomes/plot_outcomes_*.png
  




# IV analysis #
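
The `--index-date-range` value used by the `outcomes_nov` action describes a series of weekly index dates rather than a single date. The sketch below expands such a string into concrete dates, assuming the range is inclusive of both ends and steps in 7-day increments. It is an illustration only: cohortextractor's own parsing may differ in detail, and `expand_weekly_range` is a hypothetical helper.

```python
from datetime import date, timedelta


def expand_weekly_range(spec):
    """Expand 'YYYY-MM-DD to YYYY-MM-DD by week' into a list of dates.

    Assumption: both endpoints are inclusive and the step is exactly 7 days.
    """
    start_str, rest = spec.split(" to ")
    end_str, period = rest.split(" by ")
    if period != "week":
        raise ValueError(f"this sketch only handles 'by week', got {period!r}")
    current = date.fromisoformat(start_str)
    end = date.fromisoformat(end_str)
    dates = []
    while current <= end:
        dates.append(current)
        current += timedelta(weeks=1)
    return dates


index_dates = expand_weekly_range("2022-11-26 to 2023-01-31 by week")
print(len(index_dates), index_dates[0], index_dates[-1])
```

Under these assumptions the range yields ten index dates, the first on 2022-11-26 and the last on 2023-01-28.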

Timeline

Created: 26 Mar 2023 14:43:37 UTC
Started: 26 Mar 2023 14:43:39 UTC
Finished: 26 Mar 2023 14:47:59 UTC
Runtime: 00:04:20

These timestamps are generated and stored using the UTC timezone on the TPP backend.
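
The Runtime figure can be checked against the Started and Finished timestamps. A small sketch using only the Python standard library follows; the format string is an assumption chosen to match how the timestamps are printed on this page.

```python
from datetime import datetime

FMT = "%d %b %Y %H:%M:%S"  # assumed format, e.g. "26 Mar 2023 14:43:39"

started = datetime.strptime("26 Mar 2023 14:43:39", FMT)
finished = datetime.strptime("26 Mar 2023 14:47:59", FMT)

# Runtime is measured from Started, not from Created.
runtime = finished - started
print(runtime)  # 0:04:20
```

The result, 4 minutes 20 seconds, matches the reported Runtime of 00:04:20.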

Job request

Status: Succeeded
Backend: TPP
Workspace: vax-fourth-dose-rd-baseline
Requested by: Andrea Schaffer
Branch: Protocol-updates
Force run dependencies: No
Git commit hash: 40e5edb
Requested actions: covid_outcomes
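
The requested action, `covid_outcomes`, sits at the end of a chain of `needs` declarations in the pipeline. The sketch below walks that chain to list every upstream action. The `NEEDS` mapping is hand-copied from the active (uncommented) actions in project.yaml, and `upstream_actions` is a hypothetical helper, not part of any OpenSAFELY tool.

```python
# Hand-copied from the `needs` entries of the active actions in project.yaml.
NEEDS = {
    "generate_study_pop_baseline": [],
    "data_process_baseline": ["generate_study_pop_baseline"],
    "outcomes_sep": ["data_process_baseline"],
    "outcomes_oct": ["data_process_baseline"],
    "outcomes_nov": ["data_process_baseline"],
    "data_process_outcomes": ["outcomes_sep", "outcomes_oct", "outcomes_nov"],
    "booster_uptake": ["data_process_baseline"],
    "covid_outcomes": ["data_process_outcomes"],
}


def upstream_actions(action):
    """Return all direct and indirect dependencies of `action`, dependencies first."""
    ordered = []

    def visit(name):
        for dep in NEEDS[name]:
            visit(dep)
            if dep not in ordered:
                ordered.append(dep)

    visit(action)
    return ordered


print(upstream_actions("covid_outcomes"))
```

Six upstream actions feed `covid_outcomes`, yet only one job appears in the Jobs section. That is consistent with "Force run dependencies: No": none of the upstream actions appears as a job in this request.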
