Job request: 17308

Organisation: University of Bristol
Workspace: vax-fourth-dose-rd
ID: vikeibndb5k4uynk

This page shows the technical details of what happened when the authorised researcher Andrea Schaffer requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

highly_sensitive
- Researchers can never directly view these outputs
- Researchers can only request that code be run against them
moderately_sensitive
- Can be viewed by an approved researcher by logging into a highly secure environment
- These are the only outputs that can be requested for public release via a controlled output review service.
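
The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration, not part of the job record: the `ACTIONS` dict is a hand-copied subset of the `outputs` entries from the pipeline section, and the `security_levels` helper is a hypothetical name introduced here.

```python
# Hand-copied subset of the `outputs` entries from project.yaml (assumption:
# written as a plain dict so the sketch needs no YAML parser).
ACTIONS = {
    "data_process_baseline": {
        "highly_sensitive": {"cohort": "output/cohort/cohort_*.csv"},
        "moderately_sensitive": {"descriptive": "output/descriptive/total_*.csv"},
    },
    "outcomes_sep": {
        "highly_sensitive": {"cohort": "output/index/input_*.feather"},
    },
    "aggregate_outcomes_byage": {
        "moderately_sensitive": {
            "outcomes": "output/covid_outcomes/by_start_date/outcomes_*.csv"
        },
    },
}


def security_levels(job_actions, actions=ACTIONS):
    """Map each action name to {security level: [output glob patterns]}."""
    result = {}
    for name in job_actions:
        outputs = actions.get(name, {})  # unknown actions map to no outputs
        result[name] = {
            level: sorted(patterns.values())
            for level, patterns in outputs.items()
        }
    return result


levels = security_levels(["data_process_baseline", "outcomes_sep"])
for action, by_level in levels.items():
    print(action, by_level)
```

An action that lists only `highly_sensitive` outputs, such as `outcomes_sep`, produces nothing a researcher can view directly.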

Jobs

Action	Status	Job identifier
data_process_baseline	Succeeded	oz4fxoyq4lqrsbrk
outcomes_oct	Succeeded	k7tuxmuuf4qlrad3
outcomes_nov	Succeeded	y646mtm7ihv6e3ua
demographics	Succeeded	6ksivhqo35as3hmf
outcomes_sep	Succeeded	nttatlo4la6imgjg
data_process_outcomes_1	Succeeded	3xlmoqgj3ajrh2oe
data_process_outcomes_2	Succeeded	mv5y5657szb7fmtd
aggregate_outcomes_byage	Succeeded	onzfkuwqzd4rsd7j
fuzzy_analysis	Succeeded	bviamyjf2pini4sw
sharp_analysis_lpm	Succeeded	tub4fwaf6evzgc5r

Pipeline

project.yaml

######################################

# This script defines the project pipeline. It specifies the execution order for all the code in this
# repo using a series of actions.

######################################


version: '3.0'

expectations:
  population_size: 1000000

actions:

# Generate study population and extract baseline characteristics at Sep 3, 2022
  generate_study_pop_baseline:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_baseline
      --output-dir=feather 
      --output-format=feather
    outputs:
      highly_sensitive:
        cohort: output/input_baseline.feather
      
# Data cleaning, defining exclusions, saving final study pop
  data_process_baseline:
    run: r:latest analysis/processing/data_process_baseline.R
    needs: [generate_study_pop_baseline]
    outputs:
      highly_sensitive:
        cohort: output/cohort/cohort_*.csv
      moderately_sensitive:
        descriptive: output/descriptive/total_*.csv

 # Extract outcomes pre-campaign (index date = Sep 3)
  outcomes_sep:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_outcomes_1
      --index-date-range "2022-09-03" 
      --output-dir=feather 
      --output-format=feather
    needs: [data_process_baseline]
    outputs:
      highly_sensitive:
        cohort: output/index/input_*.feather

 # Extract outcomes mid-campaign (index date = Oct 15)
  outcomes_oct:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_outcomes_1
      --index-date-range "2022-10-15" 
      --output-dir=feather 
      --output-format=feather
    needs: [data_process_baseline]
    outputs:
      highly_sensitive:
        cohort: output/index/input*.feather

 # Extract outcomes during-campaign (weekly index dates from Nov 26 to Dec 5)
  outcomes_nov:
    run: cohortextractor:latest generate_cohort
      --study-definition study_definition_outcomes_2
      --index-date-range "2022-11-26 to 2022-12-05 by week" 
      --output-dir=feather 
      --output-format=feather
    needs: [data_process_baseline]
    outputs:
      highly_sensitive:
        cohort: output/index/inpu*.feather

# Data cleaning of outcome data (control periods)
  data_process_outcomes_1:
    run: r:latest analysis/processing/data_process_outcomes_1.R
    needs: [outcomes_sep, outcomes_oct]
    outputs:
      highly_sensitive:
        outcomes: output/cohort/outcomes*.csv

# Data cleaning of outcome data (Nov onward)
  data_process_outcomes_2:
    run: r:latest analysis/processing/data_process_outcomes_2.R
    needs: [outcomes_nov]
    outputs:
      highly_sensitive:
        outcomes: output/cohort/outcome*.csv

# Plots of COVID booster uptake by age
  booster_uptake:
   run: r:latest analysis/descriptive/cumulative_vax_byage.R
   needs: [data_process_baseline]
   outputs:
      moderately_sensitive:
        rates_csv: output/cumulative_rates/final_*.csv 
        plot: output/cumulative_rates/plot_*.png

# Aggregate data by age
  aggregate_outcomes_byage:
    run: r:latest analysis/processing/aggregate_outcomes.R
    needs: [data_process_outcomes_1, data_process_outcomes_2]
    outputs:
      moderately_sensitive:
        outcomes: output/covid_outcomes/by_start_date/outcomes_*.csv
#        no_patients: output/descriptive/total_n_by_date.csv

# Outcome plots #
  # plot_outcomes:
  #  run: r:latest analysis/descriptive/plot_outcomes_byage.R
  #  needs: [aggregate_outcomes_byage]
  #  outputs:
  #     moderately_sensitive:
  #       plot: output/covid_outcomes/figures/plot_*.png

# sharp analysis #
  # sharp_analysis_logistic:
  #  run: r:latest analysis/statistical_analysis/sharp_analysis_logistic.R
  #  needs: [data_process_outcomes_1, data_process_outcomes_2]
  #  outputs:
  #     moderately_sensitive:
  #       predicted_csv: output/modelling/predicted_*.csv        
  #       coefficients1_csv: output/modelling/coef_*.csv
  #       coefficients2_csv: output/modelling/final/coef_*.csv
  #       plot: output/modelling/figures/plot*.png

  sharp_analysis_lpm:
   run: r:latest analysis/statistical_analysis/sharp_analysis_lpm.R
   needs: [data_process_outcomes_1, data_process_outcomes_2]
   outputs:
      moderately_sensitive:
        predicted_csv: output/modelling/predicted_lpm*.csv
        coefficients1_csv: output/modelling/coef_lpm*.csv
        coefficients2_csv: output/modelling/final/coef_lpm*.csv
        plot: output/modelling/figures/plot_pred_lpm*.png

# Fuzzy analysis #
  fuzzy_analysis:
   run: r:latest analysis/statistical_analysis/fuzzy_analysis.R
   needs: [data_process_outcomes_1, data_process_outcomes_2]
   outputs:
      moderately_sensitive:
        coefficients_csv: output/modelling/iv/coef_iv*.csv
        final_csv: output/modelling/final/coef_i*.csv

# Latest date of outcome
  latest_date_outcomes:
   run: r:latest analysis/descriptive/latest_date_outcomes.R
   needs: [data_process_outcomes_2]
   outputs:
      moderately_sensitive:
        plot: output/descriptive/over*.png

# Discontinuity of demographics
  demographics:
   run: r:latest analysis/descriptive/demographics_byage.R
   needs: [generate_study_pop_baseline, data_process_baseline]
   outputs:
      moderately_sensitive:
        measures_csv: output/descriptive/demographics_*.csv


# Flu vaccine uptake #
  # flu_vax:
  #  run: r:latest analysis/descriptive/flu_vax_byage.R
  #  needs: [generate_study_pop]
  #  outputs:
  #     moderately_sensitive:
  #       measure_csv: output/cumulative_rates/flu_*.csv
  #       plot: output/cumulative_rates/plot_flu_vax_byage.png





# IV analysis #
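
The `needs` entries in the pipeline above define a dependency graph, and an execution order can be derived from it. The following is a minimal sketch, assuming the `needs` lists of the non-commented actions relevant to this job request are copied by hand into a dict. `NEEDS` and `execution_order` are hypothetical names, and the result is one valid order rather than the order the job runner actually used.

```python
from graphlib import TopologicalSorter

# Each action mapped to the actions it `needs` (copied by hand from the
# project.yaml above; commented-out actions are omitted).
NEEDS = {
    "generate_study_pop_baseline": [],
    "data_process_baseline": ["generate_study_pop_baseline"],
    "outcomes_sep": ["data_process_baseline"],
    "outcomes_oct": ["data_process_baseline"],
    "outcomes_nov": ["data_process_baseline"],
    "data_process_outcomes_1": ["outcomes_sep", "outcomes_oct"],
    "data_process_outcomes_2": ["outcomes_nov"],
    "aggregate_outcomes_byage": ["data_process_outcomes_1", "data_process_outcomes_2"],
    "sharp_analysis_lpm": ["data_process_outcomes_1", "data_process_outcomes_2"],
    "fuzzy_analysis": ["data_process_outcomes_1", "data_process_outcomes_2"],
    "demographics": ["generate_study_pop_baseline", "data_process_baseline"],
}


def execution_order(needs):
    """Return one order in which every action runs after all of its dependencies."""
    return list(TopologicalSorter(needs).static_order())


order = execution_order(NEEDS)
print(order)
```

Actions with no dependency between them, such as `outcomes_sep` and `outcomes_oct`, may appear in either relative order and could run in parallel.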

Job statistics

Status	Count	Percentage
Pending	0	0%
Running	0	0%
Succeeded	10	100%
Failed	0	0%

10 / 10 (100%) complete

Timeline

Created: 24 Apr 2023 17:50:14 UTC
Started: 24 Apr 2023 17:50:03 UTC
Finished: 25 Apr 2023 00:26:36 UTC
Runtime: 07:37:55

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job request

Status: Succeeded
Backend: TPP
Workspace: vax-fourth-dose-rd
Requested by: Andrea Schaffer
Branch: main
Force run dependencies: No
Git commit hash: e1fb0c3
Requested actions: data_process_baseline, outcomes_sep, outcomes_oct, outcomes_nov, data_process_outcomes_1, data_process_outcomes_2, aggregate_outcomes_byage, sharp_analysis_lpm, fuzzy_analysis, demographics
