
Job request: 18160

Organisation:
The London School of Hygiene & Tropical Medicine
Workspace:
openprompt-hrqol
ID:
4ktkcoj5ytmb45ak

This page shows the technical details of what happened when the authorised researcher Alasdair Henderson requested one or more actions to be run against real patient data within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level the outputs were written to.

The output security levels are:

  • highly_sensitive
    • Researchers can never directly view these outputs
    • Researchers can only request that code be run against them
  • moderately_sensitive
    • Can be viewed by an approved researcher by logging into a highly secure environment
    • These are the only outputs that can be requested for public release via a controlled output review service.
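The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration: the `PIPELINE` dictionary is a hand-copied subset of the `outputs` sections from the project.yaml on this page, and the helper names `outputs_by_level` and `releasable` are hypothetical, not part of any OpenSAFELY tooling.

```python
# Hypothetical, hand-copied subset of the `outputs` sections in project.yaml.
# Structure: action name -> security level -> {output id: path}.
PIPELINE = {
    "generate_openprompt_baseline": {
        "highly_sensitive": {
            "openprompt_baseline": "output/openprompt_baseline.csv",
        },
    },
    "combine_openprompt": {
        "highly_sensitive": {
            "openprompt_combined": "output/openprompt_raw.gz.parquet",
        },
        "moderately_sensitive": {
            "table1": "output/tab1_baseline_description.html",
            "check_days_after_baseline": "output/data_properties/sample_day_lags.pdf",
        },
    },
}


def outputs_by_level(pipeline, action):
    """Return {security level: sorted output paths} for a single action."""
    return {
        level: sorted(files.values())
        for level, files in pipeline[action].items()
    }


def releasable(pipeline, action):
    """Paths eligible for an output review request (moderately_sensitive only)."""
    return outputs_by_level(pipeline, action).get("moderately_sensitive", [])


if __name__ == "__main__":
    for level, paths in outputs_by_level(PIPELINE, "combine_openprompt").items():
        print(level, paths)
```

Under these assumptions, an action such as `generate_openprompt_baseline` yields no releasable paths, because all of its outputs are written at the highly_sensitive level.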

Jobs

Pipeline

project.yaml
version: '3.0'

expectations:
  population_size: 10000

actions:

  # create_dummy_openprompt_data: 
  #   run: >
  #     r:latest
  #       analysis/create_dummy_openprompt_data.R
  #   outputs: 
  #     moderately_sensitive: 
  #       dummy_openprompt: output/dummy_openprompt.csv.gz

  generate_openprompt_baseline: 
    run: >
      databuilder:v0
        generate-dataset 
        analysis/model_questions/process_baseline.py 
        --output output/openprompt_baseline.csv
        --
        --day=0
    outputs:
      highly_sensitive:
        openprompt_baseline: output/openprompt_baseline.csv

  generate_openprompt_survey1: 
    run: >
      databuilder:v0
        generate-dataset 
        analysis/model_questions/process_research.py 
        --output output/openprompt_survey1.csv
        --
        --day=0
    outputs:
      highly_sensitive:
        openprompt_survey1: output/openprompt_survey1.csv

  generate_openprompt_survey2: 
    run: >
      databuilder:v0
        generate-dataset 
        analysis/model_questions/process_research.py 
        --output output/openprompt_survey2.csv
        --
        --day=30
    outputs:
      highly_sensitive:
        openprompt_survey2: output/openprompt_survey2.csv

  generate_openprompt_survey3: 
    run: >
      databuilder:v0
        generate-dataset 
        analysis/model_questions/process_research.py 
        --output output/openprompt_survey3.csv
        --
        --day=60
    outputs:
      highly_sensitive:
        openprompt_survey3: output/openprompt_survey3.csv

  generate_openprompt_survey4: 
    run: >
      databuilder:v0
        generate-dataset 
        analysis/model_questions/process_research.py 
        --output output/openprompt_survey4.csv
        --
        --day=90
    outputs:
      highly_sensitive:
        openprompt_survey4: output/openprompt_survey4.csv

  combine_openprompt:
    run: >
      r:latest analysis/001_datacombine.R
    needs: [generate_openprompt_baseline, generate_openprompt_survey1, generate_openprompt_survey2, generate_openprompt_survey3, generate_openprompt_survey4]
    outputs: 
      highly_sensitive: 
        openprompt_combined: output/openprompt_raw.gz.parquet
      moderately_sensitive:
        openprompt_raw_skim: output/data_properties/op_raw_skim.txt
        openprompt_raw_tab: output/data_properties/op_raw_tabulate.txt
        openprompt_mapped_skim: output/data_properties/op_mapped_skim.txt
        openprompt_mapped_tab: output/data_properties/op_mapped_tabulate.txt
        check_days_after_baseline: output/data_properties/sample_day_lags.pdf
        table1: output/tab1_baseline_description.html

  # generate_openprompt_plus_tpp: 
  #   run: >
  #     databuilder:v0
  #       generate-dataset analysis/dataset_definition_openprompt.py --output output/openprompt_raw_plus_tpp.csv.gz
  #   needs: [create_dummy_openprompt_data]
  #   outputs:
  #     highly_sensitive:
  #       openprompt_tpp_combined: output/openprompt_raw_plus_tpp.csv.gz

  # quick_summ_data:
  #   run: >
  #     r:latest
  #       analysis/010_table1.R
  #   needs: [generate_openprompt_plus_tpp]
  #   outputs:
  #     highly_sensitive:
  #       cleandata: output/cleaned_data.gz.parquet
  #     moderately_sensitive:
  #       table1: output/tab1_baseline_description.html
  #       longcovid_dates: output/longcovid_dates.pdf
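The `needs` entries above define which actions an action depends on. As a rough sketch, the dependency order for `combine_openprompt` can be derived with a depth-first walk. The `NEEDS` mapping is copied by hand from the uncommented actions in the pipeline, and `run_order` is a hypothetical helper. It only lists what an action depends on; it does not model whether the backend re-runs dependencies or reuses their existing outputs.

```python
# Hand-copied from the `needs` fields of the uncommented actions above.
NEEDS = {
    "generate_openprompt_baseline": [],
    "generate_openprompt_survey1": [],
    "generate_openprompt_survey2": [],
    "generate_openprompt_survey3": [],
    "generate_openprompt_survey4": [],
    "combine_openprompt": [
        "generate_openprompt_baseline",
        "generate_openprompt_survey1",
        "generate_openprompt_survey2",
        "generate_openprompt_survey3",
        "generate_openprompt_survey4",
    ],
}


def run_order(needs, target):
    """Return the target and its dependencies, with dependencies listed first."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in needs[name]:
            visit(dep)
        # Append only after every dependency has been placed in the order.
        order.append(name)

    visit(target)
    return order


if __name__ == "__main__":
    for action in run_order(NEEDS, "combine_openprompt"):
        print(action)
```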

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 00:00:54

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job request

Status
Succeeded
Backend
TPP
Workspace
openprompt-hrqol
Requested by
Alasdair Henderson
Branch
main
Force run dependencies
No
Git commit hash
13d6387
Requested actions
  • combine_openprompt
