Job request: 16829

Organisation: The London School of Hygiene & Tropical Medicine
Workspace: healthcare_utilisation_openprompt
ID: wyl3azcwosoo6ye7

This page shows the technical details of what happened when the authorised researcher Liang-Yu Lin requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer what security level various outputs were written to. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.

Jobs

Action: split_unmatched_data_by_stp_regions
Status: Succeeded
Job identifier: tqabl2d4ctcwmgds

Pipeline

project.yaml

version: '3.0'

expectations:
  population_size: 500

actions:

  generate_long_covid_exposure_dataset:
    run: 
      databuilder:v0 generate-dataset
        analysis/dataset_definition_unmatched_exp_lc.py
        --output output/dataset_exp_lc_unmatched.csv
    outputs:
      highly_sensitive:
        cohort: output/dataset_exp_lc_unmatched.csv

  check_stp_regions:
    needs: [generate_long_covid_exposure_dataset]
    run: r:latest analysis/dm00_check_stp_regions.R
    outputs:
      moderately_sensitive: 
        stp_region: output/stp_regions_counts.csv

  generate_list_gp_use_long_covid_dx:
    run: 
      databuilder:v0 generate-dataset
        analysis/dataset_definition_lc_gp_list.py
        --output output/dataset_lc_gp_list.csv
    outputs:
      highly_sensitive:
        cohort: output/dataset_lc_gp_list.csv

  generate_dataset_comparator_exclude_gp_no_long_covid:
    needs: [generate_list_gp_use_long_covid_dx]
    run: 
      databuilder:v0 generate-dataset
        analysis/dataset_definition_unmatched_comparator.py
        --output output/dataset_comparator_unmatched.csv
    outputs:
      highly_sensitive:
        cohort: output/dataset_comparator_unmatched.csv

  split_unmatched_data_by_stp_regions:
    needs: [generate_long_covid_exposure_dataset, generate_dataset_comparator_exclude_gp_no_long_covid]
    run: r:latest analysis/dm00_split_stp_for_matching.R
    outputs: 
      moderately_sensitive: 
        stp_exp_table: output/exp_stp_names_numbers.csv
        stp_com_table: output/com_stp_names_numbers.csv
      # highly_sensitive: 
      #   exp_stp_1: output/exp_stp_E84000005.csv
      #   exp_stp_2: output/exp_stp_E84000006.csv
      #   exp_stp_3: output/exp_stp_E84000007.csv
      #   exp_stp_4: output/exp_stp_E84000008.csv
      #   exp_stp_5: output/exp_stp_E84000009.csv
      #   exp_stp_6: output/exp_stp_E84000010.csv
      #   exp_stp_7: output/exp_stp_E84000012.csv
      #   exp_stp_8: output/exp_stp_E84000013.csv
      #   exp_stp_9: output/exp_stp_E84000014.csv
      #   exp_stp_10: output/exp_stp_E84000015.csv
      #   exp_stp_11: output/exp_stp_E84000016.csv
      #   exp_stp_12: output/exp_stp_E84000017.csv
      #   exp_stp_13: output/exp_stp_E84000020.csv
      #   exp_stp_14: output/exp_stp_E84000021.csv
      #   exp_stp_15: output/exp_stp_E84000022.csv
      #   exp_stp_16: output/exp_stp_E84000023.csv
      #   exp_stp_17: output/exp_stp_E84000024.csv
      #   exp_stp_18: output/exp_stp_E84000025.csv
      #   exp_stp_19: output/exp_stp_E84000026.csv
      #   exp_stp_20: output/exp_stp_E84000027.csv
      #   exp_stp_21: output/exp_stp_E84000029.csv
      #   exp_stp_22: output/exp_stp_E84000033.csv
      #   exp_stp_23: output/exp_stp_E84000035.csv
      #   exp_stp_24: output/exp_stp_E84000036.csv
      #   exp_stp_25: output/exp_stp_E84000037.csv
      #   exp_stp_26: output/exp_stp_E84000040.csv
      #   exp_stp_27: output/exp_stp_E84000041.csv
      #   exp_stp_28: output/exp_stp_E84000042.csv
      #   exp_stp_29: output/exp_stp_E84000043.csv
      #   exp_stp_30: output/exp_stp_E84000044.csv
      #   exp_stp_31: output/exp_stp_E84000049.csv
      #   com_stp_1: output/comp_stp_E84000005.csv
      #   com_stp_2: output/comp_stp_E84000006.csv
      #   com_stp_3: output/comp_stp_E84000007.csv
      #   com_stp_4: output/comp_stp_E84000008.csv
      #   com_stp_5: output/comp_stp_E84000009.csv
      #   com_stp_6: output/comp_stp_E84000010.csv
      #   com_stp_7: output/comp_stp_E84000012.csv
      #   com_stp_8: output/comp_stp_E84000013.csv
      #   com_stp_9: output/comp_stp_E84000014.csv
      #   com_stp_10: output/comp_stp_E84000015.csv
      #   com_stp_11: output/comp_stp_E84000016.csv
      #   com_stp_12: output/comp_stp_E84000017.csv
      #   com_stp_13: output/comp_stp_E84000020.csv
      #   com_stp_14: output/comp_stp_E84000021.csv
      #   com_stp_15: output/comp_stp_E84000022.csv
      #   com_stp_16: output/comp_stp_E84000023.csv
      #   com_stp_17: output/comp_stp_E84000024.csv
      #   com_stp_18: output/comp_stp_E84000025.csv
      #   com_stp_19: output/comp_stp_E84000026.csv
      #   com_stp_20: output/comp_stp_E84000027.csv
      #   com_stp_21: output/comp_stp_E84000029.csv
      #   com_stp_22: output/comp_stp_E84000033.csv
      #   com_stp_23: output/comp_stp_E84000035.csv
      #   com_stp_24: output/comp_stp_E84000036.csv
      #   com_stp_25: output/comp_stp_E84000037.csv
      #   com_stp_26: output/comp_stp_E84000040.csv
      #   com_stp_27: output/comp_stp_E84000041.csv
      #   com_stp_28: output/comp_stp_E84000042.csv
      #   com_stp_29: output/comp_stp_E84000043.csv
      #   com_stp_30: output/comp_stp_E84000044.csv
      #   com_stp_31: output/comp_stp_E84000049.csv

  # match_comparators:
  #   run:
  #     python:latest python analysis/match.py
  #   needs: [generate_dataset_comparator_exclude_gp_no_long_covid, generate_long_covid_exposure_dataset]
  #   outputs: 
  #     highly_sensitive:
  #       matched_cases: output/matched_cases.csv
  #       matched_matches: output/matched_matches.csv
  #       matched_all: output/matched_combined.csv
  #     moderately_sensitive: 
  #       matching_report: output/matching_report.txt
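
The cross-referencing described at the top of this page can be sketched in Python. This is a minimal illustration, not part of the project: the `PIPELINE` dict is a hand-copied subset of the project.yaml above, assumed to have already been parsed by a YAML loader, and the helper names `outputs_by_level` and `releasable` are hypothetical.

```python
# Hypothetical, hand-copied subset of the project.yaml shown above,
# represented as the dict a YAML loader would be expected to produce.
PIPELINE = {
    "generate_long_covid_exposure_dataset": {
        "outputs": {
            "highly_sensitive": {
                "cohort": "output/dataset_exp_lc_unmatched.csv",
            },
        },
    },
    "split_unmatched_data_by_stp_regions": {
        "outputs": {
            "moderately_sensitive": {
                "stp_exp_table": "output/exp_stp_names_numbers.csv",
                "stp_com_table": "output/com_stp_names_numbers.csv",
            },
        },
    },
}


def outputs_by_level(pipeline, action):
    """Return {security_level: sorted list of output paths} for one action."""
    outputs = pipeline[action].get("outputs") or {}
    return {level: sorted(files.values()) for level, files in outputs.items()}


def releasable(pipeline, action):
    """Paths that could be requested for release (moderately_sensitive only)."""
    return outputs_by_level(pipeline, action).get("moderately_sensitive", [])


if __name__ == "__main__":
    for name in PIPELINE:
        print(name, outputs_by_level(PIPELINE, name))
```

For the action run in this job request, only the two `*_stp_names_numbers.csv` tables are active outputs, since the per-region `highly_sensitive` entries are commented out in the pipeline.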

Timeline

Created: 30 Mar 2023 12:59:27 UTC
Started: 30 Mar 2023 12:59:05 UTC
Finished: 30 Mar 2023 13:01:12 UTC
Runtime: 00:02:07

These timestamps are generated and stored using the UTC timezone on the TPP backend.
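
The Runtime value is the difference between the Finished and Started timestamps. A small Python check of that arithmetic follows; the format string `FMT` and the helper `format_hms` are assumptions made for this illustration, not something the job server exposes.

```python
from datetime import datetime, timedelta

# Assumed format of the timestamps as displayed on this page (UTC).
FMT = "%d %b %Y %H:%M:%S"

started = datetime.strptime("30 Mar 2023 12:59:05", FMT)
finished = datetime.strptime("30 Mar 2023 13:01:12", FMT)


def format_hms(delta: timedelta) -> str:
    """Render a timedelta as HH:MM:SS, matching the Runtime field's layout."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


runtime = finished - started
print(format_hms(runtime))  # 00:02:07
```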

Job information

Status: Succeeded
Backend: TPP
Workspace: healthcare_utilisation_openprompt
Requested by: Liang-Yu Lin
Branch: main
Force run dependencies: No
Git commit hash: e1fa4a9
Requested actions: split_unmatched_data_by_stp_regions
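
The `needs` entries in the pipeline define which actions the requested action depends on. The sketch below walks those entries to list the upstream actions in an order where each one comes after everything it needs. It illustrates the dependency structure only and is not the job runner's actual scheduling logic; the `NEEDS` dict is copied by hand from the active actions in project.yaml, and `upstream_order` is a hypothetical helper.

```python
# Dependencies copied by hand from the `needs` keys of the active actions
# in project.yaml; actions without a `needs` key map to an empty list.
NEEDS = {
    "generate_long_covid_exposure_dataset": [],
    "check_stp_regions": ["generate_long_covid_exposure_dataset"],
    "generate_list_gp_use_long_covid_dx": [],
    "generate_dataset_comparator_exclude_gp_no_long_covid": [
        "generate_list_gp_use_long_covid_dx",
    ],
    "split_unmatched_data_by_stp_regions": [
        "generate_long_covid_exposure_dataset",
        "generate_dataset_comparator_exclude_gp_no_long_covid",
    ],
}


def upstream_order(needs, action):
    """Depth-first post-order walk: dependencies first, requested action last."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dependency in needs[name]:
            visit(dependency)
        order.append(name)

    visit(action)
    return order


if __name__ == "__main__":
    for step in upstream_order(NEEDS, "split_unmatched_data_by_stp_regions"):
        print(step)
```

The Jobs list above contains only the requested action. With Force run dependencies set to No, that is consistent with the upstream outputs already existing in the workspace, although this page does not state that directly.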
