Job request: 24991

Organisation: University of Manchester
Workspace: openpregnosis_main
ID: yxcj7s7hwv2mvss2

This page shows the technical details of what happened when the authorised researcher Paolo Mazzone requested one or more actions to be run against real patient data in the project, within a secure environment.

By cross-referencing the list of jobs with the pipeline section below, you can infer the security level at which each output was written. Researchers can never directly view outputs marked as highly_sensitive; they can only request that code runs against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher who logs into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
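
The cross-referencing described above can be sketched in a few lines of Python. This is an illustration only, not part of the project's code: `ACTIONS` is a hand-copied excerpt of two actions from the pipeline, written as a plain dict so the sketch needs no YAML parser, and `output_levels` and `releasable` are hypothetical helper names.

```python
# Hypothetical excerpt of the pipeline as a plain dict.
# The keys mirror the structure of project.yaml (actions -> outputs -> level).
ACTIONS = {
    "generate_diagnostic_event_counts": {
        "outputs": {
            "highly_sensitive": {
                "dataset": "output/diagnostic_event_counts.csv.gz",
            },
        },
    },
    "analyze_diagnostic_event_counts": {
        "outputs": {
            "moderately_sensitive": {
                "analysis_results": "output/diagnostic_analysis_results.txt",
                "recommendations": "output/episode_count_recommendations.csv",
            },
        },
    },
}


def output_levels(actions):
    """Map each output path to the security level it is declared under."""
    levels = {}
    for action in actions.values():
        for level, files in action.get("outputs", {}).items():
            for path in files.values():
                levels[path] = level
    return levels


def releasable(actions):
    """Return the paths declared moderately_sensitive, sorted by name."""
    levels = output_levels(actions)
    return sorted(p for p, lvl in levels.items() if lvl == "moderately_sensitive")


if __name__ == "__main__":
    for path in releasable(ACTIONS):
        print(path)
```

Run against this excerpt, only the two outputs of analyze_diagnostic_event_counts are listed; the dataset written by generate_diagnostic_event_counts is highly_sensitive and is therefore excluded.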

Jobs

Pipeline

project.yaml
version: '3.0'

# Ignore this `expectations` block. It is required but not used, and will be removed in future versions.
expectations:
  population_size: 1000

actions:
## Diagnostic analysis for event counts ##

  generate_diagnostic_event_counts:
    run: ehrql:v1 generate-dataset analysis/01_diagnostic_event_counts.py --output output/diagnostic_event_counts.csv.gz
    outputs:
      highly_sensitive:
        dataset: output/diagnostic_event_counts.csv.gz

  analyze_diagnostic_event_counts:
    run: r:latest analysis/02_analyse_diagnostic.R
    needs: [generate_diagnostic_event_counts]
    outputs:
      moderately_sensitive:
        analysis_results: output/diagnostic_analysis_results.txt
        recommendations: output/episode_count_recommendations.csv

## Main pregnancy event extraction - Split by category ##

  generate_livebirth_data:
    run: ehrql:v1 generate-dataset analysis/03a_livebirth_dataset_definition.py --output output/livebirth_dataset.csv.gz
    outputs:
      highly_sensitive:
        dataset: output/livebirth_dataset.csv.gz

  generate_stillbirth_data:
    run: ehrql:v1 generate-dataset analysis/03b_stillbirth_dataset_definition.py --output output/stillbirth_dataset.csv.gz
    outputs:
      highly_sensitive:
        dataset: output/stillbirth_dataset.csv.gz

  generate_early_loss_data:
    run: ehrql:v1 generate-dataset analysis/03c_early_loss_dataset_definition.py --output output/early_loss_dataset.csv.gz
    outputs:
      highly_sensitive:
        dataset: output/early_loss_dataset.csv.gz

  generate_pregnancy_start_data:
    run: ehrql:v1 generate-dataset analysis/03d_pregnancy_start_dataset_definition.py --output output/pregnancy_start_dataset.csv.gz
    outputs:
      highly_sensitive:
        dataset: output/pregnancy_start_dataset.csv.gz

  create_individual_datasets:
    run: r:latest analysis/create_individual_datasets.R
    needs: [generate_livebirth_data, generate_stillbirth_data, generate_early_loss_data, generate_pregnancy_start_data]
    outputs:
      highly_sensitive:
        individual_datasets_summary: output/individual_datasets/individual_datasets_summary.txt

## Codelist Export ##

  export_codelists_for_r:
    run: python:latest analysis/16_export_codelists.py
    outputs:
      highly_sensitive:
        delivery_codes: output/codelists/delivery_codes.csv
        stillbirth_codes: output/codelists/stillbirth_codes.csv
        miscarriage_codes: output/codelists/miscarriage_codes.csv
        ectopic_codes: output/codelists/ectopic_codes.csv
        molar_codes: output/codelists/molar_codes.csv
        blighted_ovum_codes: output/codelists/blighted_ovum_codes.csv
        termination_codes: output/codelists/termination_codes.csv
        postnatal_codes: output/codelists/postnatal_codes.csv
        antenatal_codes: output/codelists/antenatal_codes.csv
        postterm_codes: output/codelists/postterm_codes.csv
        lmp_codes: output/codelists/lmp_codes.csv
        edd_codes: output/codelists/edd_codes.csv
        edc_codes: output/codelists/edc_codes.csv
        # dating_scan_codes: output/codelists/dating_scan_codes.csv  # COMMENTED OUT - DATING SCAN DATA IS MASKED
        multi_pregnancy_codes: output/codelists/multi_pregnancy_codes.csv
        preeclampsia_codes: output/codelists/preeclampsia_codes.csv
        codelist_summary: output/codelists/codelist_summary.csv

## Data processing and validation ##

  convert_stillbirth_episodes:
    run: r:latest analysis/04_convert_stillbirth_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        stillbirth_events_raw: output/raw_clinical_events/stillbirth_events_raw.csv
        stillbirth_events_summary: output/raw_clinical_events/stillbirth_events_summary.csv

  convert_livebirth_episodes:
    run: r:latest analysis/05_convert_livebirth_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        livebirth_events_raw: output/raw_clinical_events/livebirth_events_raw.csv
        livebirth_events_summary: output/raw_clinical_events/livebirth_events_summary.csv

  convert_miscarriage_episodes:
    run: r:latest analysis/06_convert_miscarriage_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        miscarriage_events_raw: output/raw_clinical_events/miscarriage_events_raw.csv
        miscarriage_events_summary: output/raw_clinical_events/miscarriage_events_summary.csv

  convert_ectopic_episodes:
    run: r:latest analysis/07_convert_ectopic_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        ectopic_events_raw: output/raw_clinical_events/ectopic_events_raw.csv
        ectopic_events_summary: output/raw_clinical_events/ectopic_events_summary.csv

  convert_blighted_ovum_episodes:
    run: r:latest analysis/08_convert_blighted_ovum_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        blighted_ovum_events_raw: output/raw_clinical_events/blighted_ovum_events_raw.csv
        blighted_ovum_events_summary: output/raw_clinical_events/blighted_ovum_events_summary.csv

  convert_molar_episodes:
    run: r:latest analysis/09_convert_molar_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        molar_events_raw: output/raw_clinical_events/molar_events_raw.csv
        molar_events_summary: output/raw_clinical_events/molar_events_summary.csv

  # convert_tops_episodes: - COMMENTED OUT - TOPS DATA IS MASKED
  #   run: r:latest analysis/10_convert_tops_episodes.R
  #   needs: [create_individual_datasets]
  #   outputs:
  #     highly_sensitive:
  #       tops_events_long: output/processed/tops_events_long.csv
  #       tops_pregnancies_wide: output/processed/tops_pregnancies_wide.csv
  #       tops_pregnancy_summary: output/processed/tops_pregnancy_summary.csv

## Early Pregnancy Marker Processing ##

  convert_lmp_episodes:
    run: r:latest analysis/11_convert_lmp_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        lmp_events_raw: output/raw_clinical_events/lmp_events_raw.csv
        lmp_events_summary: output/raw_clinical_events/lmp_events_summary.csv

  convert_edd_episodes:
    run: r:latest analysis/12_convert_edd_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        edd_events_raw: output/raw_clinical_events/edd_events_raw.csv
        edd_events_summary: output/raw_clinical_events/edd_events_summary.csv

  convert_edc_episodes:
    run: r:latest analysis/13_convert_edc_episodes.R
    needs: [create_individual_datasets, export_codelists_for_r]
    outputs:
      highly_sensitive:
        edc_events_raw: output/raw_clinical_events/edc_events_raw.csv
        edc_events_summary: output/raw_clinical_events/edc_events_summary.csv

  # convert_dating_scan_episodes: - COMMENTED OUT - DATING SCAN DATA IS MASKED
  #   run: r:latest analysis/14_convert_dating_scan_episodes.R
  #   needs: [create_individual_datasets]
  #   outputs:
  #     highly_sensitive:
  #       dating_scan_events_long: output/processed/dating_scan_events_long.csv
  #       dating_scan_pregnancies_wide: output/processed/dating_scan_pregnancies_wide.csv
  #       dating_scan_pregnancy_summary: output/processed/dating_scan_pregnancy_summary.csv



## Harmonised Algorithm ##

  run_harmonised_algorithm:
    run: r:latest analysis/17_execute_harmonised_algorithm.R
    needs: [convert_stillbirth_episodes, convert_livebirth_episodes, convert_miscarriage_episodes, convert_ectopic_episodes, convert_blighted_ovum_episodes, convert_molar_episodes, convert_lmp_episodes, convert_edd_episodes, convert_edc_episodes, export_codelists_for_r]
    outputs:
      highly_sensitive:
        harmonised_episodes: output/harmonised_components/harmonised_episodes.csv
        harmonised_summary_stats: output/harmonised_components/harmonised_summary_stats.csv

  test_harmonised_algorithm:
    run: r:latest analysis/18_run_test.R
    needs: [generate_livebirth_data, generate_stillbirth_data, generate_early_loss_data, generate_pregnancy_start_data, analyze_diagnostic_event_counts, run_harmonised_algorithm, export_codelists_for_r]
    outputs:
      highly_sensitive:
        harmonised_episodes: output/harmonised_test_results/harmonised_episodes.csv
        test_report: output/harmonised_test_results/test_report.txt
        comparison_results: output/harmonised_test_results/comparison_results.rds

## Complete workflow ##

  run_all:
    run: echo:v1 "Complete pregnancy outcome analysis pipeline executed successfully" > output/pipeline_complete.txt
    needs: [generate_livebirth_data, generate_stillbirth_data, generate_early_loss_data, generate_pregnancy_start_data, analyze_diagnostic_event_counts, convert_stillbirth_episodes, convert_livebirth_episodes, convert_miscarriage_episodes, convert_ectopic_episodes, convert_blighted_ovum_episodes, convert_molar_episodes, convert_lmp_episodes, convert_edd_episodes, convert_edc_episodes, run_harmonised_algorithm, test_harmonised_algorithm]
    outputs:
      highly_sensitive:
        complete_pipeline: output/pipeline_complete.txt
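
The `needs` lists above form a dependency graph, so an action can depend on others that it never names directly. The following is a minimal sketch of how to compute those indirect dependencies, assuming the `needs` lists are transcribed by hand into a dict. `GENERATE`, `CONVERT`, `NEEDS`, `transitive_needs` and `indirect_needs` are hypothetical names introduced for illustration; this is not how the job runner itself resolves dependencies.

```python
GENERATE = [
    "generate_livebirth_data", "generate_stillbirth_data",
    "generate_early_loss_data", "generate_pregnancy_start_data",
]
CONVERT = [
    "convert_stillbirth_episodes", "convert_livebirth_episodes",
    "convert_miscarriage_episodes", "convert_ectopic_episodes",
    "convert_blighted_ovum_episodes", "convert_molar_episodes",
    "convert_lmp_episodes", "convert_edd_episodes", "convert_edc_episodes",
]

# Direct dependencies, transcribed from the `needs` entries in project.yaml.
NEEDS = {name: [] for name in GENERATE}
NEEDS.update({
    "generate_diagnostic_event_counts": [],
    "analyze_diagnostic_event_counts": ["generate_diagnostic_event_counts"],
    "create_individual_datasets": list(GENERATE),
    "export_codelists_for_r": [],
})
NEEDS.update({
    name: ["create_individual_datasets", "export_codelists_for_r"]
    for name in CONVERT
})
NEEDS["run_harmonised_algorithm"] = CONVERT + ["export_codelists_for_r"]
NEEDS["test_harmonised_algorithm"] = GENERATE + [
    "analyze_diagnostic_event_counts",
    "run_harmonised_algorithm",
    "export_codelists_for_r",
]
NEEDS["run_all"] = GENERATE + ["analyze_diagnostic_event_counts"] + CONVERT + [
    "run_harmonised_algorithm",
    "test_harmonised_algorithm",
]


def transitive_needs(action, needs):
    """Return every action reachable from `action` by following `needs`."""
    seen = set()
    stack = list(needs[action])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(needs[dep])
    return seen


def indirect_needs(action, needs):
    """Return dependencies that `action` relies on but does not list itself."""
    return transitive_needs(action, needs) - set(needs[action])


if __name__ == "__main__":
    for name in sorted(indirect_needs("run_all", NEEDS)):
        print(name)
```

For `run_all`, the sketch reports three indirect dependencies: create_individual_datasets, export_codelists_for_r and generate_diagnostic_event_counts. None of them appears in its own `needs` list, but each is required by an action that does.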

Timeline

  • Created:

  • Started:

  • Finished:

  • Runtime: 00:29:28

These timestamps are generated and stored using the UTC timezone on the TPP backend.

Job information

Status: Failed
Backend: TPP
Workspace: openpregnosis_main
Requested by: Paolo Mazzone
Branch: main
Force run dependencies: No
Git commit hash: ca13b98
Requested actions:
  • generate_livebirth_data
  • generate_stillbirth_data
  • generate_early_loss_data
  • generate_pregnancy_start_data
  • create_individual_datasets
  • export_codelists_for_r
  • convert_stillbirth_episodes
  • convert_livebirth_episodes
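
The eight requested actions cannot run in an arbitrary order, because each one must wait for the actions named in its `needs` list. The sketch below derives one valid order using the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). `REQUESTED_NEEDS` and `run_order` are hypothetical names, the dict is transcribed by hand from project.yaml, and the result illustrates the ordering constraint only; it does not claim to reproduce the backend's actual scheduling.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

GENERATE = [
    "generate_livebirth_data", "generate_stillbirth_data",
    "generate_early_loss_data", "generate_pregnancy_start_data",
]

# Requested actions mapped to their direct `needs` (hand-copied from project.yaml).
REQUESTED_NEEDS = {
    "generate_livebirth_data": [],
    "generate_stillbirth_data": [],
    "generate_early_loss_data": [],
    "generate_pregnancy_start_data": [],
    "create_individual_datasets": list(GENERATE),
    "export_codelists_for_r": [],
    "convert_stillbirth_episodes": [
        "create_individual_datasets", "export_codelists_for_r",
    ],
    "convert_livebirth_episodes": [
        "create_individual_datasets", "export_codelists_for_r",
    ],
}


def run_order(needs):
    """Return one order in which every action follows all of its needs."""
    # TopologicalSorter expects a mapping of node -> predecessors,
    # which is exactly the shape of the `needs` lists.
    return list(TopologicalSorter(needs).static_order())


if __name__ == "__main__":
    for step, action in enumerate(run_order(REQUESTED_NEEDS), start=1):
        print(step, action)
```

The order is not unique: the four generate_* actions and export_codelists_for_r have no dependencies, so they may appear in any relative order, while both convert_* actions always come after create_individual_datasets and export_codelists_for_r.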
