Job request: 9753
- Organisation: Bennett Institute
- Workspace: bmi-short-data-report-segmented
- ID: iqz4kpvhf5xyayro
This page shows the technical details of what happened when the authorised researcher Robin Park requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs; they can only request that code be run against them.
- moderately_sensitive
  - These outputs can be viewed by an approved researcher by logging into a highly secure environment.
  - These are the only outputs that can be requested for public release via a controlled output review service.
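To illustrate how these two levels appear in a pipeline definition, here is a hypothetical action declaring one output at each level (the action name, script, and file paths are examples only, not part of this pipeline):

```yaml
my_analysis:
  run: python:latest python analysis/my_analysis.py
  outputs:
    highly_sensitive:
      # row-level patient data; never directly viewable by researchers
      cohort: output/data/cohort.feather
    moderately_sensitive:
      # aggregated results; viewable in the secure environment and
      # eligible for release via output review
      table: output/tables/summary.csv
```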
Pipeline
project.yaml:
version: '3.0'

expectations:
  population_size: 1000

actions:

  generate_study_population:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --output-dir=output/data --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input.feather

  generate_study_population_derived_bmi:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_derived_bmi --output-dir=output/data --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_derived_bmi.feather

  generate_study_population_recorded_bmi:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_recorded_bmi --output-dir=output/data --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_recorded_bmi.feather

  generate_study_population_snomed_hw:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_snomed_hw --output-dir=output/data --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_snomed_hw.feather

  generate_study_population_ctv3_hw:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ctv3_hw --output-dir=output/data --output-format feather
    outputs:
      highly_sensitive:
        cohort: output/data/input_ctv3_hw.feather

  preprocess_derived_bmi_input:
    run: python:latest python analysis/preprocess_bmi_inputs.py "derived_bmi" --output-format feather
    needs: [generate_study_population_derived_bmi]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/data/input_processed_derived_bmi.feather

  preprocess_recorded_bmi_input:
    run: python:latest python analysis/preprocess_bmi_inputs.py "recorded_bmi" --output-format feather
    needs: [generate_study_population_recorded_bmi]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/data/input_processed_recorded_bmi.feather

  preprocess_computed_bmi_input:
    run: python:latest python analysis/preprocess_hw_inputs.py "height" "weight" "snomed" "computed_bmi" --output-format feather
    needs: [generate_study_population_snomed_hw]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/data/input_processed_computed_bmi.feather

  preprocess_backend_computed_bmi_input:
    run: python:latest python analysis/preprocess_hw_inputs.py "height_backend" "weight_backend" "ctv3" "backend_computed_bmi" --output-format feather
    needs: [generate_study_population_ctv3_hw]
    outputs:
      highly_sensitive:
        cohort_with_duration: output/data/input_processed_backend_computed_bmi.feather

  join_cohorts:
    run: >
      cohort-joiner:v0.0.35
        --lhs output/data/input_processed*.feather
        --rhs output/data/input.feather
        --output-dir output/joined
    needs: [generate_study_population, preprocess_derived_bmi_input, preprocess_recorded_bmi_input, preprocess_computed_bmi_input, preprocess_backend_computed_bmi_input]
    outputs:
      highly_sensitive:
        cohort: output/joined/input_processed*.feather

  preprocess_age_dates:
    run: python:latest python analysis/preprocess_age_dates.py --output-format feather
    needs: [join_cohorts]
    outputs:
      highly_sensitive:
        cohort1: output/joined/input_processed_backend_computed_bmi.feather
        cohort2: output/joined/input_processed_computed_bmi.feather
        cohort3: output/joined/input_processed_derived_bmi.feather
        cohort4: output/joined/input_processed_recorded_bmi.feather

  execute_validation_analyses_derived_bmi:
    run: python:latest python analysis/validation_script_single_definition.py "derived_bmi"
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/derived_bmi/*.csv

  execute_validation_analyses_recorded_bmi:
    run: python:latest python analysis/validation_script_single_definition.py "recorded_bmi"
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/recorded_bmi/*.csv

  execute_validation_analyses_computed_bmi:
    run: python:latest python analysis/validation_script_single_definition.py "computed_bmi"
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/computed_bmi/*.csv

  execute_validation_analyses_backend_computed_bmi:
    run: python:latest python analysis/validation_script_single_definition.py "backend_computed_bmi"
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/backend_computed_bmi/*.csv

  execute_validation_analyses_high_computed_bmi:
    run: python:latest python analysis/validation_script_high_bmi.py "computed_bmi"
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/high_computed_bmi/*.csv

  execute_validation_analyses_high_backend_computed_bmi:
    run: python:latest python analysis/validation_script_high_bmi.py "backend_computed_bmi"
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/high_backend_computed_bmi/*.csv

  execute_validation_analyses_all_definitions:
    run: python:latest python analysis/validation_script_all_definitions.py
    needs: [preprocess_age_dates]
    outputs:
      moderately_sensitive:
        tables: output/validation/tables/comparison/*.csv

  consolidate_analyses:
    run: python:latest python analysis/consolidate_analyses.py
    needs: [execute_validation_analyses_derived_bmi, execute_validation_analyses_recorded_bmi, execute_validation_analyses_backend_computed_bmi, execute_validation_analyses_computed_bmi]
    outputs:
      moderately_sensitive:
        tables: output/validation/formatted_tables/*.csv
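A pipeline like the one above can also be exercised outside the secure backend, against locally generated dummy data rather than real patient records. A minimal sketch, assuming the OpenSAFELY CLI is installed (it is not part of this pipeline):

```shell
# Install the OpenSAFELY CLI, then run a single named action together with
# everything it depends on, using dummy data generated locally:
pip install opensafely
opensafely run consolidate_analyses
```

Because `consolidate_analyses` sits at the end of the dependency graph, requesting it alone is enough to pull in the upstream extraction, preprocessing, and validation actions via their `needs` declarations.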
Timeline
- Created:
- Started:
- Finished:
- Runtime:
These timestamps are generated and stored in UTC on the TPP backend.
Job request
- Status: Failed (JobRequestError: Internal error)
- Backend: TPP
- Workspace: bmi-short-data-report-segmented
- Requested by: Robin Park
- Branch: separate-study-definitions
- Force run dependencies: No
- Git commit hash: 5581d88
- Requested actions:
  - consolidate_analyses