Job request: 15037
- Organisation: Bennett Institute
- Workspace: mabsavs-usernonuser-ccw
- ID: pgfm5fmshagboysn
This page shows the technical details of what happened when the authorised researcher Linda Nab requested one or more actions to be run against real patient data within a secure environment.
By cross-referencing the list of jobs with the pipeline section below, you can infer which security level each output was written to.
The output security levels are:
- highly_sensitive
  - Researchers can never directly view these outputs
  - Researchers can only request that code be run against them
- moderately_sensitive
  - Can be viewed by an approved researcher by logging into a highly secure environment
  - These are the only outputs that can be requested for public release via a controlled output review service.
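The cross-referencing described above can also be done mechanically. The following is a minimal sketch in Python, assuming the pipeline has already been loaded into a dictionary. Here that dictionary is a hand-copied excerpt of two actions from the project.yaml on this page, and the helper name `output_levels` is hypothetical, not part of any OpenSAFELY tool.

```python
# Excerpt of the pipeline on this page, hand-copied into a dict for illustration.
PIPELINE = {
    "data_process": {
        "outputs": {
            "highly_sensitive": {
                "data": "output/data/data_processed.rds",
                "rds": "output/data_properties/n_excluded.rds",
            }
        }
    },
    "create_flowchart": {
        "outputs": {
            "moderately_sensitive": {
                "csv1": "output/tables/flowchart/flowchart.csv",
                "csv2": "output/tables/flowchart/flowchart_redacted.csv",
            }
        }
    },
}


def output_levels(pipeline, action):
    """Return {output path: security level} for one action (hypothetical helper)."""
    levels = {}
    for level, files in pipeline[action]["outputs"].items():
        for path in files.values():
            levels[path] = level
    return levels


if __name__ == "__main__":
    for action in ("data_process", "create_flowchart"):
        for path, level in output_levels(PIPELINE, action).items():
            print(f"{action}: {path} -> {level}")
```

Applied to the jobs listed on this page, a lookup like this would show that data_process writes only highly_sensitive files, while create_flowchart writes only moderately_sensitive ones.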
Jobs
- Job identifier: u3yg7irixdejfhbc
- Job identifier: ue3ddmfaryu3l4l3
- Job identifier: sqmqiwqnatfye24t
- Job identifier: sxzow3rqi5o63rqw
- Job identifier: it2pt3iwbwes7ixz
- Job identifier: 2c4kht5twbm5taen
Pipeline
project.yaml
version: '3.0'

expectations:
  population_size: 100000

actions:

  ## # # # # # # # # # # # # # # # # # # #
  ## Data extraction
  ## # # # # # # # # # # # # # # # # # # #

  generate_study_population:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input.csv.gz

  generate_study_population_ba2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_ba2 --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_ba2.csv.gz

  generate_study_population_flowchart:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_flowchart --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_flowchart.csv.gz

  generate_study_population_flowchart_ba2:
    run: cohortextractor:latest generate_cohort --study-definition study_definition_flowchart_ba2 --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/input_flowchart_ba2.csv.gz

  ## # # # # # # # # # # # # # # # # # # #
  ## Data cleaning and description
  ## # # # # # # # # # # # # # # # # # # #

  data_process:
    run: r:latest analysis/data_process.R ba1
    needs: [generate_study_population]
    outputs:
      highly_sensitive:
        data: output/data/data_processed.rds
        rds: output/data_properties/n_excluded.rds

  data_process_ba2:
    run: r:latest analysis/data_process.R ba2
    needs: [generate_study_population_ba2]
    outputs:
      highly_sensitive:
        data: output/data/ba2_data_processed.rds
        rds: output/data_properties/ba2_n_excluded.rds

  data_process_flowchart:
    run: r:latest analysis/data_process_flowchart.R ba1
    needs: [generate_study_population_flowchart]
    outputs:
      highly_sensitive:
        data: output/data/data_flowchart_processed.rds

  data_process_flowchart_ba2:
    run: r:latest analysis/data_process_flowchart.R ba2
    needs: [generate_study_population_flowchart_ba2]
    outputs:
      highly_sensitive:
        data: output/data/ba2_data_flowchart_processed.rds

  data_properties:
    run: r:latest analysis/data_properties/data_properties.R output/data/data_processed.rds output/data_properties
    needs: [data_process]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/data_processed_skim.txt
        txt2: output/data_properties/data_processed_coltypes.txt
        txt3: output/data_properties/data_processed_tabulate.txt

  data_properties_ba2:
    run: r:latest analysis/data_properties/data_properties.R output/data/ba2_data_processed.rds output/data_properties
    needs: [data_process_ba2]
    outputs:
      moderately_sensitive:
        txt1: output/data_properties/ba2_data_processed_skim.txt
        txt2: output/data_properties/ba2_data_processed_coltypes.txt
        txt3: output/data_properties/ba2_data_processed_tabulate.txt

  create_flowchart:
    run: r:latest analysis/flowchart.R ba1
    needs: [data_process_flowchart, data_process]
    outputs:
      moderately_sensitive:
        csv1: output/tables/flowchart/flowchart.csv
        csv2: output/tables/flowchart/flowchart_redacted.csv

  create_flowchart_ba2:
    run: r:latest analysis/flowchart.R ba2
    needs: [data_process_flowchart_ba2, data_process_ba2]
    outputs:
      moderately_sensitive:
        csv1: output/tables/flowchart/ba2_flowchart.csv
        csv2: output/tables/flowchart/ba2_flowchart_redacted.csv

  sense_check:
    run: r:latest analysis/data_properties/sense_check.R ba1
    needs: [data_process]
    outputs:
      moderately_sensitive:
        csv: output/data_properties/sense_checks.txt

  sense_check_ba2:
    run: r:latest analysis/data_properties/sense_check.R ba2
    needs: [data_process_ba2]
    outputs:
      moderately_sensitive:
        csv: output/data_properties/ba2_sense_checks.txt

  ## # # # # # # # # # # # # # # # # # # #
  ## CCW Analysis - Day 0
  ## # # # # # # # # # # # # # # # # # # #

  ## # # # # # # # # # # # # # # # # # # #
  ## Tables
  ## # # # # # # # # # # # # # # # # # # #

  #create_table1:
  #  run: r:latest analysis/table_1.R ba1
  #  needs: [data_process]
  #  outputs:
  #    moderately_sensitive:
  #      html: output/tables/table1_redacted.html

  #create_table1_ba2:
  #  run: r:latest analysis/table_1.R ba2
  #  needs: [data_process_ba2]
  #  outputs:
  #    moderately_sensitive:
  #      html: output/tables/ba2_table1_redacted.html
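
The `needs` entries in the pipeline define a dependency graph: an action can only run once every action it needs has produced its outputs. As an illustration only, the sketch below derives one valid run order for the ba1 actions using Python's standard-library `graphlib` (Python 3.9+). The `NEEDS` dictionary is copied by hand from the pipeline above, and `run_order` is a hypothetical helper, not how the job runner itself schedules work.

```python
from graphlib import TopologicalSorter

# `needs` entries copied from the pipeline above (ba1 actions only).
# Actions without a `needs` key have no dependencies.
NEEDS = {
    "generate_study_population": [],
    "generate_study_population_flowchart": [],
    "data_process": ["generate_study_population"],
    "data_process_flowchart": ["generate_study_population_flowchart"],
    "data_properties": ["data_process"],
    "create_flowchart": ["data_process_flowchart", "data_process"],
}


def run_order(needs):
    """Return one valid order in which every action follows all it needs."""
    return list(TopologicalSorter(needs).static_order())


if __name__ == "__main__":
    for position, action in enumerate(run_order(NEEDS), start=1):
        print(position, action)
```

Several orders satisfy the same constraints, so the exact sequence printed may differ between runs or Python versions; only the relative ordering of dependent actions is guaranteed.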
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:09:10
These timestamps are generated and stored using the UTC timezone on the TPP backend.
Job request
- Status: Succeeded
- Backend: TPP
- Workspace: mabsavs-usernonuser-ccw
- Requested by: Linda Nab
- Branch: ccw-analysis
- Force run dependencies: No
- Git commit hash: e462613
- Requested actions:
  - data_process
  - data_process_ba2
  - data_properties
  - data_properties_ba2
  - create_flowchart
  - create_flowchart_ba2
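
The request names six actions, and six jobs are listed, yet several requested actions need upstream actions that were not requested. The sketch below, in Python, lists those upstream actions by walking the `needs` entries of the pipeline. The `NEEDS` and `REQUESTED` values are copied by hand from this page, and `unrequested_dependencies` is a hypothetical helper. How the backend satisfied these dependencies (for example, by reusing outputs from an earlier run) is not stated on this page.

```python
# `needs` entries copied from the pipeline on this page.
NEEDS = {
    "data_process": ["generate_study_population"],
    "data_process_ba2": ["generate_study_population_ba2"],
    "data_process_flowchart": ["generate_study_population_flowchart"],
    "data_process_flowchart_ba2": ["generate_study_population_flowchart_ba2"],
    "data_properties": ["data_process"],
    "data_properties_ba2": ["data_process_ba2"],
    "create_flowchart": ["data_process_flowchart", "data_process"],
    "create_flowchart_ba2": ["data_process_flowchart_ba2", "data_process_ba2"],
}

# Requested actions, as listed in the job request.
REQUESTED = [
    "data_process",
    "data_process_ba2",
    "data_properties",
    "data_properties_ba2",
    "create_flowchart",
    "create_flowchart_ba2",
]


def unrequested_dependencies(needs, requested):
    """Return upstream actions (direct or transitive) absent from the request."""
    seen = set()
    stack = list(requested)
    while stack:
        action = stack.pop()
        for dep in needs.get(action, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return sorted(seen - set(requested))


if __name__ == "__main__":
    for action in unrequested_dependencies(NEEDS, REQUESTED):
        print(action)
```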