This page shows the technical details of what happened when authorised researcher Millie Green requested one or more actions to be run against real patient data in the project, within a secure environment.
By cross-referencing the indicated Requested Actions with the Pipeline section below, you can infer which security level the various outputs were written to. Outputs marked as highly_sensitive can never be viewed directly by a researcher; researchers can only request that code be run against them. Outputs marked as moderately_sensitive can be viewed by an approved researcher by logging into a highly secure environment. Only outputs marked as moderately_sensitive can be requested for release to the public, via a controlled output review service.
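The access rules described above can be written down as a small lookup table. The following is a minimal sketch in Python; the names ACCESS_RULES, can_view and can_request_release are hypothetical and chosen only for illustration, they are not part of any real platform API.

```python
# Access rules as stated on this page. The dictionary layout and the
# function names are illustrative assumptions, not a real API.
ACCESS_RULES = {
    # Never viewable directly; researchers may only run code against these.
    "highly_sensitive": {"viewable": False, "releasable": False},
    # Viewable by an approved researcher inside the secure environment,
    # and eligible for release through the output review service.
    "moderately_sensitive": {"viewable": True, "releasable": True},
}


def can_view(level: str) -> bool:
    """Return True if an approved researcher may view outputs at this level."""
    return ACCESS_RULES[level]["viewable"]


def can_request_release(level: str) -> bool:
    """Return True if outputs at this level may be requested for release."""
    return ACCESS_RULES[level]["releasable"]
```

For example, can_view("highly_sensitive") evaluates to False, while can_request_release("moderately_sensitive") evaluates to True.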
Jobs
- Job identifier: 4znuuj3fstzmqgdq
- Job identifier: gtn4k3x3kusvm5fu
- Job identifier: lprpzrw3bf45cklo
Pipeline
project.yaml
################################################################################
#
# Description: This script defines the project pipeline - it specifies the
# execution order for all the code in this repo using a series of
# actions.
#
# Author(s): M Green
# Date last updated: 11/02/2022
#
################################################################################
version: '3.0'

expectations:
  population_size: 100000

actions:

  # Extract data ----
  extract_data:
    run: cohortextractor:latest generate_cohort --study-definition study_definition --output-dir=output/data --output-format=csv.gz
    outputs:
      highly_sensitive:
        cohort: output/data/input.csv.gz

  # Data processing ----
  data_process:
    run: r:latest analysis/process/process_data.R
    needs: [extract_data]
    outputs:
      highly_sensitive:
        data: output/data/data_processed*.rds

  # Data summaries ----
  data_properties:
    run: r:latest analysis/descriptive/data_properties.R output/data/data_processed.rds output/data_properties
    needs: [data_process]
    outputs:
      moderately_sensitive:
        cohort: output/data_properties/data_processed*.txt

  # Report ----
  report_data:
    run: r:latest analysis/descriptive/coverage_report_data.R
    needs: [data_process]
    outputs:
      moderately_sensitive:
        redacted_tables: output/coverage/table_*.csv
        unredacted_tables: output/coverage/for-checks/table_*.csv
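The needs entries in the pipeline above define a dependency graph between actions. As a rough sketch of how a valid execution order could be derived from it, the Python snippet below uses the standard-library graphlib module (Python 3.9+). The NEEDS dictionary is transcribed by hand from the pipeline, and execution_order is a hypothetical helper; this is not how the job runner itself is implemented.

```python
from graphlib import TopologicalSorter

# Each action mapped to the actions it needs, transcribed by hand from the
# project.yaml above (an action without a "needs" key has no dependencies).
NEEDS = {
    "extract_data": [],
    "data_process": ["extract_data"],
    "data_properties": ["data_process"],
    "report_data": ["data_process"],
}


def execution_order(needs):
    """Return one valid order in which every action runs after its dependencies."""
    return list(TopologicalSorter(needs).static_order())


order = execution_order(NEEDS)
print(order)
```

Any order produced this way places extract_data before data_process, and data_process before both data_properties and report_data; the relative order of the last two is not constrained by the pipeline.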
Timeline
- Created:
- Started:
- Finished:
- Runtime: 00:16:32
These timestamps are generated and stored using the UTC timezone on the backend.
Job information
- Status: Succeeded
- Backend: TPP
- Workspace: mabs-draft-report
- Requested by: Millie Green
- Branch: draft-report
- Force run dependencies: No
- Git commit hash: 7fce014
- Requested actions:
  - data_process
  - data_properties
  - report_data
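
The cross-referencing of requested actions against the pipeline, as described at the top of this page, can be sketched in Python as follows. PIPELINE_OUTPUTS and REQUESTED are transcribed by hand from this page, and outputs_by_level is a hypothetical helper written only for illustration.

```python
# Outputs declared per action, grouped by security level, transcribed by
# hand from the project.yaml shown on this page.
PIPELINE_OUTPUTS = {
    "extract_data": {
        "highly_sensitive": {"cohort": "output/data/input.csv.gz"},
    },
    "data_process": {
        "highly_sensitive": {"data": "output/data/data_processed*.rds"},
    },
    "data_properties": {
        "moderately_sensitive": {
            "cohort": "output/data_properties/data_processed*.txt",
        },
    },
    "report_data": {
        "moderately_sensitive": {
            "redacted_tables": "output/coverage/table_*.csv",
            "unredacted_tables": "output/coverage/for-checks/table_*.csv",
        },
    },
}

# The actions listed under "Requested actions" for this job request.
REQUESTED = ["data_process", "data_properties", "report_data"]


def outputs_by_level(requested, pipeline):
    """Group the output patterns of the requested actions by security level."""
    grouped = {}
    for action in requested:
        for level, files in pipeline[action].items():
            grouped.setdefault(level, []).extend(files.values())
    return grouped


levels = outputs_by_level(REQUESTED, PIPELINE_OUTPUTS)
for level, patterns in levels.items():
    print(level, patterns)
```

Under these assumptions, the requested actions write one output pattern at the highly_sensitive level (from data_process) and three at the moderately_sensitive level (from data_properties and report_data).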