Primary care records offer an opportunity to ascertain cases of COVID-19 that do not necessarily result in hospital admission or death. This could be useful for studying the burden of COVID-19 in the community, risk factors for SARS-CoV-2 infection separately from the risk of severe COVID-19, risk factors for mortality and case fatality ratios among those infected, and post-viral effects in people who had COVID-19 that did not require hospitalisation.
There are over 100 primary care (CTV3) codes with terms related to COVID-19 used by TPP and available for selection in studies performed in the OpenSAFELY platform (https://opensafely.org/). The majority of these codes have been newly created for use in the current pandemic. The aim of this work was to assign these codes into categories related to the identification of COVID-19 in primary care, and to provide advice for studies using the OpenSAFELY platform that require people to be classified by their COVID-19 case status as defined in primary care records (either as an exposure or as an outcome).
An initial list of TPP primary care codes related to COVID-19 was obtained by searching the TPP database for terms containing "COV-2", "Coronavirus", or "COVID". The returned terms were cross-checked against the NHS Digital COVID-19 SNOMED CT and CTV3 codes, and any terms found to be missing were added to the list. The resulting list of terms was then reviewed by a team of clinicians, epidemiologists and statisticians, who identified distinct categories of terms and assigned each term to one of these categories.
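The search and cross-check steps described above can be sketched as follows. This is a minimal illustration only, not the code used in the study: the codes and terms are hypothetical placeholders, case-insensitive substring matching is an assumption, and the cross-check is assumed to compare on code rather than on term text.

```python
import re

# Search substrings quoted in the text. Case-insensitive matching is an
# assumption; the text does not say how the search handled case.
SEARCH_TERMS = ("COV-2", "Coronavirus", "COVID")
_PATTERN = re.compile("|".join(re.escape(t) for t in SEARCH_TERMS), re.IGNORECASE)


def find_covid_terms(code_terms):
    """Return the {code: term} entries whose term contains any search substring."""
    return {code: term for code, term in code_terms.items() if _PATTERN.search(term)}


def missing_from(candidates, reference):
    """Return the reference entries whose code is absent from the candidates."""
    return {code: term for code, term in reference.items() if code not in candidates}


# Hypothetical codes and terms, for illustration only.
tpp_terms = {
    "X0001": "Suspected COVID-19",
    "X0002": "SARS-CoV-2 detected",
    "X0003": "Asthma annual review",
}
reference_terms = {
    "X0001": "Suspected COVID-19",
    "X0004": "Exposure to 2019-nCoV",
}

candidates = find_covid_terms(tpp_terms)               # text search
to_add = missing_from(candidates, reference_terms)     # cross-check
combined = {**candidates, **to_add}                    # list passed to reviewers
```

Note that the hypothetical reference term "Exposure to 2019-nCoV" contains none of the three search strings, which is the kind of gap a cross-check against a reference list is meant to catch.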
An initial analysis of the probable case and suspected case sub-categories was then performed by plotting the following, using OpenSAFELY data from February 2020 to November 2021: (1) the frequency of codes entered into TPP software by GPs over time, and (2) the proportion of people dying due to (a) COVID-19 and (b) causes other than COVID-19 (using ONS cause of death data) in the 80 days after a record of a positive test in either primary care TPP data or SGSS data.
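As a rough illustration of the second quantity, the proportion dying within the follow-up window can be computed along these lines. This is a sketch under stated assumptions, not the study's analysis code: the record layout and field names are hypothetical, the window is treated as inclusive of day 80, and censoring of people with incomplete follow-up is ignored.

```python
from datetime import date

FOLLOW_UP_DAYS = 80  # follow-up window stated in the text


def death_proportions(records, follow_up_days=FOLLOW_UP_DAYS):
    """Proportion of people dying of COVID-19 and of other causes after a test.

    Each record is a dict with hypothetical keys:
      'test_date'   - date of the positive test record
      'death_date'  - date of death, or None if the person is alive
      'covid_death' - True if the cause of death is COVID-19
    """
    n = len(records)
    if n == 0:
        return {"covid": 0.0, "non_covid": 0.0}
    covid = non_covid = 0
    for r in records:
        if r["death_date"] is None:
            continue
        days = (r["death_date"] - r["test_date"]).days
        # Count only deaths on or after the test date and within the window.
        if 0 <= days <= follow_up_days:
            if r["covid_death"]:
                covid += 1
            else:
                non_covid += 1
    return {"covid": covid / n, "non_covid": non_covid / n}


# Hypothetical records, for illustration only.
example = [
    {"test_date": date(2020, 4, 1), "death_date": date(2020, 4, 11), "covid_death": True},
    {"test_date": date(2020, 5, 1), "death_date": date(2020, 6, 20), "covid_death": False},
    {"test_date": date(2020, 3, 1), "death_date": date(2020, 6, 9), "covid_death": True},
    {"test_date": date(2020, 4, 15), "death_date": None, "covid_death": False},
]
result = death_proportions(example)
```

In this example the third person dies 100 days after the test, so that death falls outside the window and is counted in neither numerator, while the person remains in the denominator.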
A total of 187 terms were identified. These were assigned to the 14 categories/subcategories detailed in the table below. The 14 codelists for classifying COVID-19 are publicly available for inspection and re-use on the OpenSAFELY codelists site (codelists.opensafely.org).
A relatively low level of COVID-19-related mortality in people identified as "probable cases" is consistent with these codes failing to identify the most severe COVID-19 cases with high specificity. "Suspected case" codes were initially more widely used but do not seem to identify COVID-19 cases and should be used with care. Further work will include investigating code sensitivity and understanding how individual patient characteristics relate to the varying probability of being tested.
OpenSAFELY is a data analytics platform built by a mixed team of software developers, clinicians, and epidemiologists from the Oxford DataLab, London School of Hygiene and Tropical Medicine Electronic Health Record research group, health software company TPP and NHS England. It represents a fundamentally different way of conducting electronic health record (EHR) research: instead of sending EHR data to a third party for analysis, we've developed a system for conducting analyses within the secure environment where the data is already stored, so that the electronic health record data never leaves the NHS ecosystem.
Currently, OpenSAFELY uses the electronic health records of all patients registered at a GP practice using the SystmOne clinical information system run by TPP, covering around 22 million people. Additional data for these patients covering COVID-related tests, hospital admissions, ITU admissions, and registered deaths are also securely imported to the platform.
For more information, visit https://opensafely.org