Identifying Ethnicity in OpenSAFELY-TPP

This short report describes how ethnicity can be identified in the OpenSAFELY-TPP database, and the strengths and weaknesses of each method. This is a living document that will be updated to reflect changes to the OpenSAFELY-TPP database and the patient records it holds.

OpenSAFELY

OpenSAFELY is an analytics platform for conducting analyses on Electronic Health Records inside the secure environment where the records are held. This has multiple benefits:

  • Large volumes of potentially disclosive pseudonymised patient data do not need to be transported outside the secure environment for analysis
  • Analyses can run in near real time, because records are ready for analysis as soon as they appear in the secure environment
  • All infrastructure and analysis code is stored in GitHub repositories, which are open for security review, scientific review, and re-use

A key feature of OpenSAFELY is the use of study definitions, which are formal specifications of the datasets to be generated from the OpenSAFELY database. A study definition takes care of much of the complex EHR data wrangling required to create a dataset in an analysis-ready format. Using study definitions also builds a library of standardised and validated variable definitions that can be deployed consistently across multiple projects.

The purpose of this report is to describe all such variables that relate to ethnicity, their relative strengths and weaknesses, and the scenarios in which they are best deployed. It will also describe potential future definitions that have not yet been implemented.

Available Records

OpenSAFELY-TPP runs inside TPP’s data centre, which contains the primary care records for all patients registered at practices using TPP’s SystmOne Clinical Information System. This data centre also imports external datasets from other sources, including A&E attendances and hospital admissions from NHS Digital’s Secondary Use Service, and death registrations from the ONS. More information on available data sources can be found within the OpenSAFELY documentation.

Results

Count of Patients

group                subgroup                  ethnicity_5         ethnicity_new_5     ethnicity_primis_5  all_filled          population
all                  with records              24474735 (74.6)     24129590 (73.5)     19159630 (58.4)     18943015 (57.7)     32807505
age_band             0-19                      3948960 (62.5)      3892720 (61.6)      2988370 (47.3)      2956815 (46.8)      6314700
age_band             20-29                     3014255 (72.1)      2971670 (71.1)      2406685 (57.6)      2380180 (56.9)      4181455
age_band             30-39                     3967180 (81.4)      3903095 (80.0)      3193225 (65.5)      3152280 (64.7)      4875825
age_band             40-49                     3242920 (81.4)      3188675 (80.0)      2585550 (64.9)      2551215 (64.0)      3985390
age_band             50-59                     3169745 (79.5)      3126855 (78.4)      2506200 (62.8)      2478290 (62.1)      3988210
age_band             60-69                     2578365 (79.5)      2547130 (78.6)      2025760 (62.5)      2005025 (61.8)      3242295
age_band             70-79                     2230210 (78.9)      2204475 (78.0)      1734190 (61.4)      1716875 (60.8)      2825395
age_band             80+                       2310790 (68.9)      2282770 (68.1)      1713295 (51.1)      1695985 (50.6)      3352430
age_band             missing                   12310 (29.4)        12200 (29.2)        6360 (15.2)         6350 (15.2)         41810
sex                  F                         12567380 (76.2)     12390885 (75.2)     9900805 (60.1)      9788470 (59.4)      16485060
sex                  M                         11907360 (73.0)     11738705 (71.9)     9258825 (56.7)      9154545 (56.1)      16322445
region               East                      4354510 (75.6)      4294100 (74.6)      3380420 (58.7)      3344085 (58.1)      5758535
region               East Midlands             3262875 (76.6)      3208865 (75.3)      2507680 (58.9)      2472405 (58.0)      4260350
region               London                    1436215 (80.0)      1409285 (78.5)      1262330 (70.4)      1243690 (69.3)      1794330
region               North East                888240 (76.1)       881040 (75.5)       739970 (63.4)       734745 (63.0)       1166440
region               North West                1621780 (76.0)      1603115 (75.1)      1193320 (55.9)      1183135 (55.4)      2134525
region               South East                1216550 (74.0)      1199055 (72.9)      939735 (57.2)       928370 (56.5)       1643790
region               South West                2481280 (73.0)      2454175 (72.2)      1886215 (55.5)      1868945 (55.0)      3399525
region               West Midlands             811275 (80.2)       794275 (78.6)       616410 (61.0)       605155 (59.9)       1010960
region               Yorkshire and The Humber  2688860 (75.7)      2658015 (74.8)      2170395 (61.1)      2148670 (60.5)      3552510
imd                  1 Most deprived           4918965 (76.3)      4836790 (75.0)      3899640 (60.5)      3850080 (59.7)      6450350
imd                  2                         4889880 (75.6)      4817165 (74.5)      3872780 (59.9)      3826505 (59.2)      6467650
imd                  3                         5122765 (75.2)      5054475 (74.2)      4007860 (58.8)      3963295 (58.2)      6811855
imd                  4                         4647340 (74.1)      4585355 (73.1)      3618525 (57.7)      3578945 (57.1)      6270250
imd                  5 Least deprived          4127870 (73.0)      4077950 (72.1)      3165520 (56.0)      3134720 (55.4)      5656380
imd                  Unknown                   767920 (66.7)       757855 (65.8)       595310 (51.7)       589475 (51.2)       1151025
dementia             False                     24333695 (74.6)     23990845 (73.6)     19052375 (58.4)     18837280 (57.8)     32614480
dementia             True                      141040 (73.1)       138745 (71.9)       107255 (55.6)       105740 (54.8)       193025
diabetes             False                     21870490 (73.9)     21562400 (72.9)     17102235 (57.8)     16910840 (57.2)     29576355
diabetes             True                      2604250 (80.6)      2567190 (79.5)      2057395 (63.7)      2032175 (62.9)      3231150
hypertension         False                     22878575 (74.1)     22551810 (73.0)     17912060 (58.0)     17707690 (57.3)     30881665
hypertension         True                      1596160 (82.9)      1577785 (81.9)      1247570 (64.8)      1235325 (64.1)      1925840
learning_disability  False                     24332055 (74.6)     23989775 (73.5)     19043695 (58.4)     18829235 (57.7)     32632255
learning_disability  True                      142680 (81.4)       139815 (79.8)       115935 (66.2)       113780 (64.9)       175250
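
The figure in parentheses next to each count appears to be that count as a percentage of the subgroup population in the final column. A minimal check in Python, using the "all" row of the table above (the helper name `pct` is illustrative and not taken from the OpenSAFELY codebase):

```python
def pct(count: int, population: int) -> float:
    """Return count as a percentage of population, rounded to one decimal place."""
    return round(100 * count / population, 1)


# Values copied from the "all / with records" row.
population = 32_807_505
with_records = {
    "ethnicity_5": 24_474_735,
    "ethnicity_new_5": 24_129_590,
    "ethnicity_primis_5": 19_159_630,
    "all_filled": 18_943_015,
}

percentages = {name: pct(n, population) for name, n in with_records.items()}
for name, value in percentages.items():
    print(f"{name}: {value}%")
```

The computed values agree with the percentages reported in the table for that row.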

Count of Missings

group                subgroup                  ethnicity_5         ethnicity_new_5     ethnicity_primis_5  all_missing         population
all                  missing records           8332770 (25.4)      8677915 (26.5)      13647875 (41.6)     8332770 (25.4)      32807505
age_band             0-19                      2365740 (37.5)      2421980 (38.4)      3326330 (52.7)      2365740 (37.5)      6314700
age_band             20-29                     1167200 (27.9)      1209785 (28.9)      1774775 (42.4)      1167200 (27.9)      4181455
age_band             30-39                     908640 (18.6)       972730 (20.0)       1682600 (34.5)      908640 (18.6)       4875825
age_band             40-49                     742465 (18.6)       796715 (20.0)       1399835 (35.1)      742465 (18.6)       3985390
age_band             50-59                     818470 (20.5)       861355 (21.6)       1482010 (37.2)      818470 (20.5)       3988210
age_band             60-69                     663930 (20.5)       695165 (21.4)       1216535 (37.5)      663930 (20.5)       3242295
age_band             70-79                     595185 (21.1)       620920 (22.0)       1091205 (38.6)      595185 (21.1)       2825395
age_band             80+                       1041640 (31.1)      1069660 (31.9)      1639135 (48.9)      1041640 (31.1)      3352430
age_band             missing                   29495 (70.5)        29610 (70.8)        35450 (84.8)        29495 (70.5)        41810
sex                  F                         3917680 (23.8)      4094175 (24.8)      6584255 (39.9)      3917680 (23.8)      16485060
sex                  M                         4415090 (27.0)      4583740 (28.1)      7063620 (43.3)      4415090 (27.0)      16322445
region               East                      1404020 (24.4)      1464430 (25.4)      2378115 (41.3)      1404020 (24.4)      5758535
region               East Midlands             997470 (23.4)       1051485 (24.7)      1752665 (41.1)      997470 (23.4)       4260350
region               London                    358115 (20.0)       385045 (21.5)       532005 (29.6)       358115 (20.0)       1794330
region               North East                278200 (23.9)       285400 (24.5)       426475 (36.6)       278200 (23.9)       1166440
region               North West                512745 (24.0)       531410 (24.9)       941205 (44.1)       512745 (24.0)       2134525
region               South East                427245 (26.0)       444735 (27.1)       704060 (42.8)       427245 (26.0)       1643790
region               South West                918245 (27.0)       945350 (27.8)       1513310 (44.5)      918245 (27.0)       3399525
region               West Midlands             199685 (19.8)       216685 (21.4)       394550 (39.0)       199685 (19.8)       1010960
region               Yorkshire and The Humber  863655 (24.3)       894500 (25.2)       1382115 (38.9)      863655 (24.3)       3552510
imd                  1 Most deprived           1531390 (23.7)      1613560 (25.0)      2550710 (39.5)      1531390 (23.7)      6450350
imd                  2                         1577770 (24.4)      1650485 (25.5)      2594870 (40.1)      1577770 (24.4)      6467650
imd                  3                         1689090 (24.8)      1757380 (25.8)      2804000 (41.2)      1689090 (24.8)      6811855
imd                  4                         1622910 (25.9)      1684890 (26.9)      2651725 (42.3)      1622910 (25.9)      6270250
imd                  5 Least deprived          1528510 (27.0)      1578430 (27.9)      2490860 (44.0)      1528510 (27.0)      5656380
imd                  Unknown                   383105 (33.3)       393170 (34.2)       555715 (48.3)       383105 (33.3)       1151025
dementia             False                     8280780 (25.4)      8623630 (26.4)      13562105 (41.6)     8280780 (25.4)      32614480
dementia             True                      51985 (26.9)        54280 (28.1)        85770 (44.4)        51985 (26.9)        193025
diabetes             False                     7705865 (26.1)      8013955 (27.1)      12474120 (42.2)     7705865 (26.1)      29576355
diabetes             True                      626905 (19.4)       663960 (20.5)       1173755 (36.3)      626905 (19.4)       3231150
hypertension         False                     8003090 (25.9)      8329855 (27.0)      12969605 (42.0)     8003090 (25.9)      30881665
hypertension         True                      329680 (17.1)       348060 (18.1)       678270 (35.2)       329680 (17.1)       1925840
learning_disability  False                     8300200 (25.4)      8642480 (26.5)      13588560 (41.6)     8300200 (25.4)      32632255
learning_disability  True                      32570 (18.6)        35435 (20.2)        59315 (33.8)        32570 (18.6)        175250
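
The missing counts for each definition can be reproduced as the population minus the number of patients with a record. The sketch below checks this for the "all" row. The final step rests on an assumption about the column meanings that the tables do not state explicitly: that `all_filled` counts patients with a value under all three definitions and `all_missing` counts patients with a value under none of them.

```python
population = 32_807_505

# Patients with a recorded ethnicity under each definition
# ("all / with records" row of the Count of Patients table).
with_records = {
    "ethnicity_5": 24_474_735,
    "ethnicity_new_5": 24_129_590,
    "ethnicity_primis_5": 19_159_630,
}

# Missing under a definition = population minus patients with a record.
missing = {name: population - n for name, n in with_records.items()}
for name, n in missing.items():
    print(f"{name}: {n} missing")

# ASSUMPTION: all_filled = value under all three definitions,
# all_missing = no value under any definition.
all_filled = 18_943_015
all_missing = 8_332_770

# Under that assumption, the remainder has a value under some
# definitions but not all of them.
some_but_not_all = population - all_filled - all_missing
print(f"some but not all: {some_but_not_all}")
```

The three derived missing counts match the "all / missing records" row above.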

Count by Category

group subgroup ethnicity_5_Asian ethnicity_new_5_Asian ethnicity_primis_5_Asian ethnicity_5_Black ethnicity_new_5_Black ethnicity_primis_5_Black ethnicity_5_Mixed ethnicity_new_5_Mixed ethnicity_primis_5_Mixed ethnicity_5_Other ethnicity_new_5_Other ethnicity_primis_5_Other ethnicity_5_White ethnicity_new_5_White ethnicity_primis_5_White all_filled population
all with records 2159535.0 (6.6) 2173495.0 (6.6) 1874355.0 (5.7) 777690.0 (2.4) 768500.0 (2.3) 606995.0 (1.9) 471280.0 (1.4) 465265.0 (1.4) 428130.0 (1.3) 833460.0 (2.5) 696645.0 (2.1) 676620.0 (2.1) 20232770.0 (61.7) 20025680.0 (61.0) 15573530.0 (47.5) 18943015.0 (57.7) 32807505.0
age_band 0-19 447025.0 (7.1) 448865.0 (7.1) 382980.0 (6.1) 169065.0 (2.7) 167230.0 (2.6) 126905.0 (2.0) 158955.0 (2.5) 157095.0 (2.5) 137025.0 (2.2) 142965.0 (2.3) 117070.0 (1.9) 112955.0 (1.8) 3030950.0 (48.0) 3002460.0 (47.5) 2228500.0 (35.3) 2956815.0 (46.8) 6314700.0
20-29 329110.0 (7.9) 331825.0 (7.9) 281460.0 (6.7) 126605.0 (3.0) 125160.0 (3.0) 99375.0 (2.4) 88010.0 (2.1) 87025.0 (2.1) 79860.0 (1.9) 196770.0 (4.7) 174385.0 (4.2) 167510.0 (4.0) 2273760.0 (54.4) 2253275.0 (53.9) 1778480.0 (42.5) 2380180.0 (56.9) 4181455.0
30-39 500955.0 (10.3) 503705.0 (10.3) 433695.0 (8.9) 148340.0 (3.0) 146640.0 (3.0) 117045.0 (2.4) 90635.0 (1.9) 89435.0 (1.8) 84210.0 (1.7) 209185.0 (4.3) 175555.0 (3.6) 172775.0 (3.5) 3018065.0 (61.9) 2987760.0 (61.3) 2385500.0 (48.9) 3152280.0 (64.7) 4875825.0
40-49 397165.0 (10.0) 400400.0 (10.0) 349225.0 (8.8) 137470.0 (3.4) 135690.0 (3.4) 108725.0 (2.7) 60515.0 (1.5) 59565.0 (1.5) 57340.0 (1.4) 136425.0 (3.4) 109845.0 (2.8) 108875.0 (2.7) 2511350.0 (63.0) 2483175.0 (62.3) 1961385.0 (49.2) 2551215.0 (64.0) 3985390.0
50-59 208560.0 (5.2) 211125.0 (5.3) 185540.0 (4.7) 103360.0 (2.6) 102100.0 (2.6) 82320.0 (2.1) 38600.0 (1.0) 38060.0 (1.0) 36660.0 (0.9) 74220.0 (1.9) 59675.0 (1.5) 57865.0 (1.5) 2745005.0 (68.8) 2715895.0 (68.1) 2143820.0 (53.8) 2478290.0 (62.1) 3988210.0
60-69 136925.0 (4.2) 137560.0 (4.2) 120410.0 (3.7) 49230.0 (1.5) 48615.0 (1.5) 38975.0 (1.2) 18765.0 (0.6) 18500.0 (0.6) 18015.0 (0.6) 40000.0 (1.2) 32705.0 (1.0) 31110.0 (1.0) 2333450.0 (72.0) 2309750.0 (71.2) 1817250.0 (56.0) 2005025.0 (61.8) 3242295.0
70-79 76205.0 (2.7) 76575.0 (2.7) 66775.0 (2.4) 20190.0 (0.7) 19950.0 (0.7) 15815.0 (0.6) 8335.0 (0.3) 8210.0 (0.3) 7930.0 (0.3) 20080.0 (0.7) 16200.0 (0.6) 15230.0 (0.5) 2105400.0 (74.5) 2083540.0 (73.7) 1628440.0 (57.6) 1716875.0 (60.8) 2825395.0
80+ 62430.0 (1.9) 62270.0 (1.9) 53510.0 (1.6) 23010.0 (0.7) 22695.0 (0.7) 17620.0 (0.5) 6835.0 (0.2) 6740.0 (0.2) 6630.0 (0.2) 13455.0 (0.4) 10950.0 (0.3) 10075.0 (0.3) 2205060.0 (65.8) 2180115.0 (65.0) 1625460.0 (48.5) 1695985.0 (50.6) 3352430.0
missing 1160.0 (2.8) 1170.0 (2.8) 755.0 (1.8) 425.0 (1.0) 425.0 (1.0) 225.0 (0.5) 635.0 (1.5) 635.0 (1.5) 455.0 (1.1) 365.0 (0.9) 260.0 (0.6) 225.0 (0.5) 9730.0 (23.3) 9715.0 (23.2) 4705.0 (11.3) 6350.0 (15.2) 41810.0
sex F 1046535.0 (6.3) 1057310.0 (6.4) 914340.0 (5.5) 393620.0 (2.4) 388980.0 (2.4) 309110.0 (1.9) 242875.0 (1.5) 239815.0 (1.5) 221840.0 (1.3) 432430.0 (2.6) 360780.0 (2.2) 349320.0 (2.1) 10451915.0 (63.4) 10344000.0 (62.7) 8106195.0 (49.2) 9788470.0 (59.4) 16485060.0
M 1113000.0 (6.8) 1116190.0 (6.8) 960015.0 (5.9) 384065.0 (2.4) 379520.0 (2.3) 297885.0 (1.8) 228405.0 (1.4) 225450.0 (1.4) 206290.0 (1.3) 401030.0 (2.5) 335865.0 (2.1) 327300.0 (2.0) 9780855.0 (59.9) 9681685.0 (59.3) 7467335.0 (45.7) 9154545.0 (56.1) 16322445.0
region East 287525.0 (5.0) 289030.0 (5.0) 245825.0 (4.3) 143215.0 (2.5) 141535.0 (2.5) 107365.0 (1.9) 87280.0 (1.5) 86240.0 (1.5) 79910.0 (1.4) 108620.0 (1.9) 86400.0 (1.5) 85090.0 (1.5) 3727865.0 (64.7) 3690895.0 (64.1) 2862225.0 (49.7) 3344085.0 (58.1) 5758535.0
East Midlands 308710.0 (7.2) 308740.0 (7.2) 258085.0 (6.1) 94665.0 (2.2) 93730.0 (2.2) 72590.0 (1.7) 58275.0 (1.4) 57530.0 (1.4) 51365.0 (1.2) 78155.0 (1.8) 63975.0 (1.5) 62285.0 (1.5) 2723075.0 (63.9) 2684890.0 (63.0) 2063355.0 (48.4) 2472405.0 (58.0) 4260350.0
London 351885.0 (19.6) 357290.0 (19.9) 318055.0 (17.7) 120545.0 (6.7) 118860.0 (6.6) 101125.0 (5.6) 58730.0 (3.3) 57885.0 (3.2) 54770.0 (3.1) 148350.0 (8.3) 126580.0 (7.1) 125050.0 (7.0) 756705.0 (42.2) 748675.0 (41.7) 663330.0 (37.0) 1243690.0 (69.3) 1794330.0
North East 46910.0 (4.0) 46970.0 (4.0) 41940.0 (3.6) 19435.0 (1.7) 19310.0 (1.7) 16580.0 (1.4) 12900.0 (1.1) 12790.0 (1.1) 11845.0 (1.0) 22510.0 (1.9) 19690.0 (1.7) 20135.0 (1.7) 786490.0 (67.4) 782280.0 (67.1) 649465.0 (55.7) 734745.0 (63.0) 1166440.0
North West 65015.0 (3.0) 65340.0 (3.1) 50855.0 (2.4) 16770.0 (0.8) 16600.0 (0.8) 12285.0 (0.6) 13695.0 (0.6) 13520.0 (0.6) 12350.0 (0.6) 26315.0 (1.2) 22060.0 (1.0) 20170.0 (0.9) 1499990.0 (70.3) 1485595.0 (69.6) 1097660.0 (51.4) 1183135.0 (55.4) 2134525.0
South East 55655.0 (3.4) 56925.0 (3.5) 47440.0 (2.9) 22880.0 (1.4) 22530.0 (1.4) 17540.0 (1.1) 22840.0 (1.4) 22555.0 (1.4) 20090.0 (1.2) 35280.0 (2.1) 28645.0 (1.7) 27710.0 (1.7) 1079900.0 (65.7) 1068405.0 (65.0) 826950.0 (50.3) 928370.0 (56.5) 1643790.0
South West 64835.0 (1.9) 66095.0 (1.9) 55195.0 (1.6) 23530.0 (0.7) 23230.0 (0.7) 18060.0 (0.5) 30835.0 (0.9) 30360.0 (0.9) 27770.0 (0.8) 41675.0 (1.2) 33525.0 (1.0) 32340.0 (1.0) 2320400.0 (68.3) 2300960.0 (67.7) 1752845.0 (51.6) 1868945.0 (55.0) 3399525.0
West Midlands 166450.0 (16.5) 164490.0 (16.3) 139655.0 (13.8) 63605.0 (6.3) 62460.0 (6.2) 48660.0 (4.8) 28005.0 (2.8) 27580.0 (2.7) 24435.0 (2.4) 30425.0 (3.0) 25305.0 (2.5) 23250.0 (2.3) 522795.0 (51.7) 514440.0 (50.9) 380410.0 (37.6) 605155.0 (59.9) 1010960.0
Yorkshire and The Humber 286225.0 (8.1) 286255.0 (8.1) 265755.0 (7.5) 60275.0 (1.7) 59780.0 (1.7) 48720.0 (1.4) 38245.0 (1.1) 37785.0 (1.1) 35080.0 (1.0) 57010.0 (1.6) 46420.0 (1.3) 47345.0 (1.3) 2247105.0 (63.3) 2227770.0 (62.7) 1773495.0 (49.9) 2148670.0 (60.5) 3552510.0
imd 1 Most deprived 665075.0 (10.3) 665510.0 (10.3) 581305.0 (9.0) 297870.0 (4.6) 294355.0 (4.6) 234660.0 (3.6) 124995.0 (1.9) 123475.0 (1.9) 113740.0 (1.8) 200250.0 (3.1) 160985.0 (2.5) 162075.0 (2.5) 3630770.0 (56.3) 3592465.0 (55.7) 2807860.0 (43.5) 3850080.0 (59.7) 6450350.0
2 584025.0 (9.0) 587770.0 (9.1) 509310.0 (7.9) 191645.0 (3.0) 189375.0 (2.9) 150355.0 (2.3) 103600.0 (1.6) 102265.0 (1.6) 95145.0 (1.5) 195535.0 (3.0) 163360.0 (2.5) 160605.0 (2.5) 3815080.0 (59.0) 3774395.0 (58.4) 2957365.0 (45.7) 3826505.0 (59.2) 6467650.0
3 409500.0 (6.0) 412645.0 (6.1) 356640.0 (5.2) 124505.0 (1.8) 123120.0 (1.8) 97335.0 (1.4) 91205.0 (1.3) 89970.0 (1.3) 82600.0 (1.2) 164065.0 (2.4) 139580.0 (2.0) 134710.0 (2.0) 4333490.0 (63.6) 4289165.0 (63.0) 3336570.0 (49.0) 3963295.0 (58.2) 6811855.0
4 241580.0 (3.9) 244780.0 (3.9) 208680.0 (3.3) 76915.0 (1.2) 75860.0 (1.2) 59400.0 (0.9) 73445.0 (1.2) 72475.0 (1.2) 66920.0 (1.1) 133485.0 (2.1) 113785.0 (1.8) 108580.0 (1.7) 4121910.0 (65.7) 4078455.0 (65.0) 3174940.0 (50.6) 3578945.0 (57.1) 6270250.0
5 Least deprived 172575.0 (3.1) 175050.0 (3.1) 146610.0 (2.6) 48380.0 (0.9) 47715.0 (0.8) 36545.0 (0.6) 58690.0 (1.0) 57925.0 (1.0) 52420.0 (0.9) 99130.0 (1.8) 83660.0 (1.5) 77270.0 (1.4) 3749095.0 (66.3) 3713605.0 (65.7) 2852675.0 (50.4) 3134720.0 (55.4) 5656380.0
Unknown 86780.0 (7.5) 87745.0 (7.6) 71805.0 (6.2) 38375.0 (3.3) 38080.0 (3.3) 28700.0 (2.5) 19345.0 (1.7) 19160.0 (1.7) 17310.0 (1.5) 40990.0 (3.6) 35275.0 (3.1) 33375.0 (2.9) 582430.0 (50.6) 577600.0 (50.2) 444120.0 (38.6) 589475.0 (51.2) 1151025.0
dementia False 2156245.0 (6.6) 2170225.0 (6.7) 1871440.0 (5.7) 775920.0 (2.4) 766765.0 (2.4) 605595.0 (1.9) 470850.0 (1.4) 464845.0 (1.4) 427700.0 (1.3) 832745.0 (2.6) 696100.0 (2.1) 676075.0 (2.1) 20097930.0 (61.6) 19892910.0 (61.0) 15471560.0 (47.4) 18837280.0 (57.8) 32614480.0
True 3290.0 (1.7) 3275.0 (1.7) 2910.0 (1.5) 1765.0 (0.9) 1735.0 (0.9) 1400.0 (0.7) 430.0 (0.2) 420.0 (0.2) 430.0 (0.2) 715.0 (0.4) 545.0 (0.3) 545.0 (0.3) 134840.0 (69.9) 132770.0 (68.8) 101970.0 (52.8) 105740.0 (54.8) 193025.0
diabetes False 1835590.0 (6.2) 1849585.0 (6.3) 1585955.0 (5.4) 687955.0 (2.3) 679890.0 (2.3) 533955.0 (1.8) 442365.0 (1.5) 436760.0 (1.5) 400165.0 (1.4) 783010.0 (2.6) 657100.0 (2.2) 637320.0 (2.2) 18121560.0 (61.3) 17939060.0 (60.7) 13944840.0 (47.1) 16910840.0 (57.2) 29576355.0
True 323945.0 (10.0) 323910.0 (10.0) 288400.0 (8.9) 89730.0 (2.8) 88610.0 (2.7) 73040.0 (2.3) 28910.0 (0.9) 28505.0 (0.9) 27965.0 (0.9) 50450.0 (1.6) 39545.0 (1.2) 39300.0 (1.2) 2111210.0 (65.3) 2086620.0 (64.6) 1628690.0 (50.4) 2032175.0 (62.9) 3231150.0
hypertension False 2083035.0 (6.7) 2096580.0 (6.8) 1806700.0 (5.9) 741390.0 (2.4) 732680.0 (2.4) 578295.0 (1.9) 460195.0 (1.5) 454325.0 (1.5) 417265.0 (1.4) 816575.0 (2.6) 683560.0 (2.2) 663825.0 (2.1) 18777385.0 (60.8) 18584665.0 (60.2) 14445975.0 (46.8) 17707690.0 (57.3) 30881665.0
True 76500.0 (4.0) 76920.0 (4.0) 67655.0 (3.5) 36300.0 (1.9) 35820.0 (1.9) 28700.0 (1.5) 11085.0 (0.6) 10940.0 (0.6) 10865.0 (0.6) 16885.0 (0.9) 13085.0 (0.7) 12795.0 (0.7) 1455385.0 (75.6) 1441015.0 (74.8) 1127555.0 (58.5) 1235325.0 (64.1) 1925840.0
learning_disability False 2150300.0 (6.6) 2164355.0 (6.6) 1866080.0 (5.7) 774465.0 (2.4) 765325.0 (2.3) 604445.0 (1.9) 469040.0 (1.4) 463070.0 (1.4) 426095.0 (1.3) 831965.0 (2.5) 695505.0 (2.1) 675475.0 (2.1) 20106285.0 (61.6) 19901520.0 (61.0) 15471595.0 (47.4) 18829235.0 (57.7) 32632255.0
True 9240.0 (5.3) 9140.0 (5.2) 8270.0 (4.7) 3225.0 (1.8) 3175.0 (1.8) 2550.0 (1.5) 2240.0 (1.3) 2195.0 (1.3) 2035.0 (1.2) 1495.0 (0.9) 1140.0 (0.7) 1150.0 (0.7) 126485.0 (72.2) 124165.0 (70.9) 101935.0 (58.2) 113780.0 (64.9) 175250.0

Overlapping Definitions

Idea: Use an upset plot
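
An upset plot is driven by the number of patients in each exclusive combination of definitions. As a rough sketch of how those counts might be prepared before plotting, the example below tallies combinations from patient-level flags. The patient rows are invented for illustration and the function name `intersection_counts` is hypothetical. Only the three definition names come from this report.

```python
from collections import Counter

DEFINITIONS = ("ethnicity_5", "ethnicity_new_5", "ethnicity_primis_5")

# Hypothetical patient-level flags: True where the definition returns a value.
patients = [
    {"ethnicity_5": True, "ethnicity_new_5": True, "ethnicity_primis_5": True},
    {"ethnicity_5": True, "ethnicity_new_5": True, "ethnicity_primis_5": False},
    {"ethnicity_5": True, "ethnicity_new_5": False, "ethnicity_primis_5": False},
    {"ethnicity_5": False, "ethnicity_new_5": False, "ethnicity_primis_5": False},
    {"ethnicity_5": True, "ethnicity_new_5": True, "ethnicity_primis_5": True},
]


def intersection_counts(rows):
    """Count patients in each exclusive combination of filled definitions."""
    counts = Counter()
    for row in rows:
        combo = tuple(name for name in DEFINITIONS if row[name])
        counts[combo] += 1
    return counts


counts = intersection_counts(patients)
for combo, n in counts.most_common():
    print(" & ".join(combo) or "(none)", n)
```

Each key of `counts` corresponds to one bar of the upset plot, and the empty tuple holds patients with no ethnicity under any definition.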

Latest vs. Most Common
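
The comparison in this section is presumably between the most recently recorded ethnicity for a patient and the category recorded most often. The sketch below shows one way the two could be derived from a patient's records. The records are invented, the function names are hypothetical, and the tie-break rule (prefer the most recent record among tied categories) is an assumption rather than the documented OpenSAFELY behaviour.

```python
from collections import Counter
from datetime import date

# Hypothetical records for one patient: (date recorded, 5-group category).
records = [
    (date(2010, 3, 1), "White"),
    (date(2015, 6, 12), "White"),
    (date(2021, 1, 20), "Mixed"),
]


def latest(recs):
    """Return the category of the most recently dated record."""
    return max(recs, key=lambda rec: rec[0])[1]


def most_common(recs):
    """Return the most frequently recorded category.

    ASSUMPTION: ties are broken in favour of the most recent record
    among the tied categories.
    """
    counts = Counter(category for _, category in recs)
    top = max(counts.values())
    tied = {category for category, n in counts.items() if n == top}
    return latest([rec for rec in recs if rec[1] in tied])


print(latest(records), most_common(records))
```

For this invented patient the two approaches disagree, so the patient would fall into the not_matching group.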

ethnicity_5         matching (n=24318430)  not_matching (n=579095)
Asian               2140725                87170
Black               762970                 64160
Mixed               431905                 124745
Other               787145                 147615
White               20195685               155405

ethnicity_5         asian (n=2233285)   black (n=833390)    mixed (n=531665)    other (n=919355)    white (n=20379830)
Asian               2140725             5520                15045               41185               25420
Black               5315                762970              25375               10080               23390
Mixed               17745               32955               431905              17750               56295
Other               41215               10750               16610               787145              79040
White               28285               21195               42730               63195               20195685

ethnicity_new_5     matching (n=23999195)  not_matching (n=515465)
Asian               2158800                77735
Black               755565                 61475
Mixed               427145                 122395
Other               662620                 112785
White               19995065               141075

ethnicity_new_5     asian (n=2239940)   black (n=824010)    mixed (n=521965)    other (n=765525)    white (n=20163220)
Asian               2158800             5540                15300               31060               25835
Black               5335                755565              24710               8180                23250
Mixed               18355               33225               427145              14225               56590
Other               28915               8605                12785               662620              62480
White               28535               21075               42025               49440               19995065

ethnicity_primis_5  matching (n=19071925)  not_matching (n=387625)
Asian               1864700                55570
Black               598055                 45415
Mixed               403905                 89765
Other               652710                 87180
White               15552555               109695

ethnicity_primis_5  asian (n=1927040)   black (n=645650)    mixed (n=477370)    other (n=733040)    white (n=15676450)
Asian               1864700             3305                11110               21075               20080
Black               3350                598055              19120               6170                16775
Mixed               13720               22805               403905              11875               41365
Other               23515               6680                11310               652710              45675
White               21755               14805               31925               41210               15552555
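
The matching and not_matching columns can be recovered from the five-by-five cross-tabulation: the diagonal cell of each row is the matching count and the sum of the other cells in the row is the not_matching count. A short check for `ethnicity_5`, using the values reported above (variable names are illustrative):

```python
CATEGORIES = ["Asian", "Black", "Mixed", "Other", "White"]

# ethnicity_5 cross-tabulation; column order is asian, black, mixed, other, white.
crosstab = {
    "Asian": [2140725, 5520, 15045, 41185, 25420],
    "Black": [5315, 762970, 25375, 10080, 23390],
    "Mixed": [17745, 32955, 431905, 17750, 56295],
    "Other": [41215, 10750, 16610, 787145, 79040],
    "White": [28285, 21195, 42730, 63195, 20195685],
}

matching = {}
not_matching = {}
for i, category in enumerate(CATEGORIES):
    row = crosstab[category]
    matching[category] = row[i]                 # diagonal cell
    not_matching[category] = sum(row) - row[i]  # all off-diagonal cells

total_matching = sum(matching.values())
total_not_matching = sum(not_matching.values())
agreement = round(100 * total_matching / (total_matching + total_not_matching), 1)

print(not_matching)
print(total_matching, total_not_matching, agreement)
```

The totals reproduce the reported n for matching (24318430) and not_matching (579095), which corresponds to agreement in roughly 97.7% of the patients compared.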

State Change

ethnicity_5           asian     black     mixed     other     white
Asian (n = 2159535)   2159535   10320     30010     68495     48990
Black (n = 777690)    8120      777690    49520     17535     38880
Mixed (n = 471280)    21900     41500     471280    24395     74405
Other (n = 833460)    56130     14365     26040     833460    94170
White (n = 20232770)  37645     30990     74500     124580    20232770

ethnicity_new_5       asian     black     mixed     other     white
Asian (n = 2173495)   2173495   10315     30660     49910     49525
Black (n = 768500)    8075      768500    49125     14045     38465
Mixed (n = 465265)    22230     41110     465265    19275     73670
Other (n = 696645)    39105     11215     19495     696645    72500
White (n = 20025680)  37760     30600     73705     98815     20025680

ethnicity_primis_5    asian     black     mixed     other     white
Asian (n = 1874355)   1874355   5340      19640     31645     34970
Black (n = 606995)    4830      606995    32050     9655      24645
Mixed (n = 428130)    16170     27370     428130    15065     52310
Other (n = 676620)    30275     8490      15985     676620    53060
White (n = 15573530)  27505     19695     49570     71010     15573530
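
To compare rows of very different size, the off-diagonal counts can be expressed as a percentage of each row's n. The sketch below does this for `ethnicity_5`. It assumes, without confirmation from the report, that each off-diagonal cell counts patients assigned to the row category who also have at least one record in the column category. The function name is hypothetical.

```python
COLUMNS = ["asian", "black", "mixed", "other", "white"]

# ethnicity_5 state-change table; rows are in the same order as COLUMNS,
# so the diagonal cell of each row equals that row's n.
state_change = {
    "Asian": [2159535, 10320, 30010, 68495, 48990],
    "Black": [8120, 777690, 49520, 17535, 38880],
    "Mixed": [21900, 41500, 471280, 24395, 74405],
    "Other": [56130, 14365, 26040, 833460, 94170],
    "White": [37645, 30990, 74500, 124580, 20232770],
}


def off_diagonal_pct(table):
    """Express each off-diagonal cell as a percentage of the row's n."""
    result = {}
    for row_index, (category, cells) in enumerate(table.items()):
        n = cells[row_index]
        result[category] = {
            column: round(100 * value / n, 1)
            for col_index, (column, value) in enumerate(zip(COLUMNS, cells))
            if col_index != row_index
        }
    return result


shares = off_diagonal_pct(state_change)
for category, row in shares.items():
    print(category, row)
```

Under the stated assumption, this puts the large White row and the much smaller Mixed row on a comparable scale.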