A blog by SPHR pre-doctoral fellow Rima Choudhury
In recent years some areas across the UK have created data hubs which cover their entire local population, linking data from a broad range of public services and offering unparalleled insights into population health. Public health challenges are complex and are often influenced by interconnected social, environmental, and economic factors. Linked datasets can play a crucial role in understanding these complex issues by connecting a wide range of health, care and social determinants data. However, navigating and understanding these datasets is challenging because they are complex and information about them is limited, which emphasises the need for user-friendly resources. My SPHR pre-doctoral research aims to map local linked datasets across the UK to help researchers answer complex questions in public health.
Linked datasets can bring together information from primary care, community health, mental health services, acute hospital care, local government, and social care, with potential to expand further. These datasets can be used to understand complex health trends, evaluate the impact of interventions across sectors, understand patient journeys and assess the efficiency and effectiveness of the whole health and care system (1,2). The data encompasses sociodemographic information such as age, sex, ethnicity, and deprivation, along with health information such as the prevalence of long-term health conditions, smoking status, and Body Mass Index (BMI). In addition, the datasets are large and include groups who are typically underrepresented in research, such as older people with multiple health conditions, and marginalised groups such as migrants or people who are homeless (3). Furthermore, the capacity to link records across data sources enhances the accuracy and comprehensiveness of the dataset (3).
One such dataset is the Whole Systems Integrated Care (WSIC) database. WSIC is a local data hub of electronic patient records in North West London. It was originally developed for commissioning and direct patient care but is now being considered as a valuable resource for research. WSIC links data from primary care, community and mental health care, secondary and tertiary care, acute hospital care, and social care. As of June 2019, WSIC contained records for a total of approximately 2.37 million patients, with 365 participating general practices representing 95% of the entire population of North West London. The data goes through a robust process of de-identification before it can be used for research, which means patients cannot be identified from the data (2).
WSIC has been used in a range of studies. For example, it has been used to understand the risk factors associated with poor asthma outcomes in children and young people in North West London (4). Additionally, it has been used to investigate frailty in patients with heart failure and their frequency of accessing social care and community services, to identify those most susceptible to risks (5). WSIC has also been used to gain insights into the ways in which patients with chronic illnesses access health services across community, primary, and secondary care settings to manage their care (5).
WSIC is part of a new generation of large local linked databases. There are many others similar to WSIC that I am yet to fully discover. It can be quite challenging to come across detailed descriptions of these local linked datasets, especially when dealing with datasets initially set up for non-research purposes. There is often limited or no researcher-friendly documentation available regarding the data contents, collection processes, and the steps to access them.
Given the complexities of these datasets, there is a growing need for user-friendly resources that help researchers understand them. Our aim is to map these local linked datasets to produce a comprehensive catalogue of what exists. This will allow us to assess what information is available in each local area and provide insights about the strengths and weaknesses of each dataset, allowing us to prioritise key research questions. Ultimately, this will not only help demystify the datasets but also encourage researchers and other professionals to use them more effectively.
Can you help?
Researchers working in these local areas hold valuable knowledge about these datasets. If you are a researcher with insights into local linked datasets, I encourage you to get in touch. Your understanding of the local landscape and key stakeholders would significantly contribute to our efforts in mapping local linked datasets across the UK. Please get in touch with us by emailing Dr Hanna Creese at firstname.lastname@example.org.
1. Lewer D, Bourne T, George A, Abi-Aad G, Taylor C, George J. Data Resource: the Kent Integrated Dataset (KID). Internat J Population Data Sci. 2018;3:6.
2. Bottle A, Cohen C, Lucas A, Saravanakumar K, Ul-Haq Z, Smith W, Majeed A, Aylin P. How an electronic health record became a real-world research resource: comparison between London’s Whole Systems Integrated Care database and the Clinical Practice Research Datalink. BMC Med Inform Decis Mak. 2020 Apr 20;20(1):71.
3. Warren-Gash C. Linking and sharing routine health data for research in England. PHG Foundation. 2017.
4. Lim XQ, Hargreaves DS, Foley KA, et al. 150 Asthma prescribing and the risks of Emergency Department attendance among children and young people. Archives of Disease in Childhood. 2023;108:A207.
5. Myers M. Researchers outline how Europe’s largest patient data record can improve care. Imperial College London Blog. April 2019.