Publications

These publications are examples of research made possible with data from CanPath and its regional cohorts.

2025

Genetic Similarity Clustering Using the UK Biobank as a Reference Dataset

Authors: Ngoc-Quynh Le, Puya Gharahkhani, Stuart MacGregor

Ngoc-Quynh Le and colleagues tested their genetic ancestry clustering algorithm on CARTaGENE data and showed that it could consistently categorize participants across 19 worldwide ancestry categories. In CARTaGENE, the algorithm correctly clustered 519 participants as Middle Eastern, with 81% precision and 97% recall. This illustrates how ancestry in smaller cohorts, such as CARTaGENE, can be identified by using the UK Biobank as a reference.

Read Publication