What is data linkage?
Data linkage is the process of combining information from different sources about the same individual to create a more comprehensive dataset for research. In population health cohorts like CanPath, this involves linking self-reported questionnaire data with administrative health records, such as hospital visits, treatment outcomes, and cancer registries. This approach allows researchers to study long-term health outcomes and patterns of health system use while minimizing the need for additional data collection.
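As a rough illustration of the idea, the sketch below joins toy questionnaire rows to toy administrative records on a shared de-identified key. The field names (`study_id`, `self_reported_diabetes`, `admission_year`) and the function `link_records` are hypothetical and do not reflect any CanPath or custodian dataset; actual linkage is carried out under the data custodian's own procedures.

```python
from typing import Dict, List


def link_records(questionnaire: List[dict], admin: List[dict],
                 key: str = "study_id") -> List[dict]:
    """Attach a count of matching administrative records to each questionnaire row.

    Both inputs are assumed to share a de-identified key (hypothetical name: study_id).
    """
    # Group administrative records by the shared key.
    admin_by_id: Dict[str, List[dict]] = {}
    for record in admin:
        admin_by_id.setdefault(record[key], []).append(record)

    linked = []
    for row in questionnaire:
        matches = admin_by_id.get(row[key], [])
        # Keep every questionnaire row, even when no administrative record matches.
        linked.append({
            **row,
            "hospital_admissions": len(matches),
            "has_admin_match": bool(matches),
        })
    return linked


# Toy data, invented for illustration only.
questionnaire_rows = [
    {"study_id": "A1", "self_reported_diabetes": True},
    {"study_id": "B2", "self_reported_diabetes": False},
]
admin_rows = [
    {"study_id": "A1", "admission_year": 2019},
    {"study_id": "A1", "admission_year": 2021},
]

linked_rows = link_records(questionnaire_rows, admin_rows)
```

Keeping unmatched questionnaire rows (rather than dropping them) mirrors the point that linkage adds to the self-reported data instead of replacing it.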
Linkage is conducted securely using de-identified data to protect participant privacy. Researchers access the linked data through the respective provincial data custodian. Depending on the custodian’s policies, the linked data may be transferred to the researcher through secure transfer or accessed remotely in a secure cloud environment. Data linkage involves approval processes and associated fees that fall outside the scope of CanPath.
The following information is provided to support researchers in navigating the data linkage process; however, please note that CanPath does not perform the linkage on behalf of researchers.
Is CanPath data linked?
CanPath questionnaire data is self-reported. Physical measures data, such as height, weight, and blood pressure, may be self-reported or collected through standardized procedures during participant assessments.
CanPath data has been pre-linked to environmental data from the Canadian Urban Environmental Health Research Consortium (CANUE). The CANUE linkage is complete; researchers can access the linked environmental data without additional linkage steps. To access CANUE-linked data, researchers must be affiliated with an institution participating in the SMART Consortium Agreement with DMTI Spatial Inc. and sign the CANUE Third-Party Agreement.
While self-reported CanPath data has shown agreement with linked data, CanPath data is not linked to administrative records or cancer registries by default. Researchers may enhance the data by requesting linkage to provincial administrative health data. This can improve data accuracy and completeness by validating self-reported outcomes and adding variables not captured through questionnaires, such as hospitalization data, treatment outcomes, or diagnoses.
How can I link CanPath data?
Linkage requires approval from both CanPath and the relevant data custodian.
If a researcher intends to link CanPath data, the linkage must be detailed in the CanPath access application form. If linkage was not included in the initial application, and the approved user later opts to link data, an amendment must be submitted for review and approval by the CanPath Access Committee. Applications involving linkage will undergo a full review by the CanPath Access Committee. After CanPath Access Committee approval, researchers may proceed with the linkage process.
To link CanPath data, researchers must contact the relevant provincial data holders or cancer registries directly or apply through the Health Data Research Network (HDRN Canada) Data Access Support Hub (DASH).
HDRN Canada is a pan-Canadian network of member organizations that hold linkable health and health-related data for entire populations and/or have mandates and roles relating directly to access to or use of those data. If a project includes linkages across more than one province, please connect with DASH for help navigating the multi-provincial data linkage and access process.
Does CanPath link the data for my project?
The data linkage process falls outside the scope of CanPath, and the researcher must arrange the required access approvals directly with the relevant provincial data holder or through the HDRN DASH.
Additional linkage fees apply and are not covered by CanPath or reflected in the CanPath Cost-Recovery Access Fees. Once all the approvals are in place and the data holder notifies CanPath that the linkage will proceed, the CanPath access team will work with the designated data holder to facilitate the linkage on the back end.
How soon can I access linked data?
Please connect with the relevant data holder or DASH early in the project planning phase to understand the expected timelines and fees involved in the data linkage process.
We recommend planning for a minimum timeline of six months to complete the necessary approvals, data access agreements, and linkage processes with various provincial data holders. Timelines may vary depending on the type of data requested and the specific requirements of each provincial data custodian.
Where can I find more information on linkages?
Please review the Supplemental Letter on Data Linkage, which outlines the steps required to access administrative and cancer registry data.
Download the Supplemental Letter on Data Linkage
Should I link to cancer registry data or administrative data?
This depends on the data required to complete the proposed analysis. Please review each relevant provincial data custodian’s holdings to determine which data sources are best suited for the proposed project.
The data available from provincial data custodians varies in scope and may not be harmonized across regions. This data was collected for different purposes, at different times, and under different systems across jurisdictions. As a result, the datasets’ structure, completeness, and comparability can vary.
Researchers should carefully review the metadata and consult with the data custodian to understand the nuances of the data before including it in a multi-jurisdictional analysis. We encourage researchers to review each custodian’s data holdings and reach out to them directly to confirm whether the required data is available, appropriate for the research project, and accessible under their terms.
In general, if a project requires only cancer incidence data, researchers may apply directly to the relevant provincial cancer registry or, where available, through the CanPath regional partner, as outlined in Table 1.
If the CanPath project involves additional administrative health data (e.g., hospitalization data, physician billing data, vital statistics, treatment outcomes), researchers must apply through the appropriate provincial administrative data holder.
Cancer registry and administrative data holders are outlined in Table 1. For CanPath projects involving data linkage across two or more provinces, the HDRN DASH team can help coordinate access across regions and advise on data availability and project feasibility.
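The guidance above can be summarized as a small decision sketch. The function name `suggest_application_route` and its inputs are hypothetical, and the returned text is only a starting point: each custodian's own requirements and the Supplemental Letter on Data Linkage take precedence.

```python
from typing import Iterable


def suggest_application_route(provinces: Iterable[str],
                              needs_admin_data: bool) -> str:
    """Return a suggested first point of contact for a linkage request.

    provinces: provinces whose data the project would link to.
    needs_admin_data: True if the project needs administrative health data
    (e.g., hospitalizations, physician billing), not only cancer incidence.
    """
    distinct = {p.strip().lower() for p in provinces if p.strip()}
    if not distinct:
        raise ValueError("At least one province is required.")

    # Two or more provinces: DASH can help coordinate access across regions.
    if len(distinct) >= 2:
        return "HDRN Canada DASH"

    # Single province, administrative health data needed.
    if needs_admin_data:
        return "provincial administrative data holder"

    # Single province, cancer incidence data only.
    return "provincial cancer registry or CanPath regional partner"
```

For example, `suggest_application_route(["Ontario"], needs_admin_data=False)` points to the cancer registry route, while listing both Ontario and Alberta points to DASH.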
Please see Table 1 below for the data holders in each province. Access processes are outlined in the Supplemental Letter on Data Linkage.
| Province | Administrative Health Data Holder | Cancer Registry Data Holder |
| --- | --- | --- |
| British Columbia | PopData BC. To access this data, email dataaccess@popdata.bc.ca and visit the PopData BC website for more info. | BC Cancer Registry. This data can be accessed by applying to the BC Generations Project. |
| Alberta | Alberta Health Services (AHS) and Alberta Health (AH) via Alberta SPOR SUPPORT Unit (AbSPORU). To access this data, contact the Data Access Support Hub (DASH). | Alberta Cancer Registry. This data can be accessed by applying to Alberta’s Tomorrow Project (ATP). |
| Ontario | Institute for Clinical Evaluative Sciences (Ontario) (ICES). To access this data, please contact ICES Data Analytics Services (DAS). | Cancer Care Ontario (CCO) – Ontario Health. To access this data, apply directly to the OH(CCO) Data Disclosure Subcommittee (DDSC) by emailing OH-CCO_Datarequest@ontariohealth.ca. |
| Nova Scotia* | Health Data Nova Scotia (HDNS) | Nova Scotia Cancer Registry. This data can be accessed by submitting an application to Atlantic PATH. |
| Newfoundland and Labrador* | Newfoundland and Labrador Centre for Health Information (NLCHI) | NL Cancer Registry. This data can be accessed by submitting an application to Atlantic PATH. |
| New Brunswick* | DataNB (previously New Brunswick Institute for Research, Data and Training (NB-IRDT)) | CanPath data linkage is not yet supported. |
| Prince Edward Island* | Secure Island Data Repository (SIDR) | CanPath data linkage is not yet supported. |
| Quebec | Institut de la statistique du Québec (ISQ) is the data custodian for Quebec health data. To access this data, either apply to ISQ directly by completing an online access request through “Guichet Unique” or apply through CARTaGENE. | ISQ (see the administrative health data column). |
| Manitoba | CanPath data linkage is not yet supported. | CanPath data linkage is not yet supported. |
| Saskatchewan | CanPath data linkage is not yet supported. | CanPath data linkage is not yet supported. |
*To request administrative data from:
- One administrative data holder and the Atlantic PATH cohort only, please submit a request to Atlantic PATH.
- Multiple administrative data holders and/or two or more regions/cohorts, please connect with HDRN’s DASH.
See the Supplemental Letter on Data Linkage for more details.
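The footnote for the Atlantic provinces can be read as a simple rule, sketched below. The function `atlantic_request_route` is hypothetical, and it assumes that a cohort count of one means the Atlantic PATH cohort alone.

```python
def atlantic_request_route(admin_holders: int, cohorts: int) -> str:
    """Suggest where to send an administrative data request for Atlantic provinces.

    admin_holders: number of administrative data holders involved.
    cohorts: number of regions/cohorts involved (1 is assumed to mean
    the Atlantic PATH cohort only).
    """
    if admin_holders < 1 or cohorts < 1:
        raise ValueError("Counts must be at least 1.")

    # One holder and the Atlantic PATH cohort only: request goes to Atlantic PATH.
    if admin_holders == 1 and cohorts == 1:
        return "Atlantic PATH"

    # Multiple holders and/or two or more regions/cohorts: connect with DASH.
    return "HDRN Canada DASH"
```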