Development of advanced pan-disease predictive models using CanPath questionnaires

Principal Investigator: Dr. John Lewis

Affiliation: University of Alberta

Start Year: 2021

This research project aims to improve our understanding of many different diseases, and our ability to predict them, by analyzing CanPath questionnaire data with conventional and advanced machine learning models. A standardized data analysis platform will be used to create several types of disease predictive models, each of which will provide insights into the risk factors for the disease it predicts. Some of the advanced models in this platform have previously been applied to prostate cancer and have shown significantly better predictive accuracy than the best publicly available prostate cancer predictive models, which illustrates the platform’s potential value for predicting many other diseases. Because the platform requires no functionality changes to create predictive models for different diseases, it can be applied efficiently to the prediction of dozens of diseases, such as cancers, cardiovascular diseases, pulmonary diseases, diabetes, and neurological disorders, using CanPath questionnaire data. Once predictive models have been created for every disease, they will be analyzed to understand the global risk factors of disease, as well as how the risk associated with each factor varies across subpopulations, such as different age groups and genders. The predictive models will be made available for public use, and new insights about disease risk factors will be published to improve humanity’s understanding of disease.