Trainees built confidence in AI- and equity-focused methods for public health research

Posted September 26, 2023

Students at the AI4PH Summer Institute who took part in the CanPath Student Dataset challenge

This summer, the Artificial Intelligence for Public Health (AI4PH) Summer Institute hosted twenty-two graduate students, post-doctoral fellows, and early-career researchers at the Fields Institute at the University of Toronto (U of T). Renowned for its unique cross-disciplinary approach, the Summer Institute empowers trainees to enhance their expertise in artificial intelligence (AI) and big data. Their goal? To use these cutting-edge skills to drive meaningful change in tackling major public health issues and promoting health equity.

Many trainees were based at U of T, but others travelled from across the country, from British Columbia to Quebec. The trainees also came from a range of disciplinary backgrounds, such as epidemiology, computer science, and statistics.

This year’s theme was “Causal Inference using Machine Learning Methods,” meaning that the trainees learned to use data and algorithms to uncover cause-and-effect relationships among many variables. Over five days, the trainees participated in various learning sessions, including a data challenge that progressed from selecting and preparing data to creating machine-learning models, evaluating model accuracy, and ultimately drawing causal conclusions.

The trainees were supplied with data from the Canadian Partnership for Tomorrow’s Health’s (CanPath) Student Dataset for their challenge. CanPath is a population-health research platform based at U of T’s Dalla Lana School of Public Health. This platform focuses on various chronic diseases and cancers in Canadian adults.

“As you can imagine, CanPath upholds the highest of standards for data integrity, security, and privacy to honour Canadian participants’ consents and the rules of responsible conduct of science,” noted Sheraz Cheema, CanPath Data Manager. “By developing a student dataset, we’ve enabled the use of a sample of the data in a teaching environment.”

The Student Dataset was developed to provide students with hands-on experience with ‘real-world’ population health data while ensuring the privacy and confidentiality of the study participants. It is a synthetic dataset, meaning that it was manipulated to mimic CanPath’s nationally harmonized data but does not include or reveal actual data of any CanPath participants.

“This data challenge, developed using CanPath student data, is an amazing training opportunity to effectively introduce advanced causal inference techniques and how they can be applied to complex health studies using real-world data,” says Prof. Kuan Liu, AI4PH mentor and Assistant Professor in Health Services Research at U of T’s Institute of Health Policy, Management and Evaluation (IHPME).

Students listen to presentations from their peers using the Student Dataset

After three full days of seminars, workshops, and data challenge work, the trainees were ready to showcase their impressive findings.

Students present how the CanPath Student Dataset supported their research at the Summer Institute

The trainees were asked to consider how the skills they acquired during the data challenge could benefit their future research endeavours. They found great value in the immersive and interdisciplinary nature of the summer institute, relishing the opportunity to collaborate on research projects within a dynamic and cross-disciplinary setting.

“Given my Master of Science in Computer Engineering and AI, coupled with my current studies in Community Health Science at the University of Manitoba, I believe programs like AI4PH are invaluable,” says Hassan Maleki Golandouz, PhD Student at the University of Manitoba. “They cater to individuals like me, aiming to bridge AI and health sciences. This synthesis allows for a comprehensive understanding of health-related problems, formulating them with the right techniques, and eventually solving them.”

To learn more about AI4PH, their Summer Institute, courses, and traineeships, visit their website at

You can also learn more about the CanPath Student Dataset and how it can be utilized for your academic course or event at

The AI4PH Summer Institute is funded by the Canadian Institutes of Health Research and is led by Prof. David Buckeridge (McGill University), Prof. Laura Rosella (University of Toronto), Prof. Lisa Lix (University of Manitoba), Prof. Nathaniel Osgood (University of Saskatchewan), and Prof. Joon Lee (University of Calgary).