Clinical trials can offer life-saving treatments to patients in need. While most clinical trials fail to meet their intended endpoints, many succeed. However, not everyone has an equal opportunity to participate in clinical trials.
Historically, racial and ethnic minority groups have been underrepresented in clinical trials. In 2020, the U.S. Food and Drug Administration reported that 75% of clinical trial participants were white, while only 6% were Asian, 8% were Black, and 11% were Hispanic.1 This disparity extends beyond race and ethnicity to include young people, older adults, LGBTQIA+ people, people living with disabilities, people living in rural communities, and people with comorbidities.
The underlying reasons behind this lack of representation are varied and complex. Some factors are sociological, while others are physiological. For example, a recent study found that African and Middle Eastern Americans were excluded from certain cancer clinical trials due to a naturally occurring difference in their blood cell counts.2
These barriers highlight the urgent need for inclusivity in clinical trials. Without representation, these populations not only lack early access to therapies, but the potential value of novel therapies for these populations may not be factored into approval decisions. In some cases, this could preclude access to drugs that would otherwise benefit patient subgroups.
Fortunately, AI can help bring underrepresented patient populations into clinical trials through “virtual patients.” Virtual patients are computer-generated models that mimic the physiological and biological diversity seen in real patients, and they allow us to simulate the inclusion of underrepresented groups into the clinical trial process. These virtual patients are created by using AI to augment existing patient data with synthetic data that fills in the gaps of missing information.
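To make "filling in the gaps of missing information" concrete, here is a minimal sketch of nearest-neighbor imputation, one common way to estimate a missing measurement from similar patients. The function name `knn_impute`, the feature values, and the choice of two neighbors are illustrative assumptions, not a description of any specific virtual-patient pipeline. In practice, a maintained implementation such as scikit-learn's `KNNImputer` would likely be preferred.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k most similar rows."""
    def distance(a, b):
        # Compare only features observed in both rows, averaged so that
        # pairs sharing different numbers of features stay comparable.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for col, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors are other rows where this feature was observed.
            donors = [r for j, r in enumerate(rows) if j != i and r[col] is not None]
            donors.sort(key=lambda r: distance(row, r))
            nearest = donors[:k]
            if nearest:
                completed[i][col] = sum(r[col] for r in nearest) / len(nearest)
    return completed

# Hypothetical, already-scaled patient features; None marks a missing measurement.
records = [
    [0.20, 0.35, 0.50],
    [0.25, None, 0.55],
    [0.80, 0.90, 0.10],
    [0.22, 0.30, 0.52],
]
filled = knn_impute(records)
```

Here the second record's missing value is estimated from the two records that most resemble it on the features that were observed, while the dissimilar third record is ignored.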
Using generative models (CTGAN, TVAE, Gaussian copula), along with imputation techniques (MICE, KNN, iterative imputation) and oversampling methods (SMOTE, SVM-SMOTE, Borderline-SMOTE), synthetic data can be generated to closely resemble the characteristics of underrepresented patients, thereby enhancing the representation of these groups. Explainable AI models can then analyze this data to simulate how various subgroups might respond to different treatments, considering factors like age, gender, ethnicity, coexisting conditions, and lifestyle. Furthermore, virtual patients can receive multiple treatments at once, allowing researchers to assess the effectiveness of various interventions on individual patients or groups in a controlled environment.
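As a rough illustration of the oversampling idea, the sketch below implements the core step of SMOTE: each synthetic record is placed at a random point on the line between a real minority-group record and one of its nearest minority-group neighbors. The function name `smote_oversample`, the two scaled features, and the parameter values are assumptions made for the example. A real project would more likely use a maintained implementation such as `SMOTE` from the imbalanced-learn library, and would validate the synthetic records clinically.

```python
import math
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-group neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbor = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic

# Hypothetical, already-scaled features for an underrepresented subgroup
# (for example, age and a blood cell count).
minority_rows = [
    [0.20, 0.35],
    [0.25, 0.30],
    [0.30, 0.40],
    [0.22, 0.45],
]
virtual_rows = smote_oversample(minority_rows, n_new=6)
```

Because every synthetic record is an interpolation between two real ones, the generated values stay within the range already observed in the subgroup. This is also a limitation: the method cannot represent patients unlike any in the original data.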
Artificial intelligence affords researchers an opportunity to address several gaps in the design of clinical trials, as well as in their outcomes. Because AI techniques can learn and extrapolate from a wide variety of data points spanning diverse patient attributes, AI-driven insights can improve patient representation within trials and provide additional understanding of how results may affect populations not directly included in trials.