Vice President of AI Research
Accelerating COVID-19 Research
Malai Sankarasubbu has a vision: to accelerate the drug development timeline by 50% over the next five years. But when the entire world called on drug developers to deliver treatments for the novel coronavirus, Malai kicked his vision into high gear to help drug developers accelerate their science as fast as possible.
Malai helped pioneer Saama’s Smart Data Quality (SDQ), a domain-centric deep learning/AI system that learns patterns to decide whether queries need to be raised during clinical research. In 2020, a major pharmaceutical company used SDQ in its large COVID-19 vaccine trial to shave an entire month off the clinical development process and bring a much-needed vaccine to market to help fight the global pandemic. With Malai’s help in building SDQ into process and technology optimizations, the time from data entry to data cleaning for that trial was reduced from 30 days to 22 hours.
Malai also helped develop an interpretable machine learning classifier that discriminates COVID-19 positive coughs from both COVID-19 negative and healthy coughs recorded on a smartphone. This type of screening is non-contact and easily applied. It helps reduce the workload in testing centers and limits transmission by recommending early self-isolation to those who have a cough suggestive of COVID-19. Because it relies on a commonly used smartphone, cough audio classification is cost-effective and easy to deploy.
Malai was also integral to another COVID-19-related milestone in 2020. In response to the pandemic, the White House, the Allen Institute for AI (AI2), and leading research groups created the COVID-19 Open Research Dataset (CORD-19), a publicly available database containing more than 140,000 scholarly articles about COVID-19, SARS-CoV-2, and related coronaviruses. Recognizing the urgent need to expedite COVID-19 research, Malai spearheaded the development of a semantic search capability within days to help the global research community use this database even more effectively. On any given day, the semantic search engine received 20,000 to 30,000 queries, helping researchers, medical professionals, and the general public access the continuous flow of COVID-19 data faster than previously possible.