New Big Data Initiatives in Oncology

As cancer therapy moves toward the concept of “real-time” oncology — the ability to continuously monitor changes in a patient’s disease and adjust treatment accordingly — big data and high-powered analytical approaches are playing an increasingly key role. The aim of such efforts is to find novel biomarkers and a new understanding of cancer biology, drug activity, and adverse events that can be employed to optimize clinical trials, personalize treatment options, and predict therapeutic outcomes. At the recent European Society for Medical Oncology (ESMO) meeting, two new partnerships were announced with such a big data focus.


The first was a major open-access collaboration between Merck KGaA and Project Data Sphere LLC, a non-profit initiative of the CEO Roundtable on Cancer’s Life Sciences Consortium (a non-profit corporation founded by President George H.W. Bush in 2001 to develop and implement initiatives that reduce cancer risk, enable early diagnosis, facilitate access to treatments, and hasten the discovery of novel, more effective therapies). The two groups will work together to greatly expand Project Data Sphere’s existing databases, which currently contain de-identified historical clinical data from approximately 100,000 patients across multiple institutions. The collaboration will add data from rare tumor trials, as well as experimental and real-world patient data supplied by the National Cancer Institute and other groups. The goal is to use big data and analytical power to help optimize clinical trials, better define personalized treatment options, predict outcomes, and address other unmet needs in oncology. The collaboration will also study immune-mediated adverse events to help regulators develop better treatment guidelines and models that can identify potential adverse events earlier in the drug development process.


The second partnership announcement was an extension of Thermo Fisher Scientific’s Next-Generation Sequencing Companion Dx Center of Excellence Program (COEP) in Europe, marked by a second agreement, this one with the Institute of Pathology Heidelberg (IPH), aimed at developing a Center of Molecular Pathology at Heidelberg University Hospital, the site of Germany’s largest tissue biobank. Thermo Fisher has been forming strategic collaborations around its Oncomine portfolio of research panels and NGS technologies to help spur the development of companion diagnostics in Europe. The IPH Center of Molecular Pathology is the second organization to join the COEP, after the Institute of Medical Genetics and Pathology at University Hospital Basel, which joined in April. The international reference laboratories LabCorp and Cancer Genetics Inc. have also joined the COEP to help develop and commercialize Oncomine-based tests in both Europe and the United States.


We believe these partnerships will become increasingly critical as companies look to better leverage biomarker and clinical trial data and work real-world data into the mix. The goal will be to make it easier for oncologists to incorporate best practices and better drug choices identified across institutions, along with data on patient groups and outcomes outside of formal clinical trials. An example of this latter goal is the set of partnerships signed in June between the big data analytics platform CancerLinQ and the FDA and National Cancer Institute, giving CancerLinQ users access to data from the NCI’s Surveillance, Epidemiology, and End Results (SEER) program, a key data source on cancer outcomes and survival rates in the United States. SEER includes data from 18 cancer registries covering approximately 30 percent of the US population, encompassing patient demographics, diagnoses, tumor morphology and stage at diagnosis, treatment data, lab data, and follow-up records.


As yet, however, applications of such big data and artificial intelligence (AI) approaches to the improvement of clinical medicine remain in their infancy. IBM’s Watson is a case in point: the company began selling the supercomputer’s services to doctors three years ago to recommend the best cancer treatments, but Watson’s performance is still not living up to expectations. Only a few dozen hospitals have adopted the system, and doctors outside the United States complain that Watson’s recommendations are biased toward American patients and methods of care. The amount of data available for analysis remains comparatively limited, and patient diversity is an issue. Moreover, much patient data must be entered manually, and the system has trouble dealing with the idiosyncrasies and inconsistencies of medical records. Perhaps most limiting, Watson is as yet unable to create new knowledge, still relying heavily on the recommendations and treatment preferences of a few dozen physicians at partner Memorial Sloan Kettering, the system’s trainers. Even some early partners, including MD Anderson, have shelved their work with IBM. On the positive side, patients at hospitals in remote areas of the world with no oncology specialists have benefited greatly from Watson’s recommendations. Watson also provides doctors with the best literature on a given treatment, and it empowers patients by offering information on specific treatment plans, with supporting literature, that helps them make better-informed decisions about their care.


Other early efforts by Google (DeepMind and Verily’s Baseline study) have also had limited success to date, underscoring that applications of big data to health care still have far to go.