Identifying the specific genetic mutations that cause cancer has always been a challenge. The EU-funded NONCODRIVERS project offers a solution with a pioneering approach that applies machine-learning-based modelling to tumour data. This could lead to more personalised therapies that save the lives of thousands of citizens every year.

Mutations that occur within healthy cells can lead to the development of tumours. While hereditary factors can play a role, some mutations are caused by environmental factors, such as UV light affecting skin cells, or the carcinogens in tobacco. Mutations also occur randomly when our cells divide, a critical biological process that enables us to grow and replenish tissue.

“Most mutations do not alter the function of the cell,” explained NONCODRIVERS project coordinator Nuria Lopez-Bigas, research professor at the Institute for Research in Biomedicine in Spain. “We call these passenger mutations.”

However, some mutations may give a cancer cell the opportunity to proliferate. A cell with several of these ‘driver’ mutations may get to the point where uncontrolled proliferation leads to tumour formation. An effective means of identifying these driver mutations could enable the development of novel therapies that directly target these proto-cancerous cells.

Interpreting genetic data

Genome sequencing allows researchers to identify all the mutations in a tumour sample, which can number in the thousands. Yet most of these mutations have nothing to do with cancer development.

“To lead to effective treatments, we felt that the interpretive step had to be improved,” said Lopez-Bigas. “We needed to be able to say which mutations are cancer mutations.”

To fill this gap, gene sequencing data from over 28 000 tumours from around the world was collected. The EU-funded NONCODRIVERS project, supported by the European Research Council, was innovative in using artificial intelligence to lead the analysis of this data.

“There is already a lot of data about human tumours. We wanted to see if machine learning could be applied to this data, to help us to better differentiate between passenger and driver mutations.”

Using machine learning software trained on cancer genes, the team generated a model able to classify all possible variants in cancer genes as drivers or passengers. Lopez-Bigas and her team have so far been able to generate driver mutation maps for 84 cancer genes.
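To give a feel for what classifying variants as drivers or passengers involves, here is a minimal sketch. It is not the project's actual method: it swaps in a simple nearest-centroid classifier, and the three features, the training values and the variant names are all hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical training data. Each mutation is described by three invented
# features: (recurrence across tumours, conservation score, in functional domain).
TRAINING = [
    ((0.9, 0.8, 1.0), "driver"),
    ((0.8, 0.9, 1.0), "driver"),
    ((0.7, 0.7, 1.0), "driver"),
    ((0.1, 0.2, 0.0), "passenger"),
    ((0.2, 0.1, 0.0), "passenger"),
    ((0.0, 0.3, 0.0), "passenger"),
]


def fit_centroids(examples):
    """Return the mean feature vector (centroid) for each label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(features, centroids):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


centroids = fit_centroids(TRAINING)

# Hypothetical candidate variants in a single gene, keyed by invented names.
candidates = {
    "variant_A": (0.85, 0.75, 1.0),
    "variant_B": (0.05, 0.15, 0.0),
}

# A toy "driver mutation map": every candidate variant gets a label.
driver_map = {name: classify(f, centroids) for name, f in candidates.items()}
print(driver_map)
```

A real model would be trained on features derived from thousands of sequenced tumours and would score every possible variant in a gene, but the principle is the same: learn from known examples, then label the rest.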

Towards personalised treatments

Knowing where driver mutations can be found for a specific cancer type could bring enormous patient benefits. Instead of administering a generic course of chemotherapy, which may destroy cells that have nothing to do with cancer development, more personalised drugs can be given to target these driver mutations. “You fix the thing that is not working,” said Lopez-Bigas.

The project’s results are already being applied in clinical settings. Its Cancer Genome Interpreter tool, freely available online, can be downloaded and used by researchers and medical professionals alike to help them identify potential biomarkers for specific cancers. Users input the cancer type, along with more specific and personalised details, to receive information about predicted drug responses.

“The key achievement so far has been the successful demonstration of this technology,” said Lopez-Bigas. “We have shown that if we have enough data, then we can build informative computer models. What we need now though is more data, so we can generate more models for more cancer genes and cancer types. The more data we have, the more we can achieve.”

Lopez-Bigas also recognises the need to personalise the tool further, so gathering more data is a critical next step. “What we recognise as a driver for, say, lung cancer might not actually be the driver for a particular patient. The patient’s age, whether they smoke, whether other mutations exist in the cell, all these things matter.”

Additional context could therefore make the tool more accurate. Yet with more data involved, the models become ever more complex. “With NONCODRIVERS, it feels like we are on a hiking path,” said Lopez-Bigas. “We can see the top of the mountain, but it is still a long way off. We are still at the beginning of this journey.”