Xin Guo, UC Berkeley Coleman Fung Chair professor, led a team of campus researchers along with a group of international medical researchers in developing a method to detect early signs of cancer through blood tests.
The researchers collected blood samples from hospital patients and analyzed the methylation levels in their DNA. They focused on lung cancer data and developed classifiers using machine learning methods.
The classifiers identify a patient’s cancer risk status based on the patient’s biomarker values.
Guo, who led the data analysis using machine learning techniques, explained that DNA methylation can provide potential biomarkers for assessing cancer risk.
She said the cancer data is formatted as an “m” by “n” matrix, with “m” corresponding to the number of samples and “n” to the number of biomarkers. Each sample carries a label of healthy, tumor or benign, according to Guo.
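As a rough illustration of that layout, and not the team's actual data or code, the matrix and its labels might be represented as in the sketch below. The sample count, biomarker count and every value are invented for the example.

```python
# Hypothetical toy data: m = 4 samples (rows), n = 3 biomarkers (columns).
# Each value stands in for a methylation level; the numbers are invented.
X = [
    [0.12, 0.80, 0.33],
    [0.10, 0.75, 0.40],
    [0.55, 0.20, 0.91],
    [0.30, 0.50, 0.60],
]

# One label per sample, using the three categories Guo described.
y = ["healthy", "healthy", "tumor", "benign"]

m = len(X)     # number of samples
n = len(X[0])  # number of biomarkers

# Every row must hold n values, and every sample must have a label.
assert all(len(row) == n for row in X)
assert len(y) == m

print(f"{m} samples x {n} biomarkers")
```

In the study Guo described, “n” would be on the order of thousands of biomarkers rather than three.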
As part of the international collaboration, the UC Berkeley team focused on data analysis, while the international medical researchers focused on data collection, data cleaning and writing the paper.
For the first time, Guo added, the research team achieved high accuracy in early lung cancer detection while using fewer biomarkers, approximately 2,000. In comparison, traditional medical practice uses 8,000 biomarkers.
“It is very difficult to make early detection for most cancers and in particular for lung cancers, from blood samples, because of the very weak and noisy signals embedded in the blood samples,” Guo said.
Previous machine learning techniques for detecting cancer have never met the medical gold standard of 80/99, Guo noted. The criterion applies to two-class classification, in which each sample is classified as either tumor or healthy.
At 80% sensitivity, 80 of every 100 tumor samples are predicted as tumors, while the remaining 20 are predicted as healthy. At 99% specificity, only one of every 100 healthy volunteers is predicted to have a tumor.
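The arithmetic behind the 80/99 criterion can be checked with a short sketch. It uses the standard definitions of sensitivity and specificity and the counts from the example above; the function names are made up for illustration and are not taken from the team's code.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual tumor samples that are predicted as tumors."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of actual healthy samples that are predicted as healthy."""
    return true_neg / (true_neg + false_pos)


def meets_80_99(sens, spec):
    """Hypothetical helper: True if both thresholds of the 80/99 criterion are met."""
    return sens >= 0.80 and spec >= 0.99


# 100 tumor samples: 80 predicted as tumor, 20 predicted as healthy.
sens = sensitivity(true_pos=80, false_neg=20)

# 100 healthy volunteers: 99 predicted as healthy, 1 predicted as tumor.
spec = specificity(true_neg=99, false_pos=1)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
print("meets 80/99:", meets_80_99(sens, spec))
```

A classifier that caught 79 of 100 tumors, or that flagged 2 of 100 healthy volunteers, would fall short of the criterion under this check.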
“Not only does it make early lung cancer detection possible, but also it reduces exponentially the cost of such detection techniques because of the smaller numbers of biomarkers needed for the detection,” Guo said.
Guo is optimistic about the future of this research and said it could lead to more accurate ways of distinguishing benign tumors from cancerous ones. She hopes machine learning techniques will become more accessible to the general population in the future.
Guo believes that machine learning will improve the medical field and lead to significant breakthroughs. She added that she wants to continue conducting research and collaborating with medical professionals and research scientists from various fields.