In the Computational Research Division at Lawrence Berkeley National Laboratory, researchers have made strides in identifying individuals at risk for suicide as part of an ongoing study with the Department of Veterans Affairs, or VA.
The VA and the Department of Energy, or DOE, are collaborating on the project, which combines the VA’s extensive genomic and health data with the DOE’s innovations in computing and analytics to better predict and prevent suicide, among other health issues affecting veterans.
The project applies deep learning models and patient-specific algorithms to electronic health record, or EHR, data to identify individuals at risk for suicide. Berkeley Lab has started developing algorithms on a preliminary data set known as MIMIC-III, a publicly available EHR data set covering 40,000 patients at a Boston hospital.
“The tools would be used to create a clinical decision support system that assists VA clinicians in suicide prevention efforts, and helps to evaluate effectiveness of various prevention strategies,” said United States Secretary of Energy Rick Perry in a statement on the DOE website.
Silvia Crivelli, a program manager at Berkeley Lab and head researcher of the project, said part of the challenge is that the data set is “noisy.” It requires “very powerful computing” and “very good algorithms,” according to Crivelli.
The Berkeley Lab team, led by Crivelli, used structured data such as demographics, prescribed medications, lab work and procedures. The team also used unstructured data, consisting of doctors’ notes and discharge notes, to assess suicide risk, according to a Berkeley Lab article. One of the study’s goals, and one of its challenges, has been to identify patterns in the unstructured data that indicate a suicide risk.
“There is a big process of getting data ready for machine learning,” said Katherine Yelick, associate lab director for Computing Sciences at Berkeley Lab.
Crivelli and Yelick said that to develop algorithms that identify suicide risk factors, the research team must look at words and the context in which they appear in the unstructured data. The researchers said this is especially challenging given the wide range of concepts that could indicate risk. For example, a broad concept such as social isolation could appear in the unstructured data in a variety of ways.
Suicide is the 10th leading cause of death in the United States and ranks even higher in the veteran community, and the project aims to prevent these deaths, according to Perry’s statement. Crivelli explained that the ultimate goal of the study is to harness the power of computing to help make better diagnoses.
“We are trying to see whether we can identify patterns that have not even (been) identified by medical professionals,” Crivelli said.