Yana Feldman gave a talk at the CITRIS Research Exchange event Wednesday about the use of machine learning for analyzing nuclear proliferation data.
Feldman, who is a nonproliferation and international safeguards analyst at Lawrence Livermore National Laboratory, spoke about the challenges of researching nuclear proliferation activities using open-source information.
“There is just so much information that is becoming available in open sources,” Feldman said. “There is a lot of nuclear data out there, but relevant to everything else, it is a needle in a haystack.”
Feldman noted the importance of visual information, as there could be hidden context in images that cannot be found in the text. For example, in a picture of a nuclear centrifuge in North Korea, there could be more information than what is listed, such as the capabilities of the weapons and where they might come from.
Feldman also said the way analysts currently look for information is inefficient, as search engines do not always return the most relevant results.
“The volume, the rate, the diversity of the information, open-source, visual information in particular,” Feldman said. “It is outpacing and overwhelming the human analysts … (it) is too much for us to cope.”
A team of analysts, machine learning specialists and high-performance computing specialists built a “semantic wheel,” a multimodal framework that learns how to read text, images and videos and map the open-source data, according to Feldman. The analysts applied this framework to nonproliferation, using the information to assess the capability of other states to develop a nuclear fuel cycle or nuclear weapons.
The framework includes unimodal and bimodal systems. A unimodal system searches within a single type of data, such as from image to image or from video to video. A bimodal system searches across types of data, such as from image to video or from text to video.
The group trained the systems to rely on far fewer subject-specific pairs, allowing for much quicker integration of new types of data beyond text, images and video. They found success with both modes, as both systems produced relevant search results.
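The difference between unimodal and bimodal search can be illustrated with a minimal Python sketch. It assumes, as is common in multimodal retrieval, that every item is represented as a vector in one shared space, so that a query of any type can be compared with items of any other type. The talk did not describe the team's implementation, so the item names, the vectors and the `search` function here are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical index: (item id, modality, made-up embedding in a shared space).
INDEX = [
    ("img_centrifuge", "image", [0.9, 0.1, 0.0]),
    ("img_reactor", "image", [0.1, 0.9, 0.0]),
    ("vid_centrifuge_hall", "video", [0.8, 0.2, 0.1]),
    ("vid_parade", "video", [0.0, 0.1, 0.9]),
]

def search(query_vec, target_modality, index=INDEX, top_k=1):
    """Rank items of one modality by similarity to the query vector."""
    scored = [
        (cosine(query_vec, vec), item_id)
        for item_id, modality, vec in index
        if modality == target_modality
    ]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:top_k]]

# Made-up query embeddings: one derived from an image, one from text.
image_query = [0.85, 0.15, 0.0]
text_query = [0.05, 0.1, 0.95]

unimodal_hits = search(image_query, "image")      # image to image
bimodal_hits = search(image_query, "video")       # image to video
text_to_video_hits = search(text_query, "video")  # text to video
```

In this sketch the only difference between the two modes is whether the query and the results share a type. A real system would have to learn the embeddings from data, which is where the training pairs come in.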
Feldman said she initially feared that machines and algorithms would replace human analysts because of their speed. She said she now realizes, however, that machines serve to make jobs like hers easier by increasing the amount of data available for analysis.
“It is in fact possible to get at data I did not previously have access to, and retrieve data using visual elements and to do this in a way that is much faster than before,” Feldman said. “As an analyst with this system, I would have access to more data and it would be prioritized for my attention.”