A $10 million grant awarded to UC Berkeley March 29 by the National Science Foundation will help advance efforts to organize, extract and analyze massive amounts of data.
The grant, paid in five annual installments of $2 million, will go to the campus’s Algorithms, Machines and People Lab to track and understand the vast volumes of data circulating through the Internet every day, known as “Big Data,” according to Michael Franklin, computer science professor and director of the lab.
“If we’re successful — whether they’re doing science, trying to understand things about the economy or urban science or trying to understand how to maximize their business — we will produce tools for all sorts of people to make sense of their data,” Franklin said.
So far, the lab’s research has focused on genome sequencing in cancer patients, highly detailed models of cities like San Francisco and real-time traffic monitoring to synchronize traffic lights and meters and ease congestion, said Alexandre Bayen, associate professor of systems engineering and a principal investigator of “Mobile Millennium,” the traffic monitoring system. The money will help with efforts to extrapolate patterns from the data and could significantly reduce traffic, he said.
The lab was initially funded by Google and SAP a year ago and is currently sponsored by 18 firms. The grant will nearly double the lab’s budget, Franklin said, and most of the money will be used to hire 15 or 16 more graduate and postdoctoral students.
The grant is part of a larger White House initiative for research into “Big Data,” and UC Berkeley was the sole university to receive funds from the foundation, according to Franklin. The hope is that the lab’s advancements can offer rare insight into the mechanics of society and improve research into such things as machine learning, cloud computing and crowdsourcing, he said.
But even with the grant, significant hurdles remain, Franklin said. Researchers must compile different sorts of information, like tweets or user-generated videos, and store the more than one billion terabytes of information created per year, a figure that is increasing by 60 percent annually.
Bayen echoed those concerns.
“The grant gives a lot of freedom to handle these fundamental questions about ‘Big Data,’” he said. “But the question is: How do we fuse all the information we have?”