UC Berkeley helps lead accelerated data sharing between top West Coast research institutions

UC Berkeley is one of two universities leading a project that will connect the data networks of more than 20 major research institutions on the West Coast.

The Pacific Research Platform, or PRP, will expedite collaboration between scientists from all over the Pacific region by enabling faster data transfer between institutions that are part of the platform, as well as faster access to the computing capabilities at these institutions, according to Amy Walton of the National Science Foundation. UC Berkeley and UC San Diego are leading the project.

The platform won $5 million in funding from the NSF after it demonstrated at a conference in March hosted by the Corporation for Education Network Initiatives in California that it could move data 500 times faster than speeds currently available to West Coast researchers, according to Tom DeFanti, co-principal investigator for the project.

Currently, scientists on the West Coast use independent subnetworks for research purposes. To separate research information, or “science traffic,” from general Internet traffic, or “enterprise traffic,” campuses build a Science DMZ, or demilitarized zone, into their networks, according to Eli Dart, a network specialist with the Department of Energy’s Energy Sciences Network, or ESnet.

“The militarization there turns out to be the defenses the campuses have to put up because of the fact that they’re getting 50,000 malicious hits a day,” DeFanti said. “This is because they’re open to the Internet.”

Each campus has its own Science DMZ so researchers can transfer data in a network not hindered by these defenses.

The platform will attempt to connect the Science DMZs of the more than 20 institutions involved, which include all 10 UC campuses, four supercomputer centers, the U.S. National Center for Atmospheric Research and the Lawrence Berkeley National Laboratory.

Jon Bashor, a colleague of Dart’s at the Lawrence Berkeley National Laboratory, said the NSF grant has helped make the PRP a “longer-term, more stable project.”

Larry Smarr, director of the California Institute for Telecommunications and Information Technology and the co-principal investigator of the project, conceived the idea of an aggregated Science DMZ and led the first experiments with the Pacific Research Platform.

According to Walton, in one of these tests, they were able to move 1.6 terabytes of data in four minutes. On the default intercampus Internet, a tenth of a terabyte could take up to three hours to transfer.
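
As a rough check on those figures, the two transfers can be converted to average throughput. This is a back-of-the-envelope sketch, not a measurement from the project: it assumes decimal units (1 terabyte = 10**12 bytes) and treats "up to three hours" as the worst case, so the resulting ratio is an upper bound for this particular comparison and is separate from the 500-times figure cited from the March demonstration. The function name is illustrative only.

```python
# Back-of-the-envelope throughput comparison using the figures quoted above.
# Assumption: decimal units, so 1 terabyte = 10**12 bytes.
TB = 10**12


def throughput_gbps(num_bytes, seconds):
    """Return average throughput in gigabits per second."""
    return num_bytes * 8 / seconds / 1e9


# Test transfer described by Walton: 1.6 terabytes in four minutes.
prp_gbps = throughput_gbps(1.6 * TB, 4 * 60)

# Default intercampus Internet, worst case: 0.1 terabyte in three hours.
default_gbps = throughput_gbps(0.1 * TB, 3 * 60 * 60)

# Ratio between the two average rates.
speedup = prp_gbps / default_gbps

print(f"Test transfer: {prp_gbps:.1f} Gbps")
print(f"Default path:  {default_gbps * 1000:.0f} Mbps")
print(f"Ratio:         {speedup:.0f}x")
```

Under these assumptions, the test transfer averages about 53 gigabits per second, against roughly 74 megabits per second in the worst case on the default path.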

What often happens in research is that the data itself is not housed where the researchers are located, according to Camille Crittenden, deputy director at the Center for Information Technology Research in the Interest of Society.

“Essentially any field that has data-intensive science at its core is likely to benefit significantly from this,” Dart said, referencing the fields of genomics, physics, earth sciences and climate science.

“You want to see geological changes in real time or to see the data that’s generated by some of these telescopes that are taking pictures of the sky throughout the night,” Crittenden said. “(PRP) can allow for timely analysis of data.”

This is especially vital to climate science and other fields that rely on data visualization, in which information is displayed in high-resolution video.

“(The data) can be from TV cameras, acoustical things, satellites — and these are often more than imagery that you can see. It’s multi-spectral imagery, so you’re getting a lot of different wavelengths of information,” DeFanti said. “So a place like the National Center for Atmospheric Research just pulls in huge amounts of data every day — it’s just massive. … Sometimes they let scientists compute on the data in their facilities, but that’s expensive because they have to provide the computing services.”

A lot of collaboration is necessary for a single research project, which, according to Crittenden, is cumbersome when the data sets being shared are very large.

“Either it takes a very long time to download via the usual networking platform that’s currently available, or they actually download the data on a hard drive and send them by mail,” Crittenden said.

DeFanti said the goal is to “make it feel like data is local.”

Crittenden and Dart are currently working to help scientists gain access to and understand the platform as it is developed.

Dart said that the size and diversity of the UC campuses will allow them to exploit the new platform.

“You get the Metcalfe’s Law effect, where the whole is greater than the sum of the parts,” Dart said. “The network utility of these things becomes much, much greater as more institutions connect.”
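
Metcalfe’s Law is usually stated as a network’s value growing roughly with the square of the number of connected nodes. One common way to illustrate it, sketched here, is to count the distinct pairs of sites that can exchange data directly. The institution counts are illustrative, chosen to match the "more than 20" figure above, and the function name is hypothetical.

```python
# Illustration of the Metcalfe's Law effect: count distinct pairs of
# connected institutions. The counts are illustrative, not project data.


def pairwise_links(n):
    """Number of distinct pairs among n connected institutions."""
    return n * (n - 1) // 2


for institutions in (5, 10, 20):
    print(institutions, "institutions ->", pairwise_links(institutions), "pairs")
```

In this sketch, doubling the number of connected institutions from 10 to 20 raises the number of possible pairings from 45 to 190, which is more than four times as many.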

Crittenden called the Pacific Research Platform “an example of a successful multi-campus effort.”

“It’s been really gratifying to see,” Crittenden said.

Contact Rachel Lew at [email protected] and follow her on Twitter at @Rlew12C.

Correction(s):
A previous version of this article stated that Eli Dart is a network specialist on the UC San Diego team that developed the platform. In fact, Dart works for the Department of Energy’s ESnet, where he developed the Science DMZ. He is not affiliated with UCSD.