UC Berkeley announces data science pipeline program for students

Anthony Suen/Courtesy

Updated 9/21/18: This article has been updated to reflect additional information from campus doctoral student Rosanna Neuhausler.

UC Berkeley announced the Data Collaboratives program Sept. 10. The program allows students to use data science to solve real-world problems.

The program, first discussed near the end of the spring 2018 semester, is intended to create a pipeline connecting students to data-driven projects.

The goal of the program, according to Anthony Suen, director of programs at the UC Berkeley Division of Data Sciences, is to “turn data into a social impact.”

“We’re not just purely seeking knowledge,” Suen said. “We’re trying to impact policy on a level regarding issues like water and public health.”

The program was funded in part by Schmidt Futures, a philanthropic initiative dedicated to the development of science and technology, according to a campus press release. Investment management firm Two Sigma was also an early supporter of the program, according to the press release.

The program was created to leverage existing resources on campus, such as Big Ideas@Berkeley (an annual contest that aims to fund and support students’ innovative ideas) and UC Berkeley’s D-Lab, a data-focused research lab. According to Suen, the team behind the Data Collaboratives program hopes that it will provide support for students, whether through courses or long-term data projects.

The program’s first big event — the California Water Data Hackathon — took place Sept. 14-15, and centered on issues of water quality and accessibility, employing data to address lack of resources in certain areas.

Campus doctoral students Rosanna Neuhausler and Olivia Hoang won the event Sept. 15. Their project, “Visualizing Voices,” used an algorithm to skim blog posts about water quality and determine whether toxins were discussed. If so, the date was cross-referenced with governmental water quality records in that area.
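The general idea can be illustrated with a minimal sketch. This is not the students’ actual code: the keyword list, the record fields, the 30-day matching window and every name below are hypothetical, chosen only to show how posts that mention toxins might be matched against official sampling records for the same area.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical keyword list; the article does not say which toxins were searched for.
TOXIN_TERMS = {"arsenic", "lead", "nitrate"}


@dataclass
class Post:
    area: str      # place the post is about (assumed to be known)
    posted: date   # date the post was published
    text: str


@dataclass
class Sample:
    area: str      # place where an official water sample was taken
    sampled: date  # date of the official sample


def mentions_toxin(text):
    """Return True if any keyword appears as a whole word in the text."""
    words = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
    return bool(words & TOXIN_TERMS)


def cross_reference(posts, samples, window_days=30):
    """Pair each toxin-mentioning post with official samples from the same
    area taken within window_days of the post (the window is an assumption)."""
    window = timedelta(days=window_days)
    results = []
    for post in posts:
        if not mentions_toxin(post.text):
            continue
        nearby = [
            s for s in samples
            if s.area == post.area and abs(s.sampled - post.posted) <= window
        ]
        results.append((post, nearby))
    return results


# Made-up example data, for illustration only.
example_posts = [
    Post("County A", date(2018, 3, 10), "Our well water tested high for arsenic."),
    Post("County A", date(2018, 3, 12), "The creek looks lovely today."),
]
example_samples = [
    Sample("County A", date(2018, 3, 1)),
    Sample("County A", date(2017, 1, 1)),
    Sample("County B", date(2018, 3, 9)),
]
matches = cross_reference(example_posts, example_samples)
```

In this sketch, a flagged post with an empty list of nearby samples would point to a place where residents report a problem but no official sample was recorded around that time.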

Neuhausler said in an email that the goal of their project was to make sure the voices of those affected by poor water quality are heard, considering that these individuals often live in rural communities where the government rarely samples the water.

“I think that the story behind this project is really great,” Hoang said. “It’s centered around giving a voice to people who normally aren’t heard.”

The program builds on existing data opportunities for students, including the Data Science Discovery Program, which groups students into small teams that conduct research alongside industry workers from the public and private sectors. Suen said he hopes the program will incentivize student interest, collaboration and research.

Suen hopes that people from different disciplines will collaborate to create more opportunities and a larger, more diverse network.

“We’re using student research and their teams as core agents,” Suen said. “There’s not just a dollar incentive — we want to test out something and let them bloom.”

Contact Alexa VanHooser at [email protected] and follow her on Twitter at @dailycalexaC.