When mining information systems for stories, journalists are increasingly casting a wider net into a world that experts have dubbed “big data” — information too large or complex for people to sort through in traditional ways.
Reporters, professors, attorneys and other guests gathered at the Logan Symposium, one of the world’s largest investigative reporting conferences and now in its eighth year, where a panel on the first day of the three-day event discussed the potential and pitfalls that big data presents for journalists. The symposium is hosted by the Investigative Reporting Program at UC Berkeley’s Graduate School of Journalism.
Jennifer LaFleur, a senior editor of data journalism at the Center for Investigative Reporting, said during the panel that many newsrooms across the country do not have the software tools to work with big data given its volume, which is often measured in terabytes or even petabytes.
“We’re used to working with spreadsheets and columns,” LaFleur said. “But new data isn’t that clean.”
Still, new data is constantly available for analysis, LaFleur said, pointing to recently released Medicare information from the federal government.
LaFleur was joined on the panel by fellow data editor David Donald of the Center for Public Integrity and Jeremy Kroll, CEO and co-founder of K2 Intelligence, an investigative consulting firm.
Teasing narratives out of big data is the ultimate goal for journalists and data analysts alike, Kroll said. Among other areas, his firm investigates risk, reputation and cybersecurity, and K2 consultants then compile their findings into visual presentations for clients.
The three panelists agreed that big data is never perfectly trustworthy, especially when it is obtained from the government or extracted from the Internet.
Donald discussed the volume, variety and velocity of big data, which pose challenges for those seeking to analyze it. Like people, Donald said, data does not always tell the full story.
“Think like a reporter,” Donald said. “Data is just another source.”