We’ve been hearing the phrase “big data” being tossed around among companies, industries, and organizations for some time now – but what does it really mean?
For Jerrard Gaertner, president of the Canadian Information Processing Society, big data presents a lot of potential for businesses, the public sector, and all kinds of industries – but any work with big data needs to be done with data security in mind.
While he teaches courses on this topic at Ryerson University and the University of Toronto, he managed to distill a lot of the information into an hour-long mini lecture during Wednesday’s TASK meeting in Toronto. Gaertner gave a talk on what big data means, outlining not just its potential, but also the risks of adopting it without thinking of security first.
“In many cases … you’ve got an organization that’s got absolutely wonderful security and policies and procedures and segregation of duties, and everybody has a [Certified Information Systems Security Professional certification]. But big data is over here, and we’ve got our crown jewels in there, and a couple of dozen people have access to absolutely everything,” he said, addressing a room of security professionals during his talk. “I would just caution you that big data tends to be ignored or tends to be forgotten because it’s so new.”
So what is big data? Gaertner characterizes it as having at least three V’s:
Volume: This is huge amounts of data, not just gigabytes or terabytes, but potentially petabytes or exabytes.
Variety: Big data includes a variety of data, which isn’t just housed within Excel files or Word documents. It can include every file format out there, Gaertner said.
Velocity: “Most big data installations – you can’t necessarily control how quickly the data comes in,” he said. For example, he noted that many companies’ marketing departments do sentiment analysis, meaning they analyze tweets on Twitter, posts on Facebook, or other areas of social media to figure out how a new product is performing in the marketplace and how people feel about it. However, because Twitter users alone can create as many as 5,000 tweets a second, those seeking to harness big data can’t control how much data is coming in, nor how quickly, Gaertner said.
Given how many businesses and industries want to tap into big data and the insights it can bring, it’s not surprising people are eager to just upload their data and start using open source frameworks like Apache Hadoop.
Still, Gaertner told the audience of security professionals this is where security and risk management come in. He named a number of factors that need to go into a strong, effective implementation of big data, such as creating appropriate research facilities, using relevant data sources, ensuring the hardware used has the capacity to process the data, using the right software and analytics tools, training staff in proper procedures – the list goes on.
However, a large chunk of that list requires security professionals to lend a hand, and people can’t just be left alone to play with big data without safeguards and controls, he said.
“Does the [chief security officer] or privacy officer know you’ve dumped all the information you own into a bucket and you’re playing with it?” Gaertner said, adding one of the biggest risks with big data is putting all of an organization’s data in one place, or all of its eggs in one basket.
He added that security professionals also need to ask about the “provenance” of the data, or where it came from. After all, there are business risks, ethical risks, and privacy risks to using data from just anywhere and not adequately protecting it.
And of course, one of the most important pieces of security in any organization is to ensure employees are well-trained and educated in understanding the risks, especially when it comes to big data. That’s even more important than relying upon the tools and layers of defense set up to protect an organization’s data.
“You’re all security professionals,” Gaertner said to the room. “You know – never rely on the technology. It’s people, people, people.”