We’ve been hearing the phrase “big data” being tossed around among companies, industries, and organizations for some time now – but what does it really mean?

For Jerrard Gaertner, president of the Canadian Information Processing Society, big data presents a lot of potential for businesses, the public sector, and all kinds of industries – but any work with big data needs to be done with data security in mind.

While he teaches courses on this topic at Ryerson University and the University of Toronto, he managed to distill a lot of the information into an hour-long mini lecture during Wednesday’s TASK meeting in Toronto. Gaertner gave a talk on what big data means, outlining not just its potential, but also the risks of adopting it without thinking of security first.

“In many cases … you’ve got an organization that’s got absolutely wonderful security and policies and procedures and segregation of duties, and everybody has a [Certified Information Systems Security Professional certification]. But big data is over here, and we’ve got our crown jewels in there, and a couple of dozen people have access to absolutely everything,” he said, addressing a room of security professionals during his talk. “I would just caution you that big data tends to be ignored or tends to be forgotten because it’s so new.”

So what is big data? Gaertner characterizes it as having at least three V’s:

- Volume

This means huge amounts of data – not just gigabytes or terabytes, but potentially petabytes or exabytes.

- Variety

Big data includes a variety of data, not just what is housed in Excel files or Word documents. It can include every file format out there, Gaertner said.

- Velocity

“Most big data installations – you can’t necessarily control how quickly the data comes in,” he said. For example, he mentioned how many companies have marketing departments that do sentiment analysis, meaning they analyze tweets on Twitter, posts on Facebook, or other areas of social media to figure out how a new product is performing in the marketplace and how people feel about it. However, given this is social media and Twitter users alone can create as many as 5,000 tweets a second, those seeking to harness big data can’t control how much data is coming in, nor how quickly, Gaertner said.

Given how many businesses and industries want to tap into big data and the insights it can bring, it’s not surprising people are eager to just upload their data and start using open source frameworks like Apache Hadoop.

Still, Gaertner told the audience of security professionals this is where security and risk management come in. He named a number of factors that need to go into a strong, effective implementation of big data, such as creating appropriate research facilities, using relevant data sources, ensuring the hardware used has the capacity to process the data, using the right software and analytics tools, training staff in proper procedures – the list goes on.

However, a large chunk of that list requires security professionals to lend a hand, and people can’t just be left alone to play with big data without safeguards and controls, he said.

“Does the [chief security officer] or privacy officer know you’ve dumped all the information you own into a bucket and you’re playing with it?” Gaertner said, adding one of the biggest risks with big data is putting all of an organization’s data in one place, or all of its eggs in one basket.

He added security professionals also need to ask about the “provenance” of the data, or where it came from. After all, there are business risks, ethical risks, and privacy risks to using data from just anywhere and not adequately protecting it.

And of course, one of the most important pieces of security in any organization is to ensure employees are well-trained and educated in understanding the risks, especially when it comes to big data. That’s even more important than relying upon the tools and layers of defense set up to protect an organization’s data.

“You’re all security professionals,” Gaertner said to the room. “You know – never rely on the technology. It’s people, people, people.”


  • Ulf Mattsson

    I agree that you should “never rely on the technology. It’s people, people, people,” but the landscape is rapidly changing, and this is good news.

    If the data is of no value to an attacker, then you have no problem. Then you do not need to rely on people and user errors (a major security issue).

    Organizations are now proactive and successfully using new approaches to address issues with security and privacy in Big Data environments.

    New use cases that require data insight for analytics, along with high performance and scalability on Big Data platforms, cannot be supported by old security approaches. New security approaches are required since Big Data is based on a new and different architecture.

    Big Data is introducing a new approach to collecting data by allowing unstructured data to be blindly collected. In many cases we do not even know about all the sensitive and regulated data fields contained in these large data feeds. Analysis of the content is often deferred to a later point in the process, when we start to use the data for analytics. By then it is too late to go back and try to apply data security and regulatory compliance.

    There is also a shortage of Big Data skills and an industry-wide shortage of data security personnel, so many organizations don’t even know they are doing anything wrong from a security and compliance perspective:

    1. I think a big data security crisis is likely to occur very soon and few organizations have the ability to deal with it.

    2. We have little knowledge about data loss or theft in big data environments.

    3. I imagine it is happening today but has not been disclosed to the public.

    The good news is that some organizations are proactive and successfully using new approaches to address issues with security and privacy in Big Data environments.

    Until recently, Big Data technology vendors often left it to customers to secure their own environments, as the vendors too felt the burden of limited options. Today, vendors such as Teradata, Hortonworks, and Cloudera have partnered with data security vendors to help fill the security gap.

    What they’re seeking is advanced functionality equal to the task of balancing security and regulatory compliance with data insights and “big answers”.

    Ulf Mattsson, CTO Protegrity