Critical Hadoop data sparks risk concerns

Hadoop apps are key to many organisations' cost-effective Big Data plans yet may introduce risks of their own, notes Stefanie Hoffman

Big Data trends aren't going away any time soon. Neither are compliance regulations. So it should come as little surprise that more organisations are looking to track and locate the most sensitive information stored in their Hadoop environments.

Data security and protection firm Dataguise has brought that to light in a survey indicating that customers increasingly want to be apprised of Hadoop information that could put their organisation at risk.

For the channel, that means a new chance to dive deeply into data classification and pre-audit services aimed at finding, categorising and prioritising some of the most critical information housed in Big Data environments.

In the survey, 80 per cent of respondents said it was important to know whether sensitive information was stored in their Hadoop environments, and 77 per cent said it was important to protect access to that data.

Their fears are not unfounded, largely because Hadoop, an open source framework that supports Big Data applications, is increasingly becoming the de facto repository to store any and all corporate information.

So much so that I believe about a third of organisations store sensitive data on Hadoop, including social security numbers, credit card numbers and addresses.

And Hadoop will probably get even more popular as a means of storing information. Forty-three per cent of respondents to the survey were testing the platform while 31 per cent had active production environments.

Those Hadoop environments consisted primarily of log files (55 per cent), along with structured DBMS data (36 per cent) and mixed data (24 per cent).

All told, Hadoop may introduce a slew of challenges for organisations - not the least of which are a lack of skills (35 per cent), Hadoop usability (23 per cent) and security management issues (21 per cent).

But those challenges, in turn, leave ample room for the channel to expand its Big Data footprint and build related services specifically around complex Hadoop environments.

Manmeet Singh, chief executive of vendor Dataguise, said: "Organisations require a straightforward and economical way to determine where sensitive data is and how to effectively secure their Hadoop environments.

"The data here shows that data privacy protection is important to Hadoop users and that they are actively engaging security personnel to find ways to detect and protect sensitive data to meet compliance requirements. Using solutions such as DG for Hadoop by Dataguise allows for proactive actions to be taken while alleviating the complexity and cost of data privacy protections."

More than anything, Big Data gives the channel a foundation on which to offer new combinations of products and a more diverse array of differentiated services. This is also an opportunity that organisations themselves are starting to recognise.

Of late, HP gave its Information Optimization portfolio a Big Data makeover, merging its Converged Infrastructure offerings with Autonomy and Vertica technologies - a move that enabled its AppSystem appliance to support Hadoop.

Last year, Symantec embarked on a joint effort with Hadoop support and services firm Hortonworks to produce an add-on for the security firm's Cluster File System, dubbed Symantec Enterprise Solution for Hadoop.

The rising interest in Hadoop environments is underscored by statistics from Gartner. Among other things, the research firm asserts that by 2015, 65 per cent of prepackaged analytic applications will have Hadoop already embedded, and predicts that business intelligence and analytics will need to scale in order to meet customers' rising Big Data demands.

Meanwhile, in light of meatier compliance regulations, some of the biggest demands undoubtedly will be around detection and protection of sensitive data.

To that end, the channel will have new opportunities to centrally manage necessary detection and protection, and govern compliance enforcement while easing adherence to more punitive regulatory requirements.

Additionally, partners will be able to revive opportunities to locate and identify data across Hadoop environments, as well as to go deeper with services around information increasingly put at risk in Big Data environments.

Dataguise is the latest hoping to get in on the Big Data ground floor with data protection. The firm faces stiff and relentless competition from heavyweight rivals - but most organisations have yet to wrap their heads around the entirety of their Big Data challenges.

That is where smaller firms such as Dataguise hope to carve out their own niche.

Stefanie Hoffman is West Coast editor and senior associate at Channelnomics

As part of our special editorial partnership, CRN is republishing this article from Channelnomics