The elephant in the room

IDC's big data, analytics and social media forum saw Hadoop's elephant logo as central to the discussion. Fleur Doidge reports

Hadoop has been looming on the channel landscape for a while now. However, if the latest IDC conference on big data and social media is any guide, it will soon be a mighty presence in many discussions about taking advantage of big data -- especially as customers aim to harness the endless parade of social media data to gain competitive advantage.

The elephant logo of the open-source, Java-based distributed storage and processing framework from the Apache Software Foundation is being increasingly cited to hose down fears about the massive volume, velocity and variety of data that businesses must now take into account. While customer adoption cannot yet be said to resemble a stampede, a growing number of successful big data implementations have Hadoop as a central and crucial building block.

Alys Woodward, research programme manager for European business analytics and social platforms at IDC, said social media is already transforming organisations and industries. However, the long-trumpeted potential for business intelligence and business analytics has never been realised.

"Big data is an opportunity and a challenge," she said. "In 2020, the ICT industry will reach $5tn (£3.1tn), and 80 per cent of that will be driven by third-platform technologies and an explosion of new solutions built on the new platforms."

That is according to IDC's own research, which defines "third platform" essentially as technologies related to mobile platforms and apps that generate large volumes of - often fragmentary - data in various formats at speed. The transformation is happening even faster in emerging markets, Woodward noted. Organisations of all types, in many locations, have been developing their social media strategy for years now, and they expect to profit from it - a task easier said than done.

"And in 2012, the battle for the ICT marketplace of 2020 will start to be won and lost. As we are all aware, the process of change [in the industry] has been phenomenal," she said.

Companies such as Google have quietly become "very significant" in new areas, such as mobile payments technology. Other companies, for example Facebook, have provided the kinds of user-friendly, globally connected and collaborative apps BI vendors have been seeking for 20 years, she said.

Meanwhile, in just a year, there has been a sea change in the attitude of organisations to social media, with more targeting the benefits and fewer focusing on the risks.

And once "the big guys" get into a market, Woodward said, the landscape changes quickly. As such, technology providers must rapidly become comfortable with their new environment and team up with the right vendors and offerings to secure ongoing growth.

"Big data is a name for something we have already been doing, but it is also a name for some genuinely brand-new technologies that are hugely transformative, such as Hadoop," Woodward said.

Davy Nys, EMEA and APAC vice president at business analytics software vendor Pentaho, agreed. He described a typical big data project as involving attempts to collect, process, interpret and divine value from several diverse sources. Generally, data is generated rapidly on the web, via a range of platforms from Twitter to Facebook, news media websites, the blogosphere and so on, and the organisation wants to integrate that with the rest of its business intelligence from in-house applications and the like. It is a continual, moveable feast of some proportion.

"All the business transactions - buying something, signing up to a group or a membership - are typically put through a transactional processing system, your bread-and-butter-type system. But there is a lot of activity generating a lot of other data," Nys said.

Customers will increasingly be approaching their IT suppliers for help taking advantage of this other data, such as weblog data, clickstream data, online comments and so on, recording it and storing it in ways that enable them to extract value in the short or long term. NoSQL databases such as HBase, MongoDB and Cassandra can help here.

"And when the volume gets very challenging, you might want to push it into a Hadoop framework. Hadoop is a distributed processing and storage framework on commodity hardware. You can have it across 50, 100, or 200 nodes on very simple servers; you do not need a mainframe. And it allows you to extract certain data so it is ready, potentially, for consumption by business users," he said. "One click can generate 100 records in a web log file."
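The processing model Nys describes can be illustrated with a toy, single-machine sketch in Python. It is not Hadoop code and does not use Hadoop's API; the log format, the sample records and the function names are all assumptions made for illustration. It mimics the map, group and reduce steps that a framework such as Hadoop would spread across many nodes, here counting how many web log records each user generates.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical web log lines in the form "<timestamp> <user> <url>".
# The format and the records are invented for this sketch.
LOG_LINES = [
    "2012-03-01T10:00:00 alice /home",
    "2012-03-01T10:00:01 alice /home/banner.png",
    "2012-03-01T10:00:02 bob /products",
    "2012-03-01T10:00:03 alice /products",
]


def map_line(line):
    """Map step: emit one (user, 1) pair per log record."""
    _timestamp, user, _url = line.split()
    return [(user, 1)]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_records_per_user(lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = chain.from_iterable(map_line(line) for line in lines)
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    print(count_records_per_user(LOG_LINES))
```

In a real cluster the map and reduce steps would run in parallel on separate nodes, each handling a slice of the log files; the sketch only shows the shape of the computation.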

For users to further query data that has been collected, cleaned, manipulated and stored, an analytical database such as Infobright can be added on, enabling users to do the full reporting through MapReduce apps from vendors such as Pentaho.

"We are trying to link up systems that were not really designed to do these things," Nys explained. "And now we are trying to challenge that. So there are a lot of blocks that stop you doing things, and a lot of technologies that you use to talk with your Hadoop framework."

Social media benefits

Previously, analysing all the data and different information flows was possible, but an extremely long-winded and involved process - no good for the organisation of today that needs to crunch the numbers quickly enough to act on them. Waiting six to 12 months to know the effect of, say, a network outage on consumers is no longer good enough -- if it ever really was.

Today, organisations require business intelligence in near-real time, Nys agreed.

He warned that a lot of implementations concerning big data using Hadoop and related applications may become quite technical. Standard business apps such as Oracle and Sage may not interface easily with the new infrastructure. People are having to go back to using command-line coding to realise their deployment. Suppliers as well as customers may need to consider this before leaping headfirst into a project, he hinted.

"It's like going back 20 years," Nys said. "Then, you can visually orchestrate all that; you can visually orchestrate the job work flows."

Ofer Guetta, social collaboration head at IBM UK, agreed. His presentation examined how the right collaborative and social media engagement - properly linked, and with each informing the other - can boost and optimise almost any workplace.

That is good news for the channel, which is looking for ways to increase its value to a diverse customer base.

"We believe social business fits into two categories. First, how do I move from 'liking' to 'leading'? And second, how do you create a smarter interface and engage their workforce around these tools?" Guetta said.

The first step is to help customers align their organisational goals and culture, and gain social trust, which is something every organisation needs to develop in order to succeed in any market, whether it is a tech provider, any kind of partner business, or end user, Guetta affirmed.

"It is useful for customer service, product development, marketing and sales. It can help bring new products to market more quickly, and it can increase exposure to the market as well," he added.

Banking group HSBC offers one practical example of how connecting socially in near-real time can help businesses in relation to the big data trend.

John Hartley, senior communications strategy and campaigns manager at HSBC, said its widespread ATM and web outage on 4 November 2011 sparked a Twitter storm as consumers discovered they could not withdraw funds or otherwise interact with the bank in expected ways. This was picked up by local news media, which called HSBC before HSBC had even heard from its own staff about the outage.

The communications and technical teams then swung into action, but it was too late to avoid mass awareness of the failure, even though all systems were up and working again within a few hours. This created significant reputational risk, Hartley confirmed.

Randy Silver, consultant for IT operations and client services at HSBC, said a global team now monitors and interacts with social media much more extensively, working with data in a range of languages, perhaps with photos or in different alphabets, on multiple screens and in various locations, and responds much more quickly to issues raised by end users. The big data project also helps HSBC obtain better visibility of its own infrastructure as a whole.

"We are a big company, with a big bureaucracy and highly regulated. Information security, fraud, risk, communications, marketing, IT and compliance can now work together," Silver said.