What does Big Data mean for your storage strategy?

Because Big Data brings a unique set of challenges – not least the sheer volume and nature of the data to be processed – it overturns traditional precepts of data centre and network architecture.

Digital data will grow at over 40% per year until 2020, according to IDC’s 2014 Digital Universe Study [1]. The same report estimates that computerised devices with added software or intelligence – from cars and turbines to aircraft and dog collars – accounted for 10% of this ‘digital universe’ in 2014. With some 11 billion devices connected to the internet in 2014, this number is forecast to grow by around 150% to 28 billion connected devices on the Internet of Things (IoT) by 2020.
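
To put those growth figures in perspective, the arithmetic can be checked in a few lines of Python. This is only a back-of-envelope sketch: it treats "over 40% per year" as exactly 40% compounded annually from 2014 to 2020, which is our simplifying assumption rather than IDC's own model, and the function names are purely illustrative.

```python
# Back-of-envelope check of the growth figures quoted above.
# Assumption: "over 40% per year" is treated as exactly 40%, compounded annually.

def compound_growth_factor(annual_rate: float, years: int) -> float:
    """Return the multiple by which a quantity grows after `years` of compounding."""
    return (1 + annual_rate) ** years

def percent_increase(start: float, end: float) -> float:
    """Return the percentage increase from `start` to `end`."""
    return (end - start) / start * 100

data_multiple = compound_growth_factor(0.40, 2020 - 2014)
device_growth = percent_increase(11e9, 28e9)

print(f"Data volume multiple, 2014 to 2020: {data_multiple:.1f}x")
print(f"Connected device growth: {device_growth:.0f}%")
```

At a constant 40%, the digital universe would be roughly 7.5 times its 2014 size by 2020, and the move from 11 billion to 28 billion devices works out at about 155%, in line with the "around 150%" quoted.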

This rate of growth has been accelerating for some time – but the difference now is that smart executives have realised that increased digitisation is generating massive new sources of rich customer data, which present real opportunities to create competitive advantage.

What is Big Data?

Big Data applications typically involve the Four Vs: volume, velocity, variety and veracity. Some examples include:

  • Wide-ranging operational data from ERP, financial and other enterprise systems is being presented in near-real-time through Business Intelligence, Business Analytics and dashboards, transforming the way managers analyse organisational performance, forecast and make informed business decisions.
  • In mining, equipment management has become key to operational efficiency; for instance, details such as tyre pressure, scheduled repairs, faults and driver information (routes on site, speed, proper use) are driving real profitability benefits. Oil and gas pipelines, both undersea and underground, are now equipped with thousands of sensors that detect flow rates, pressure and much more, then deliver alerts.
  • Financial organisations analyse point-of-sale, geo-location, authorisation and transaction data to identify patterns – enabling them to flag potential fraud within minutes.
  • For utilities, manufacturers and construction companies, connected ‘wearables’ such as work vests are also beginning to monitor environments for the health and safety of field workers.
  • The transportation sector is increasingly gathering real-time operational data to manage everything from the location of vehicles and goods (to meet operational requirements) to traffic flows on major roads and through toll gates and tunnels.

Naturally, effective platforms to manage and report on all of this data are essential to any Big Data application – but that’s a whole other story. Here we’re examining the specific implications of Big Data for your data storage strategy, which covers both the size and the location of your storage.

The Challenges of Big Data

Because Big Data brings a unique set of challenges – not least the sheer volume and nature of the data to be processed – it overturns traditional precepts of data centre and network architecture. Most corporate ICT environments are currently ill-prepared for the inherent challenges Big Data brings. Here are just some of the dilemmas our customers are wrestling with:

DAS/NAS versus SAN
Big Data computing architectures call for compute resources to be placed near the individual data sets, which means pairing compute with localised Direct-Attached Storage (DAS) or Network-Attached Storage (NAS). These architectures differ from the centralised Storage Area Networks (SANs) typically used for transactional applications. In networking terms, this shifts traffic from the traditional client-server pattern to server-to-server flows across the interconnected data centre network fabric.

Physical versus virtual servers
Bare-metal servers are proving to deliver superior and more consistent performance than similarly configured virtual servers. High-density servers now make it possible to run terabytes of fast direct-attached storage (DAS) close to compute-intensive processing power. In this new ‘shared nothing’ architecture, work is distributed across independent, stateless nodes, and capacity scales simply by adding another node to meet the challenges of Big Data applications.
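
As a rough illustration of the ‘shared nothing’ idea, the sketch below assigns each record to exactly one node by hashing its key, so that every node works only on the data it owns. The node names, key format and hash-modulo placement are our own assumptions for illustration; they are not taken from any particular Big Data platform.

```python
# Illustrative sketch of shared-nothing partitioning (all names are hypothetical).
# Each node owns its own slice of the data; no storage is shared between nodes.
import hashlib
from collections import defaultdict

def owner(key: str, nodes: list[str]) -> str:
    """Pick the node that owns `key` using a stable hash modulo the node count."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Assign every key to exactly one node."""
    placement = defaultdict(list)
    for key in keys:
        placement[owner(key, nodes)].append(key)
    return dict(placement)

keys = [f"sensor-{i}" for i in range(1000)]
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

for label, placement in (("3 nodes", three_nodes), ("4 nodes", four_nodes)):
    sizes = {node: len(owned) for node, owned in sorted(placement.items())}
    print(label, sizes)
```

Adding a fourth node spreads the same keys over more hardware. Note that plain hash-modulo placement moves most keys whenever the node count changes; production systems generally use schemes such as consistent hashing to limit that reshuffling.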

Direct connect versus WAN
Distributing large data loads across multiple devices is not well supported by most enterprise infrastructure platform designs. The first question is always: “Do we have enough bandwidth?” More than 40% of 540 Big Data decision-makers in a 2014 survey [2] said that increasing network bandwidth is a top priority in preparing their infrastructure. Increased horizontal (server-to-server) traffic demands multiple 10 Gbps interconnects rather than the 1–10 Gbps links of typical corporate WANs.
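
A quick way to approach the bandwidth question is to estimate how long a given data set would take to move over a given link. The sketch below is a simplified calculation, not a network model: the 10 TB data set, the 70% usable-throughput figure and the function name are all hypothetical values chosen for illustration.

```python
# Rough transfer-time estimate for moving a data set across a network link.
# Assumptions: decimal units (1 TB = 10**12 bytes), a hypothetical 10 TB data set,
# and 70% usable throughput after protocol overhead and contention.

def transfer_hours(dataset_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Return the hours needed to move `dataset_tb` terabytes over one link."""
    bits = dataset_tb * 10**12 * 8
    usable_bits_per_second = link_gbps * 10**9 * efficiency
    return bits / usable_bits_per_second / 3600

for link_gbps in (1, 10, 40):  # 40 stands in for four aggregated 10 Gbps interconnects
    print(f"{link_gbps:>2} Gbps: {transfer_hours(10, link_gbps):.1f} hours")
```

Under these assumptions, 10 TB takes roughly 32 hours over a 1 Gbps link but a little over 3 hours at 10 Gbps, which is why the interconnect speed matters so much once data has to move between nodes.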

If you need to analyse external data where it often resides, in cloud data centres, your WAN must be equal to the rapid data transfer this requires. Even when your link bandwidth and latency are up to the mark, public cloud providers that charge for data transfer can become cost-prohibitive.

More and more, enterprises are seeing the benefits of moving their storage to high-density environments in a Tier III data centre – to reduce infrastructure sprawl and deliver the security, scalability, high availability and high-speed network access that their real-time Big Data applications demand.

What’s Your Best Data Storage Strategy?

So how should your storage policies and infrastructure be adapted to make them ready for Big Data – let alone your network and compute power? It all comes down to your business case and the value that Big Data can provide your business.

Here are three important questions you need to ask to determine your storage requirements:

  1. What data does your organisation currently collect and how much storage do you need to plan for?
  2. Do you currently have effective policies to analyse, retain, maintain and discard that data? That is, what data cleansing methodologies do you need to meet the veracity requirement, and what information governance do you need to put in place?
  3. When do you require your data to be available, from data capture to action, and how will your network need to change to get it there when required?
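
The first two questions are linked: how much storage you need depends heavily on how long you keep the data. As a minimal sketch, assuming a constant daily ingest and a fixed retention period after which data is discarded, the steady-state requirement can be estimated as below. The ingest rate, retention periods and replica count are hypothetical placeholders, not recommendations.

```python
# Steady-state storage estimate (all input figures are hypothetical placeholders).
# Assumptions: constant daily ingest, data discarded once it passes the retention
# period, and each replica storing a full copy (decimal units, 1 TB = 1000 GB).

def steady_state_tb(daily_ingest_gb: float, retention_days: int, replicas: int = 1) -> float:
    """Return the storage, in TB, needed to hold the full retention window."""
    return daily_ingest_gb * retention_days * replicas / 1000

for retention_days in (90, 365):
    required = steady_state_tb(daily_ingest_gb=500, retention_days=retention_days, replicas=3)
    print(f"{retention_days} days retained: {required:,.1f} TB")
```

With these example inputs, extending retention from 90 days to a year raises the requirement from 135 TB to 547.5 TB, which shows how directly a retention policy drives the size of the storage you need to plan for.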

Answering these questions can be complex, and success in pioneering any Big Data initiative depends on the right advice, so seek out proven experience and expertise to help you answer them.

[1] The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things, IDC April 2014

[2] QuinStreet Enterprise Research: 2014 Big Data Outlook, May 2014