Big data is changing industries. It is a new class of application made possible by cloud computing, and it is about patterns and trends: forward-looking business intelligence. For example, a retailer can see declining sales in traditional monthly sales reports. Why sales are declining, and whether they will continue to decline, is another matter. Big data seeks to correlate multiple sources and types of data to understand the “why and if.”
Big data is more complicated than scaling up traditional data processing. It is hard because of 1) the sheer volume of real-time data being generated, 2) the velocity at which the data flows into an organization, and 3) the variety of formats this data takes, both structured and unstructured. The benefits are real, though. Insurance, banking, medical/pharmaceutical, retail, telecoms, and many other industries see dramatic improvements in competitiveness, security, and profitability. Big data is new, and that makes it hard.
What You Need to Know
Big data uses a “3 V’s” model to describe the challenges: volume, velocity and variety.
- Volume refers to the amount of data, structured and unstructured, generated in a connected society. Consider sensors in home appliances, smartphone apps, vending machines and kiosks, social media, click-stream analysis of web sites, etc. A key point in big data is that the amount of unstructured data generated far exceeds structured transactional data like product orders.
- Velocity is the pace at which transactions occur, as well as the pace of decision-making based on analysis. Velocity includes the frequency of traditional structured transactions. More importantly, it also includes very valuable, non-traditional data streams. These real-time data streams don’t fit into traditional database formats or structures, but they provide a powerful understanding of complex commercial, industrial, and social systems. Velocity also affects the timeframes for processing, storing, and then sharing or using the data. Big data requires increased agility in using the knowledge gleaned.
- The variety of data generators includes any device with an Internet connection or the ability to capture and store operational activities. The variety of data types and formats includes every conceivable human-human, human-machine, and machine-machine communication method, for example, map/GPS coordinates, images, telemetry, RFID, text, speech, etc. The variety of different data types often poses challenges for traditional relational databases.
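To make the variety point concrete, here is a minimal sketch of wrapping three differently shaped inputs in one common record so they can be analyzed together. Everything in it is an assumption for illustration: the `Event` class, the source names, and the sample values are hypothetical and do not come from any particular product.

```python
import json
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical common envelope for data arriving in different formats."""
    source: str
    kind: str      # "structured", "semi-structured", or "unstructured"
    payload: dict


def from_order_row(row: tuple) -> Event:
    # A structured transactional row, e.g. from a relational orders table.
    order_id, sku, qty = row
    return Event("orders_db", "structured",
                 {"order_id": order_id, "sku": sku, "qty": qty})


def from_telemetry(raw: str) -> Event:
    # Semi-structured JSON, e.g. a reading from a vending machine sensor.
    return Event("vending_sensor", "semi-structured", json.loads(raw))


def from_text(raw: str) -> Event:
    # Unstructured free text, e.g. a social media post.
    return Event("social_media", "unstructured", {"text": raw})


# Made-up sample inputs, one per format.
events = [
    from_order_row((1001, "SKU-42", 2)),
    from_telemetry('{"machine": "vm-7", "temp_c": 4.5}'),
    from_text("The kiosk downtown was out of stock again"),
]

kinds = [event.kind for event in events]
for event in events:
    print(event.source, event.kind, event.payload)
```

Only the first input maps cleanly onto fixed relational columns. The other two need a looser structure, which is the challenge the variety bullet describes.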
What You Need to Do
The value of big data comes from trying to solve a specific business problem using business intelligence processing. This processing is so intensive that it often requires hundreds or thousands of dedicated virtual machines and massive storage. It can require radical re-engineering of applications and systems. If you want the business agility, innovation, and revenue growth that cloud-driven big data can deliver, you most likely need significant changes in your people, process, and technology. Here is what you need to do:
Explore the many types of data available to you. The key to big data is integrating multiple sources and types of data. Be careful: just because a data stream is available or affordable does not mean it will help answer your business intelligence questions. Understand what big data is and how and why it could work for your firm. This isn’t a technology conversation. It’s a conversation about solving business intelligence problems.
Assume that your database and business intelligence teams may not know the best sources of data. Your existing databases may or may not be a goldmine. Be careful not to fall into the trap of just processing more existing transactional (in-house) data faster. While that might be useful, the highest ROI from big data comes from integrating non-traditional data sets and streams. Big data brings new roles around identifying and understanding the relationships and patterns between data sets. The “data scientist” is an emerging role. New skills may include identifying opportunities through the use of statistics, algorithms, mining, and visualization.
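As a rough illustration of integrating two sources, the sketch below joins in-house monthly sales with an external stream and measures how strongly they move together. All figures are made up, and the choice of negative social media mentions as the external stream is a hypothetical example. A strong correlation is only a hint about the “why,” not proof of cause.

```python
from math import sqrt

# Hypothetical in-house data: units sold per month (made-up figures).
monthly_sales = {"2024-01": 120, "2024-02": 110, "2024-03": 95, "2024-04": 90}

# Hypothetical external stream: negative social media mentions per month.
negative_mentions = {"2024-01": 10, "2024-02": 14, "2024-03": 22,
                     "2024-04": 25, "2024-05": 30}


def join_on_month(left, right):
    """Keep only the months present in both sources, in sorted order."""
    months = sorted(set(left) & set(right))
    return [(month, left[month], right[month]) for month in months]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


joined = join_on_month(monthly_sales, negative_mentions)
sales = [s for _, s, _ in joined]
mentions = [m for _, _, m in joined]
r = pearson(sales, mentions)

print(f"months joined: {len(joined)}, correlation: {r:.2f}")
```

A value of `r` close to -1 would suggest that this stream is worth investigating further. A value near 0 would suggest that the stream, however cheap or available, does not help with this particular question.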
Consider that big data can change the structure or culture of your analytics or business intelligence teams. Big data is new and that makes it hard. Failures are a given and success will require multiple efforts. You need to support a culture of innovation.
Evaluate your infrastructure and security abilities and options. All approaches to big data analysis, including Apache Hadoop and Google MapReduce, require significant technical resources. Most traditional IT infrastructures (compute, storage, networking and software) will struggle to handle the integration and processing required. Public cloud services are one option. Private cloud is another option, but will likely require significant investments. When using external data sources, security also becomes a prime concern.
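For readers who have not met the MapReduce pattern, here is a minimal in-memory sketch of its three stages (map, shuffle, reduce) applied to made-up click-stream records. It is plain Python, not the Hadoop API, and the function names and data are assumptions chosen for illustration. Real frameworks distribute these same stages across many machines, which is where the infrastructure demands come from.

```python
from collections import defaultdict

# Hypothetical click-stream records: (user, page visited).
clicks = [
    ("u1", "/home"), ("u2", "/home"), ("u1", "/cart"),
    ("u3", "/home"), ("u2", "/cart"), ("u3", "/checkout"),
]


def map_phase(record):
    """Emit one (key, value) pair per input record."""
    _user, page = record
    yield (page, 1)


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the values for one key into a single result."""
    return key, sum(values)


mapped = [pair for record in clicks for pair in map_phase(record)]
grouped = shuffle(mapped)
page_views = dict(reduce_phase(key, values) for key, values in grouped.items())

print(page_views)
```

Because each record is mapped independently and each key is reduced independently, both stages can run in parallel. That is the property that lets this style of processing scale out to large clusters.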
Begin by creating a cross-functional business and IT team. Have business leaders describe problems they’d like to solve. Understand how you’ll integrate big data into your existing business, IT, and governance frameworks. Task DBAs with understanding the limited role of SQL in big data. Ask infrastructure team members to understand the interfaces and capacities required. Have the software team members look into writing applications to analyze data. To get going, you must understand what you want to achieve before you invest in technology. Develop business key performance indicators (KPIs) to show success. Consider how you’ll scale, reuse, and repurpose your efforts. Only towards the end should you consider how you’ll solve your business problem with Hadoop clusters, MapReduce, cloud services, etc.
How Big Data Challenges IT Storage Managers