How Big Data Changes What You Know About Business Intelligence
Why isn’t big data about big databases?
IT decision makers benefiting from big data know it’s harder than processing large databases. They know it’s about business intelligence from multiple types and sources of data. It’s a challenge, and success requires your people, processes, and technology to adapt. This brief explains why big data is a corporate game changer.
Big data is changing industries. A new application made possible by cloud computing, big data is about patterns and trends, that is, forward-looking business intelligence. For example, a retailer can see declining sales from traditional monthly sales reports. However, why sales are declining and whether they will continue to decline is another matter. Big data seeks to correlate multiple sources and types of data to understand the “why and whether.” Big data is more complicated than scaling traditional data processing. It is challenging because of the:
- Sheer volume of real-time data being generated
- Velocity of how quickly the data flows into an organization
- Variety of formats this data takes, both structured and unstructured
The benefits are real, though. Organizations in several industries, including insurance, banking, medical/pharmaceutical, retail, and telecom, have seen dramatic improvements in competitiveness and profitability.
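To make the retailer example concrete, here is a minimal sketch of correlating two data sources. It assumes a hypothetical retailer with six months of sales figures and a second, non-traditional stream counting negative social media mentions. All numbers and names are invented for illustration, and a real analysis would involve far more data and sources.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from each mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical data, invented for this sketch only.
monthly_sales = [120, 115, 110, 100, 95, 90]       # traditional sales report
negative_mentions = [10, 14, 18, 25, 30, 34]       # non-traditional stream

r = pearson(monthly_sales, negative_mentions)
print(f"correlation between sales and negative mentions: {r:.2f}")
```

A strongly negative value would suggest the two series move in opposite directions. That is a lead worth investigating, not proof of cause, which is why the business question has to drive the choice of data sources.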
What You Need to Know:
Data management professionals use these “3 Vs” to describe big data’s challenges: volume, velocity, and variety.
- Volume refers to the amount of data, both structured and unstructured, generated in a connected society. Consider sensors in home appliances, smartphone apps, vending machines and kiosks, social media, and click-stream analysis of websites. The amount of unstructured data generated far exceeds structured transactional data such as that generated by product orders.
- Velocity is the pace at which transactions occur as well as the pace of decision making based on analysis. Velocity includes the frequency of traditional, structured transactions. More importantly, it also includes very valuable, non-traditional data streams. While these real-time data streams don’t fit into traditional database formats or structures, they provide a powerful understanding of complex commercial, industrial, and social systems. Velocity also affects the timeframes for processing, storing, and then sharing or using the data. Big data requires increased agility in using the knowledge gleaned.
- Variety of data is a function of the number of data generators and the immense variety of data types and formats they produce. Generators include any device with an Internet connection or the ability to capture and store operational activities from every conceivable human-human, human-machine, and machine-machine communication method. The variety of data they produce, including map/GPS coordinates, social media streams and timelines, images, telemetry, RFID, text, and speech, often poses challenges for traditional relational databases.
What You Need to Do:
The value of big data comes from trying to solve a specific business problem using business intelligence processing. This processing is so intensive that it often requires hundreds or thousands of dedicated virtual machines and massive storage. It can require radical re-engineering of applications and systems. If you want the business agility, innovation, and revenue growth cloud-driven big data can deliver, you most likely need significant changes in your people, process, and technology. Here is what you need to do:
- Explore the many types of data available to you. Key to big data is integrating multiple sources and types of data. Be careful: just because a data stream is available or affordable does not mean it will help solve your business intelligence questions. Understand what big data is and how and why it could work for your firm. This isn’t a technology conversation. It’s a business intelligence problem-solving conversation.
- Assume that your database and business intelligence teams may not know the best sources of data. Your existing databases may or may not be a goldmine. Be careful not to fall into the trap of just processing more existing transactional (in-house) data faster. While that might be useful, the highest ROI from big data comes from integrating non-traditional data sets and streams. Big data brings new roles around identifying and understanding the relationships and patterns between data sets. For example, the data scientist is an emerging role. New skills needed may include identifying opportunities by using statistics, algorithms, mining, and visualization.
- Consider that big data can change the structure or culture of your analytics or business intelligence teams. Big data is new, which makes it challenging. Failures are a given and success will likely require multiple efforts. Be sure to support a culture of innovation. [See IT Decision Brief “How to Confidently Decide to Adopt Cloud Computing” for details on adopting new innovations.]
- Evaluate your infrastructure and security abilities and options. All approaches to big data analysis, including Apache Hadoop and Google MapReduce, require significant technical resources. Most traditional IT infrastructures (compute, storage, networking, and software) will struggle to handle the integration and processing required. Public cloud services are one option. Private cloud is another option, but it too will likely require significant investments. When using external data sources, security also becomes a prime concern.
- Begin by creating a cross-functional business and IT team. Have business leaders describe problems they’d like to solve. Understand how you’ll integrate big data into your existing business, IT, and governance frameworks. Task DBAs with understanding the limited role of SQL in big data. Ask infrastructure team members to understand the interfaces and capacities required. Have the software team members look into writing applications to analyze data.
- To get going, you must understand what you want to achieve before you invest in technology. Develop business Key Performance Indicators (KPIs) to show success. Consider how you’ll scale, reuse, and repurpose your efforts. Only near the end of this process should you consider how you’ll solve your business problem with Hadoop clusters, MapReduce, cloud services, etc.
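For team members unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch of its three stages: map, shuffle, and reduce. It is not Hadoop's API. The function names and the sample records are hypothetical, and a real cluster would distribute each stage across many machines.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit a (key, 1) pair for each record, keyed by its data source."""
    for record in records:
        yield record["source"], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical records from a variety of sources, invented for this sketch.
records = [
    {"source": "social", "payload": "post about a product"},
    {"source": "gps", "payload": "store visit coordinates"},
    {"source": "social", "payload": "reply to a promotion"},
    {"source": "rfid", "payload": "pallet scanned at dock"},
    {"source": "gps", "payload": "delivery route point"},
    {"source": "social", "payload": "customer complaint"},
]

counts = reduce_phase(shuffle(map_phase(records)))
print(counts)
```

The value of the pattern is that the map and reduce steps are independent per record and per key, which is what allows the work to be spread over hundreds or thousands of machines.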