Q&A: Emerging Analytics: Integrating Big Data, Content Analytics, Search, and Collaboration into Your Analytical Environment
What’s driving big data analytics, and how can integrating search help BI professionals better work with the data?
To learn more about big data and technologies that work with it, we turned to Mike Ferguson, managing director of Intelligent Business Strategies Ltd. Mr. Ferguson will be speaking about this topic at greater length at the TDWI World Conference in San Diego (July 29 through August 3, 2012).
BI This Week: What types of data are you seeing used in big data analytics and for what business purpose?
Mike Ferguson: Popular data sources for big data analytical applications are definitely Web data, sensor data, and unstructured content. Web data consists primarily of Web logs, where data is being analyzed to optimize site navigation and understand navigation behavior of customers. Social network data is also popular.
Sensor data is also a rapidly emerging area. Real-time analytics can be performed on data streams being created by sensor networks to drive immediate actions if problems or opportunities arise. Also, variance data can be stored for subsequent historical analysis to see if repeating patterns emerge that indicate, for example, recurring problems. Example application areas for sensor data are supply/distribution chain optimization, asset management, smart metering, fraud, and grid health monitoring -- even health care, where sensors can monitor a patient’s vital signs to alert medical staff when immediate action is needed.
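As a rough illustration of the vital-signs example, the sketch below scans a stream of readings and flags any value outside a safe range, keeping the flagged readings for later historical analysis. The function name `monitor`, the thresholds, and the heart-rate values are all hypothetical and chosen only for illustration; a production system would run on a stream-processing platform rather than an in-memory list.

```python
from collections import deque
from statistics import mean

def monitor(readings, low, high, window=5):
    """Yield (index, value, rolling_mean) for each reading outside [low, high]."""
    recent = deque(maxlen=window)  # the most recent `window` readings
    for i, value in enumerate(readings):
        recent.append(value)
        if value < low or value > high:
            yield i, value, mean(recent)

# Hypothetical heart-rate stream in beats per minute (illustrative values only).
heart_rate = [72, 75, 74, 131, 76, 38, 73]

# Keeping the out-of-range readings allows later analysis of recurring patterns.
alerts = list(monitor(heart_rate, low=40, high=120))
for index, value, avg in alerts:
    print(f"reading {index}: {value} bpm is outside the safe range (recent mean {avg:.1f})")
```

Because `monitor` is a generator, each alert is produced as soon as the offending reading arrives, which is the property that matters for driving immediate action.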
What kinds of applications are driving the demand to analyze content?
Content analytics is the process of analyzing semi-structured and unstructured content from one or more data sources to derive insight of business benefit. High return on investment applications associated with content analytics include:
- Case management
- Fault management and field service maintenance
- “Voice of the customer” (for example, creating customer insight from e-mail and instant messages)
- Sentiment analytics (that is, creating insight from customer experiences on social networks)
- Competitor analysis
An example application is brand and campaign management, where content analytics is used to monitor consumer feedback as well as brand image and messaging, identify potential product issues early in the life cycle, evaluate campaign effectiveness, and identify opinion leaders and influencers. Many types of content can be analyzed, including HTML, documents, forms, e-mail messages, SMS content, and digital assets. I’ll be discussing five different approaches to analyzing content in my class at the TDWI World Conference in San Diego.
Why integrate search and BI? What are the benefits, especially when it comes to big data?
There are several business benefits to integrating search and BI. First is the growing interest in self-service BI. Integrating search with BI makes it possible to present the user with a much simpler interface, which is likely to broaden the number of people who use BI. Finding reports by entering a search query is a lot easier than using a BI tool.
In addition, search finds content beyond BI reports and dashboards to support a decision, which means that users are better informed. Also, dimensions and measures in reports can become facets in search results -- a user can zoom in to find all reports holding a particular measure of interest.
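To make the facet idea concrete, here is a minimal sketch in which each report's dimensions and measures are treated as facet values that can be counted and used as filters. The report catalog, field names, and helper functions are hypothetical; a real BI platform or search engine would expose its own metadata and faceting API.

```python
from collections import Counter

# Hypothetical report catalog: each report lists its dimensions and measures.
reports = [
    {"title": "Quarterly Sales", "dimensions": ["region", "product"],
     "measures": ["revenue", "units_sold"]},
    {"title": "Churn Dashboard", "dimensions": ["region", "segment"],
     "measures": ["churn_rate"]},
    {"title": "Margin by Product", "dimensions": ["product"],
     "measures": ["revenue", "margin"]},
]

def facet_counts(reports, field):
    """Count how many reports carry each value of a facet field."""
    return Counter(value for report in reports for value in report[field])

def filter_by_facet(reports, field, value):
    """Return the titles of reports whose facet field contains the value."""
    return [r["title"] for r in reports if value in r[field]]

measure_facets = facet_counts(reports, "measures")
revenue_reports = filter_by_facet(reports, "measures", "revenue")
print(measure_facets)
print(revenue_reports)
```

Here, selecting the "revenue" facet narrows the result list to the reports that hold that measure, which is the "zoom in" behavior described above.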
The second reason integrating search and BI is of interest is the emergence of big data, and in particular the need to do exploratory analytics on text. By using text analytics on multi-structured data, it is possible to build an index on big data in platforms such as Hadoop. In fact, MapReduce programs can leverage the power of Hadoop running on hundreds or even thousands of processors to build a search index for BI applications.
The result is that search can be used to index structured data and unstructured content in a big data environment.
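The index-building pattern can be sketched in a few lines of plain Python. This is only a single-process imitation of the map and reduce steps, not Hadoop code: the map step emits (term, document) pairs and the reduce step groups them into an inverted index. The document ids, sample text, and function names are hypothetical.

```python
import re
from collections import defaultdict

def map_phase(doc_id, text):
    """Emit (term, doc_id) pairs, one per distinct term in the document."""
    for term in set(re.findall(r"[a-z0-9]+", text.lower())):
        yield term, doc_id

def reduce_phase(pairs):
    """Group document ids by term to form the inverted index."""
    grouped = defaultdict(set)
    for term, doc_id in pairs:
        grouped[term].add(doc_id)
    return {term: sorted(ids) for term, ids in grouped.items()}

def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return []
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)

# Hypothetical multi-structured content: an e-mail, a Web log line, a social post.
documents = {
    "email-1": "Battery overheating after firmware update",
    "weblog-7": "GET /support/battery-recall 200",
    "tweet-3": "Love the new firmware!",
}

pairs = [pair for doc_id, text in documents.items()
         for pair in map_phase(doc_id, text)]
index = reduce_phase(pairs)
print(search(index, "battery firmware"))
```

On a cluster, the map step would run in parallel across many nodes and the framework would handle the grouping between the two phases; the logic of each phase stays essentially the same.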
Other applications and tools can then leverage search as a way of getting access to data warehouse data and Hadoop-based big data environments by making use of a search API. In fact, several new analytical search-based BI tools have emerged that can analyze data in these environments. I will be discussing some of these in my Emerging Analytics class in San Diego.
What is the difference between BI search and enterprise search?
With BI search, you use a search interface to search for BI artifacts (for example, queries and reports) available on a BI platform that directly accesses underlying data warehouses, data marts, and operational systems. BI search is normally provided by a BI platform vendor. BI platforms may also provide connectors to enterprise search products.
Enterprise search is technology that can search unstructured, semi-structured, and structured data. This is normally provided by an enterprise search vendor, an enterprise content management vendor, or a software giant (for example, IBM, Microsoft, Oracle, SAP).
Enterprise search and BI search can be combined to facilitate rapid analysis. Also, a key benefit of integrating enterprise search with BI is that enterprise search can bridge the divide between structured data in a data warehouse, multi-structured data in a big data environment, and unmanaged data on the Web, file servers, and other content stores. A search index can be built across all three of these environments, allowing users to use search-based BI tools and analytic applications to discover content, analyze it, and publish the results.
What do you consider to be the key requirements for collaborative BI?
The key requirements are to make information:
- Easy to access
- Easy to produce (for example, via collaborative authoring in a self-service BI environment)
- Easy to consume
- Easy to share
- Easy to enrich with other information and expertise (for example, via blogs, wikis, in-line comments, additional supporting content)
- Easy to collaborate over in an enterprise social computing environment
- Easy to rate
- Easy to find (for example, via bookmarks, taxonomy navigation, and search)
- Easy to track decisions
Collaborative workspaces are becoming very important. A collaborative workspace is a virtual workspace where a community of internal and/or external users can share and collaborate over business insights and other related content to jointly make decisions to manage a specific area of the business. Mobile BI users should be able to integrate with these.