Q&A: Understanding Enterprise Metadata
Data about data is a company's bread crumb trail to its information, but many enterprises have no management strategy.
Data about data is vital if you want to understand a company's information. What traps can you expect when you develop a metadata management strategy, and what best practices can you employ to make the process successful? For insight and answers, we turned to Greg Keller, chief evangelist at Embarcadero Technologies.
Enterprise Strategies: What is metadata and why is it so important to the enterprise?
Greg Keller: The short answer: it's data about data. The real answer: it's a company's bread crumb trail to its information, its lifeblood. Data is not information. Metadata is the descriptive taxonomy for a company's data that enables it to become usable information. In its most traditional implementation, metadata describes and categorizes the data that exists in enterprise applications. Without this taxonomy, applications are more likely to produce the same data more than once (i.e., redundancy) and to produce it in a non-standardized way. Compound this problem over the hundreds or thousands of database systems in the enterprise and you can quickly see how a data quality epidemic can arise.
What does a metadata management strategy include?
Keller: A metadata management strategy does not require you to boil the ocean or incur massive expenses. It can begin with co-ownership by IT and those closest to the data who can help describe its semantics, for example, business analysts who can describe the rules that must be enforced on the data. The strategy should include three key components: a means to collect metadata, a means to normalize it, and a means to communicate it.
This last component is key: the potentially voluminous information must be presented back to the business clearly so that non-technical users can use it to raise their productivity (e.g., a line-of-business analyst assembling a report who must search for data of a specific type to quickly understand, source, and include all relevant data).
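The three components can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two source applications, the column names, and the helper functions `normalize` and `search` are hypothetical, and a real strategy would normalize on far richer rules than matching descriptions.

```python
# Collect: hypothetical column descriptions gathered from two applications.
collected = [
    {"source": "crm", "column": "CUST_NM", "meaning": "Customer full name"},
    {"source": "billing", "column": "customerName", "meaning": "customer full  name "},
    {"source": "billing", "column": "inv_total", "meaning": "Invoice total"},
]


def normalize(records):
    """Normalize: group columns that share a business meaning under one term."""
    glossary = {}
    for rec in records:
        # Lowercase and collapse whitespace so equivalent descriptions match.
        term = " ".join(rec["meaning"].lower().split())
        glossary.setdefault(term, []).append(f'{rec["source"]}.{rec["column"]}')
    return glossary


def search(glossary, keyword):
    """Communicate: a plain keyword lookup a business analyst could use."""
    keyword = keyword.lower()
    return {term: cols for term, cols in glossary.items() if keyword in term}


glossary = normalize(collected)
results = search(glossary, "customer")
for term, columns in results.items():
    print(term, "->", ", ".join(columns))
```

A term that maps to more than one column, as "customer full name" does here, is also a simple signal of the redundancy that an undocumented data landscape tends to hide.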
What are the general challenges a company is faced with in managing metadata?
Volume is the nemesis. The client/server boom of the '90s did good and bad things for us. It helped us stand up real applications on relational database systems in weeks (rather than the years that prior approaches took), but it brought with it massive data redundancy, departmentally siloed information, and radically varying quality. ERP systems tried to cure this ailment (among other things they promised) but exacerbated the problem by obfuscating a company's data.
Ultimately, the business that needs to leverage its data to compete is at a stalemate, because it simply cannot source its own information effectively enough to make critical decisions.
What do companies do today to solve these problems, and how successful are they?
The successful solutions we've seen in companies vary from large-scale repository implementations to, more often, a good modeling tool and simple Excel spreadsheets that describe data assets and are made accessible to the business through intranets and SharePoint. Realistically, companies have to acknowledge they have a data knowledge problem and back a decision to help resolve it. That is the very first step to a solution: admitting you have the problem and recognizing why it is detrimental to the business.
Are there any "gotchas" or traps that IT falls into when trying to manage metadata?
Going "too big" at the outset, with overzealous IT or architecture teams attempting to build a metadata panacea, is a likely path to failure. It's like trying to run a marathon with no training. The traps tend to come from the business itself, which is pressing to see results while being forced to wait for this promised, valuable IT service.
Sourcing metadata from database data dictionaries is fairly trivial; the gotchas usually lie in understanding its semantics and meaning. It takes time and buy-in from those closest to the application and the business to help describe it so that it can be documented, turned around, and communicated effectively.
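To show how little effort the structural part takes, here is a minimal Python sketch that reads column metadata from a database's own catalog. It assumes SQLite purely because it ships with Python; the table `cust` and the function `collect_column_metadata` are hypothetical, and other database systems expose their data dictionaries through different catalog views.

```python
import sqlite3


def collect_column_metadata(conn):
    """Return one record per column, read from SQLite's own catalog."""
    records = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            records.append({
                "table": table,
                "column": name,
                "type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
                # The catalog cannot supply the meaning; people must.
                "description": None,
            })
    return records


# Hypothetical application table with terse, undocumented column names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust (cust_id INTEGER PRIMARY KEY, nm TEXT NOT NULL, st TEXT)"
)
metadata = collect_column_metadata(conn)
for record in metadata:
    print(record)
```

The catalog returns names, types, and constraints, but nothing says whether `st` means state, status, or street. The empty `description` field is the part that needs time and buy-in from the people closest to the application.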
What best practices can you recommend? What should an enterprise be doing to create an effective strategy to address the challenges you've mentioned?
As mentioned above, a good rule of thumb for ambitious teams implementing a metadata strategy is to pick a problem, even a small one, and follow it through to a successful implementation. An example would be to start with a single problematic application, categorize and document its data, and turn this around to business and development teams in a form that helps them get their jobs done faster. In other words, get a quick win with a challenging application and a data consumer segment that will benefit from the metadata; they will tell tales of their improved effectiveness and productivity thanks to the metadata path to the app's data. Then replicate that playbook across other applications and constituencies. It is no more complex than that.
What products or services does Embarcadero Technologies offer regarding metadata?
We take metadata seriously and it is core to our offerings. The DatabaseGear family of products offers key components of a metadata strategy: ER/Studio, a data modeling product that collects and visually displays metadata and its relations; ER/Studio MetaWizard, a facility of ER/Studio that helps connect to and collect metadata from a variety of sources other than relational databases (e.g., XML files, UML tools, and business intelligence repositories); and ER/Studio Enterprise Portal, a vehicle that enables Google-like searching of this metadata by non-technical users in the enterprise.