Next: 9.7 Privacy and Security Up: 9. Statistical Databases Previous: 9.5 Extraction, Transformation and

9.6 Metadata and XML

McCarthy (1982) described metadata as data about data. However, technical progress in OLTP and OLAP DBSs, workflow techniques and information dissemination has made a more general definition of metadata necessary.

Metadata is now interpreted as any kind of integrated data used for the design, implementation and usage of an information system. This implies that metadata describes not only real data, but also functions or methods, data suppliers or sources, and data receivers or sinks. It gives background information not only about the technology of a DBS or DWS, but also about its semantics, structure, statistics and functionality. In particular, semantic metadata enables the ordinary user to retrieve the definition of an attribute, to select and filter values of meta attributes, and to navigate through taxonomies.
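The three user operations named above (retrieving a definition, filtering values, navigating a taxonomy) can be illustrated with a minimal sketch. The attribute name, its definition text and the regional taxonomy are hypothetical examples, and the dictionary-based storage is an assumption made only for illustration.

```python
# Hypothetical semantic metadata: one attribute definition and a small taxonomy.
DEFINITIONS = {
    "region": "Territorial unit to which a reported value refers.",
}

# Taxonomy stored as child -> parent links (illustrative values only).
PARENT = {
    "Germany": "Europe",
    "France": "Europe",
    "Berlin": "Germany",
    "Bavaria": "Germany",
}


def define(attribute):
    """Retrieve the definition of an attribute."""
    return DEFINITIONS[attribute]


def children(node):
    """Select the values directly below a node (drill down one level)."""
    return sorted(child for child, parent in PARENT.items() if parent == node)


def path_to_root(node):
    """Navigate upwards from a node to the top of the taxonomy."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path
```

For example, `children("Germany")` lists the values one level below Germany, and `path_to_root("Berlin")` walks from Berlin up to the top-level node.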

Figure 9.8: Statistical view of metadata

In Fig. 9.8 we present a view of a conceptual metadata design. Its core is a statistical object, which is either a specialisation of a data matrix or a data cube. It is uniquely described by a definition and is related in a many-to-many way to validation and processing rules, surveys or reports, and attributes. As we present only a view, no further refinement is given with respect to attribute characteristics such as roles (measure, key, property), scales (nominal, ordinal, cardinal), ontologies or domains (natural, coded). Each statistical object is linked to at least one survey or report. Surveys or reports can be sequenced according to preceding or succeeding ones, are related to a statistical framework ("statistical documentation") giving details about the sampling scheme and frame, the population and the statistical methods, and are associated with a chronicle that serves as a calendar of events. Furthermore, references to the specific literature and law are included; the corresponding substructure is not displayed in Fig. 9.8. For further information about the metadata structure from the user's point of view, see Lenz (1994).
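The relationships described for Fig. 9.8 can be sketched as a small data model. This is an illustrative reading of the text, not the schema of the figure: all class and field names are assumptions, rules are reduced to plain strings, and the "at least one survey or report" constraint is enforced in a simple constructor check.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Role(Enum):
    MEASURE = "measure"
    KEY = "key"
    PROPERTY = "property"


class Scale(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    CARDINAL = "cardinal"


@dataclass
class Attribute:
    name: str
    role: Role
    scale: Scale


@dataclass
class Survey:
    title: str
    framework: str                          # the "statistical documentation"
    predecessor: Optional["Survey"] = None  # sequencing of surveys or reports


@dataclass
class StatisticalObject:
    definition: str      # unique description of the object
    kind: str            # "data matrix" or "data cube"
    surveys: list        # shared Survey instances (many-to-many)
    attributes: list = field(default_factory=list)
    rules: list = field(default_factory=list)  # validation and processing rules

    def __post_init__(self):
        if self.kind not in ("data matrix", "data cube"):
            raise ValueError("kind must be 'data matrix' or 'data cube'")
        if not self.surveys:
            raise ValueError("a statistical object needs at least one survey or report")
```

Because the lists hold references, the same `Survey` or `Attribute` instance can be shared by several statistical objects, which is how the many-to-many relationships are represented in this sketch.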

As metadata is stored and retrieved in much the same way as real data, it is held in a repository and managed by a metadata manager. The repository can be accessed by users, administrators and software engineers according to their privileges and read-write rights.
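A minimal sketch of such privilege-controlled access is given below. The text only states that the three groups have differing rights; the concrete assignment of read and write rights to each role, and all names used here, are assumptions for illustration.

```python
from enum import Flag, auto


class Right(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()


# Hypothetical role-to-rights mapping (not prescribed by the text).
RIGHTS = {
    "user": Right.READ,
    "software_engineer": Right.READ,
    "administrator": Right.READ | Right.WRITE,
}


class MetadataRepository:
    """In-memory stand-in for a repository guarded by a metadata manager."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _require(role, needed):
        # Unknown roles receive no rights at all.
        granted = RIGHTS.get(role, Right.NONE)
        if needed not in granted:
            raise PermissionError(f"role {role!r} lacks {needed.name} access")

    def put(self, role, key, value):
        self._require(role, Right.WRITE)
        self._entries[key] = value

    def get(self, role, key):
        self._require(role, Right.READ)
        return self._entries[key]
```

In this sketch an administrator may call `put`, whereas a plain user calling `put` raises `PermissionError`; both may call `get`.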

Such repositories are offered by various vendors. Microsoft (2001) labelled its repository "metadata services" and integrated it into its SQL Server. Alliances were founded to harmonize the metadata models and to standardize the exchange formats. Leading examples are the Open Information Model of the Metadata Coalition (MDC), see http://www.mdcinfo.com, and the Common Warehouse Metamodel (CWM), developed by the Object Management Group (OMG), see http://www.omg.org. The two groups merged in 2000 and are working to unify their models. Owing to the increasing importance of XML and XML databases, XML-based import and export formats for metadata are becoming an industry standard. The same has happened with OLAP client-server architectures, see "XML for Analysis" as referred to in Sect. 9.3.3.
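The idea of an XML-based exchange format can be illustrated with a short round trip using Python's standard `xml.etree.ElementTree` module. The element and attribute names below are invented for this sketch; they do not follow the CWM, the Open Information Model or any other published schema.

```python
import xml.etree.ElementTree as ET


def to_xml(obj):
    """Export a metadata record (a plain dict) to an XML string."""
    root = ET.Element("statisticalObject", kind=obj["kind"])
    ET.SubElement(root, "definition").text = obj["definition"]
    attributes = ET.SubElement(root, "attributes")
    for attr in obj["attributes"]:
        ET.SubElement(attributes, "attribute",
                      name=attr["name"], role=attr["role"], scale=attr["scale"])
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Import a metadata record from the XML produced by to_xml."""
    root = ET.fromstring(text)
    return {
        "kind": root.get("kind"),
        "definition": root.findtext("definition"),
        "attributes": [dict(a.attrib) for a in root.iter("attribute")],
    }
```

A record exported with `to_xml` and read back with `from_xml` yields the original dictionary, which is the property an exchange format between repositories has to guarantee.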

