From Simple Data Sources to a Complex Information System: Integrating Heterogeneous Data Models into an Information Infrastructure for the Public Administration

Granting access to governmental data has long been a matter of public interest. National and international regulations, such as the German Environmental Information Act (UIG) and the European infrastructure for spatial information (INSPIRE), encourage governmental institutions to make information available to the public. Despite the fast-paced development of information technology, many problems of data integration are far from solved. The availability of distributed data demands standardised procedures for service discovery and invocation as well as for data processing and integration. Although there are many efforts to standardise web services and data exchange formats, heterogeneous data structures and service interfaces are still the norm at the level of the administrative institutions. Hence, granting uniform access to environmental information in compliance with the aforementioned regulations is a highly complex task. The German environmental information portal PortalU (www.portalu.de) is a publicly financed information infrastructure. It offers a single point of entry to all environmentally relevant information held by public authorities. Here, the key problem is not how to deliver the information or how to present the results of a query, but how to integrate the plethora of heterogeneous data sources. The aim is to integrate data in as many formats as possible and to give access to all sources through a uniform query mechanism. The architecture of PortalU addresses this problem with its flexible and extensible concept of specialised data source plug-ins. From the point of view of the query facility in PortalU, all information sources are treated uniformly, as they all deliver their search results in the same way (the ranking of the results can be influenced, though). Data sources of very different structure can be connected, indexed and queried.
The key component is the iBus, a communication broker that distributes queries to all connected data sources and collects, combines, and delivers the answers in return. One of the strong points of the PortalU architecture is the wide range of data formats and the diversity of information systems that it supports.
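
The broker and plug-in pattern described above can be illustrated with a minimal sketch. It is not the actual iBus interface: the names `Hit`, `DataSourcePlugin`, `InMemoryPlugin`, `Broker` and the `boost` parameter are hypothetical, and the word-matching score is a stand-in for whatever ranking a real plug-in would apply. The sketch only shows the idea that every source answers in the same result format, so the broker can merge the answers without knowing how each source stores its data.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Hit:
    """Uniform result record that every plug-in returns (hypothetical)."""
    source: str
    title: str
    score: float  # relevance as reported by the plug-in


class DataSourcePlugin(Protocol):
    """Interface a data source must offer to be connected (hypothetical)."""
    name: str

    def search(self, query: str) -> list[Hit]: ...


class InMemoryPlugin:
    """Toy plug-in that wraps a dict mapping titles to document text."""

    def __init__(self, name: str, documents: dict[str, str], boost: float = 1.0):
        self.name = name
        self.documents = documents
        self.boost = boost  # assumed knob for influencing the ranking

    def search(self, query: str) -> list[Hit]:
        terms = query.lower().split()
        hits = []
        for title, text in self.documents.items():
            words = text.lower().split()
            matched = sum(1 for term in terms if term in words)
            if matched:
                # Score is the fraction of query terms found, scaled by boost.
                hits.append(Hit(self.name, title, self.boost * matched / len(terms)))
        return hits


class Broker:
    """Distributes a query to all registered plug-ins and merges the answers."""

    def __init__(self) -> None:
        self._plugins: list[DataSourcePlugin] = []

    def register(self, plugin: DataSourcePlugin) -> None:
        self._plugins.append(plugin)

    def query(self, text: str) -> list[Hit]:
        hits: list[Hit] = []
        for plugin in self._plugins:
            hits.extend(plugin.search(text))
        # Combine the answers into one list, best score first.
        return sorted(hits, key=lambda hit: hit.score, reverse=True)


broker = Broker()
broker.register(InMemoryPlugin(
    "catalogue", {"Air quality report": "air quality measurements 2008"}))
broker.register(InMemoryPlugin(
    "web",
    {"Water page": "water quality in rivers", "Noise map": "noise levels"},
    boost=0.5))

results = broker.query("air quality")
for hit in results:
    print(hit.source, hit.title, hit.score)
```

In this sketch a new kind of source is connected by writing one more class with a `search` method; the broker and the query side remain unchanged, which is the property the plug-in concept is meant to provide.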


