Description
To achieve interoperability, information
systems, e-commerce
applications and program comprehension tools use mappings to translate
data
from one schema to another. Similarly, interoperability between
data sources
on the Semantic Web can be improved by defining mappings from data
sources to
ontologies. However, both generating and maintaining mappings in a
dynamic environment like the Web are laborious and error-prone, and
both require expertise. We
are attacking the schema mapping problem from two angles.
In the first approach, we have built two tools that deal
with mappings directly between schemas: Clio generates such
mappings, and ToMAS manages
them. Both tools focus on mappings between any combination of XML and
relational schemas. Clio allows non-expert users to provide a
high-level
specification of how two schemas correspond and automatically
translates these
specifications into semantically meaningful queries that transform data
conforming to the source schema into data conforming to the target
schema. The
process consists of two phases. In the first phase, the high-level
specifications,
expressed as a set of inter-schema correspondences, are translated into
a set
of mappings that capture the design choices made in the two schemas.
The design
choices include the hierarchical organization of the data as well as
schema
constraints (e.g., foreign key constraints). During the second phase,
these
mappings are translated into queries (SQL, XQuery, or XSLT) over the
source
schema that generate data to populate the target schema. An important
feature
of the mapping algorithm is that it takes into consideration target
schema
constraints in order to guarantee that the generated data will not
violate the
integrity of the target schema.
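The two phases described above can be illustrated with a small sketch. This is not Clio's algorithm; it is a toy example, assuming flat relational schemas and a single target table. The names `Correspondence`, `ForeignKey`, and `generate_query`, and the example tables, are hypothetical. The sketch groups correspondences, uses source foreign keys to decide how source tables are joined, and then renders one SQL statement. It does not check target schema constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    source: str  # "table.column" in the source schema (hypothetical naming)
    target: str  # "table.column" in the target schema


@dataclass(frozen=True)
class ForeignKey:
    child: str   # "table.column" that holds the reference
    parent: str  # "table.column" that is referenced


def table_of(qualified_name):
    """Return the table part of a "table.column" name."""
    return qualified_name.split(".")[0]


def column_of(qualified_name):
    """Return the column part of a "table.column" name."""
    return qualified_name.split(".")[1]


def generate_query(correspondences, foreign_keys):
    """Turn correspondences into one INSERT ... SELECT statement."""
    # Phase 1 (toy version): collect the tables involved and the source
    # foreign keys that connect them.
    target_tables = {table_of(c.target) for c in correspondences}
    if len(target_tables) != 1:
        raise ValueError("this sketch handles one target table at a time")
    target_table = target_tables.pop()
    source_tables = sorted({table_of(c.source) for c in correspondences})
    joins = [
        fk for fk in foreign_keys
        if table_of(fk.child) in source_tables
        and table_of(fk.parent) in source_tables
    ]

    # Phase 2 (toy version): render the mapping as SQL over the source schema.
    target_cols = ", ".join(column_of(c.target) for c in correspondences)
    select_cols = ", ".join(c.source for c in correspondences)
    sql = (
        f"INSERT INTO {target_table} ({target_cols}) "
        f"SELECT {select_cols} FROM {', '.join(source_tables)}"
    )
    if joins:
        sql += " WHERE " + " AND ".join(
            f"{fk.child} = {fk.parent}" for fk in joins
        )
    return sql


# Hypothetical example schemas, used only for illustration.
correspondences = [
    Correspondence("employee.name", "staff.full_name"),
    Correspondence("department.title", "staff.dept_name"),
]
foreign_keys = [ForeignKey("employee.dept_id", "department.id")]
query = generate_query(correspondences, foreign_keys)
```

Here the user supplies only the two column-level correspondences; the join condition comes from the foreign key declared in the source schema.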
In dynamic environments like the Web, not only may the data
maintained by information sources change, but so may their schemas,
semantics,
and query capabilities. These changes must be reflected in the
mappings.
Mappings left invalid or inconsistent by such changes must be detected
and
updated. As large, complicated schemas become more prevalent, and as
data is reused
in more and more applications, manually maintaining mappings (even
simple
mappings like view definitions) is becoming impractical. ToMAS is a
novel
framework and tool for automatically adapting mappings as schemas
evolve. It
continuously monitors mappings and automatically detects mappings that
are
affected by modifications of schemas. Such mappings are then rewritten
in
accordance with the modified schemas. An important feature of ToMAS is
that it
treats mappings and schemas as first-class citizens of a repository and
a query
language. This means that schemas and mappings can be used in queries,
thus
enabling their management.
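The detect-and-rewrite cycle described above can be sketched as follows. This is a minimal illustration, not ToMAS itself: it assumes mappings are stored as lists of "table.column" names and that the only kind of schema change is a column rename. The names `Mapping`, `RenameColumn`, `affected`, `adapt`, and `apply_change` are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mapping:
    name: str
    source_columns: tuple  # "table.column" names read by the mapping
    target_columns: tuple  # "table.column" names written by the mapping


@dataclass(frozen=True)
class RenameColumn:
    old: str  # "table.column" before the schema change
    new: str  # "table.column" after the schema change


def affected(mappings, change):
    """Detect the mappings that mention the renamed column."""
    return [
        m for m in mappings
        if change.old in m.source_columns or change.old in m.target_columns
    ]


def adapt(mapping, change):
    """Rewrite one mapping so it refers to the new column name."""
    def swap(columns):
        return tuple(change.new if c == change.old else c for c in columns)

    return replace(
        mapping,
        source_columns=swap(mapping.source_columns),
        target_columns=swap(mapping.target_columns),
    )


def apply_change(mappings, change):
    """Rewrite affected mappings and leave the others untouched."""
    hit = {m.name for m in affected(mappings, change)}
    return [adapt(m, change) if m.name in hit else m for m in mappings]


# Hypothetical repository contents, used only for illustration.
mappings = [
    Mapping("m1", ("employee.name",), ("staff.full_name",)),
    Mapping("m2", ("project.title",), ("work.label",)),
]
change = RenameColumn("employee.name", "employee.full_name")
updated = apply_change(mappings, change)
```

Because mappings are plain data in this sketch, they can be filtered and inspected like any other records, which is the sense in which they are "queried" here.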
In the second approach, the focus is on semantic mappings
from database schemas to ontologies. Such mappings will help make the
Semantic
Web a reality by facilitating access to the content of legacy databases
made
available on the web -- part of the
"deep web". Although ontologies with rich semantics provide a way to
improve interoperability between heterogeneous data sources,
constructing the
mappings is difficult, prone to error and may require both technical
and domain
expertise. As part of the MAPONTO
project, we are developing a prototype interactive tool to address the
problem
of finding logical connections between entire database tables and
domain
ontologies. The tool works by first finding and expressing
correspondences
between table columns and ontology components, and then using these
correspondences and heuristics to find logical formulas that are
reasonable
candidates to express the semantic connections between the ontology
components
and the table. We have evaluated our tool over existing relational
schemas and
ontologies, with good results.
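The first step of the tool, proposing correspondences between table columns and ontology components, can be sketched as below. The text does not say how the correspondences are found, so matching on name similarity is an assumption made purely for illustration, and the second step (deriving logical formulas) is not attempted. The function names and the example column and ontology names are hypothetical; `difflib.get_close_matches` is from the Python standard library.

```python
import difflib


def normalize(name):
    """Lower-case a name and drop underscores so naming styles compare equal."""
    return name.replace("_", "").lower()


def propose_correspondences(columns, ontology_terms, cutoff=0.6):
    """Propose, for each column, the most similarly named ontology term.

    Assumption: name similarity stands in for whatever evidence the real
    tool uses. Columns with no sufficiently similar term are left out.
    """
    by_normalized = {normalize(term): term for term in ontology_terms}
    proposals = {}
    for column in columns:
        matches = difflib.get_close_matches(
            normalize(column), list(by_normalized), n=1, cutoff=cutoff
        )
        if matches:
            proposals[column] = by_normalized[matches[0]]
    return proposals


# Hypothetical table columns and ontology property names.
columns = ["first_name", "salary", "xyz"]
ontology_terms = ["firstName", "Salary", "Department"]
proposals = propose_correspondences(columns, ontology_terms)
```

In an interactive setting, proposals like these would be shown to the user for confirmation or correction before any formulas are derived from them.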
Funding Agencies |
|
Principal Investigator |
|
|