Active Projects
- Contextual data quality for business intelligence: Business analytics
applications require data quality assessment at high levels of abstraction,
where subjectivity, usefulness, sense, and interpretation play a central
role. From this perspective, the meaning and quality of the data are
context dependent. In our framework, the context is given as a system of
integrated data and metadata, of which the data source under quality
assessment is one special component. In addition to the data under
assessment, the context can have an expanded schema, additional data, or
even be virtually defined as a system of integrated views. Clean answers to
queries posed to the data under assessment are relative to what is
available in the context. More details on our contextual framework can be
found here. This project is part of the Business Intelligence Network.
- User-centric, model-driven data integration: Data warehouses were
envisioned to facilitate reporting and analysis by providing a model for
the flow of data from operational systems to decision support environments.
Typically, there is an impedance mismatch between the conceptual,
high-level view of business intelligence users (and tools) accessing the
data warehouse and the physical representation of the multidimensional
data, often stored in DBMSs. To bridge these two levels of abstraction, we
developed the Conceptual Integration Modeling (CIM) Framework. The CIM tool
takes the user's high-level visual specification and compiles it into a
complex set of mappings and views to be used at runtime by business
analytics applications, as described here. The CIM models and system
architecture are described in CoRR (arXiv:1009.0255). CIM can also be
integrated with a business layer by providing data to business processes
and key performance indicators via mappings, as described here. This
project is part of the Business Intelligence Network.
Past Projects
- Papyrus: A multinational European project for building a
cross-discipline digital library engine that draws content from one domain
and makes it available to a community of users who belong to an entirely
different discipline. Ontology management research issues in this context
include modeling concept evolution and semantic updates, and supporting
dynamic attributes (attributes for which domains and ranges are specified
declaratively).
- DescribeX: A Framework for Exploring and Querying XML Web Collections.
My PhD thesis introduced a framework that supports constructing
heterogeneous XML synopses that can be declaratively defined and
manipulated by means of regular expressions on XPath axes. The tool
implementing this framework is tailored to data-intensive applications in
information integration, XQuery/XPath evaluation, XML retrieval, and Web
services. The thesis can be downloaded from CoRR (arXiv:0807.2972) and the
University of Toronto Libraries TSpace website.
- Temporal XML: A proposal for modeling and querying temporal data in XML.
Our implementation validates temporal XML documents against the temporal
constraints imposed by the model and summarizes metadata by adding the
time dimension to structural path summaries.
- ToX (the Toronto XML Server): A repository of XML data and metadata that
provides the key functions in document management, including registering
documents, indexing document structure (with ToXin), defining logical
views of distributed data sources, and querying document content and
structure. My master's thesis introducing ToXin can be downloaded from
here and the University of Toronto Libraries TSpace website.