UofT DBGroup  Flavio Rizzolo
 


  Active Projects
 
  • Contextual data quality for business intelligence: Business analytics applications require data quality assessment at high levels of abstraction, where subjectivity, usefulness, sense, and interpretation play a central role. From this perspective, the meaning and quality of the data are context dependent. In our framework, the context is given as a system of integrated data and metadata of which the data source under quality assessment is a particular and special component. In addition to the data under assessment, the context can have an expanded schema, additional data, or even be virtually defined as a system of integrated views. Clean answers to queries posed to the data under assessment will be relative to what is available in the context. More details on our contextual framework can be found here. This project is part of the Business Intelligence Network.

  • User-centric, model-driven data integration: Data warehouses were envisioned to facilitate reporting and analysis by providing a model for the flow of data from operational systems to decision support environments. Typically, there is an impedance mismatch between the conceptual, high-level view of business intelligence users (and tools) accessing the data warehouse and the physical representation of the multidimensional data, often stored in DBMSs. To bridge these two levels of abstraction, we developed the Conceptual Integration Modeling (CIM) Framework. The CIM tool takes the user's high-level visual specification and compiles it into a complex set of mappings and views to be used at runtime by business analytics applications, as described here. The CIM models and system architecture are described in CoRR (arXiv:1009.0255). CIM can also be integrated with a business layer by providing data to business processes and key performance indicators via mappings, as described here. This project is part of the Business Intelligence Network.

  Past Projects
 
  • Papyrus: A multinational European project for building a cross-discipline digital library engine that draws content from one domain and makes it available to a community of users who belong to a totally different discipline. Some ontology management research issues in this context include modeling concept evolution and semantic updates, and support for dynamic attributes (attributes for which domains and ranges are specified declaratively). 

  • DescribeX: A Framework for Exploring and Querying XML Web Collections. My PhD thesis introduced a framework that supports constructing heterogeneous XML synopses that can be declaratively defined and manipulated by means of regular expressions on XPath axes. The tool implementing this framework is tailored to data-intensive applications in information integration, XQuery/XPath evaluation, XML retrieval, and Web services. The thesis can be downloaded from CoRR (arXiv:0807.2972) and the University of Toronto Libraries TSpace website.

  • Temporal XML: A proposal for modeling and querying temporal data in XML. Our implementation validates temporal XML documents against the temporal constraints imposed by the model and summarizes metadata by adding the time dimension to structural path summaries.

  • ToX (the Toronto XML Server): A repository of XML data and metadata that provides the key functions in document management, including registering documents, indexing document structure (with ToXin), defining logical views of distributed data sources, and querying document content and structure. My master's thesis introducing ToXin can be downloaded from here and from the University of Toronto Libraries TSpace website.

 
Last updated January 2012