Wade D. Cook (Schulich, York U, Canada) and Joe Zhu (Worcester Polytechnic Institute, USA)

TITLE: Classifying Inputs and Outputs in Data Envelopment Analysis

DATE: May 14, 6:40pm-7:00pm (Session 3)

ABSTRACT: In conventional data envelopment analysis it is assumed that the input versus output status of each of the chosen performance measures is known. In some situations, however, certain performance measures can play either input or output roles. We call these performance measures flexible measures. This paper presents a modification of the standard constant returns to scale DEA model to accommodate such flexible measures. Both an individual DMU model and an aggregate model are suggested as methodologies for deriving the most appropriate designations for flexible measures. We illustrate the application of these models in two practical problem settings.
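As an illustrative sketch only (the abstract does not give the formulation), the standard constant returns to scale multiplier model for a DMU $o$ could be extended with one binary variable per flexible measure. The symbols below are assumptions introduced for this sketch: $x_{ij}$ and $y_{rj}$ are the fixed inputs and outputs of DMU $j$, $w_{lj}$ are its flexible measures, $v_i$, $u_r$, $\gamma_l$ are the weights, and $d_l = 1$ designates flexible measure $l$ as an output ($d_l = 0$ as an input).

```latex
\begin{align*}
\max \quad & \sum_{r} u_r y_{ro} + \sum_{l} d_l \gamma_l w_{lo} \\
\text{s.t.} \quad
& \sum_{i} v_i x_{io} + \sum_{l} (1 - d_l)\, \gamma_l w_{lo} = 1, \\
& \sum_{r} u_r y_{rj} + \sum_{l} d_l \gamma_l w_{lj}
  - \sum_{i} v_i x_{ij} - \sum_{l} (1 - d_l)\, \gamma_l w_{lj} \le 0
  \quad \text{for all } j, \\
& u_r,\ v_i,\ \gamma_l \ge 0, \qquad d_l \in \{0, 1\}.
\end{align*}
```

With every $d_l$ fixed in advance, this reduces to the usual constant returns to scale model; leaving $d_l$ free lets each DMU choose the designation most favorable to it. The products $d_l \gamma_l$ make the sketch nonlinear as written, so a practical version would need a linearization.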


Igor Jurisica (Ontario Cancer Institute, PMH/UHN, Canada)

TITLE: Avoiding fusion of illusion and confusion: Integrated cancer informatics

DATE: May 15, 12:20pm-12:40pm (Session 7)

ABSTRACT: Researchers, clinicians and biological methods all have specific biases. Many data sets provide useful, but not always fully accurate, information on molecular cancer profiles, and we are struggling to interpret context from aggregated interactomes. Despite the introduction of diverse and powerful chemotherapeutic agents over the past two decades, most cancers remain diseases with devastating mortality rates. The accumulation of data from systematic high-throughput experiments has brought the potential to construct models of how biological systems work at the cell or whole-organism level. Integrating multiple levels of information to achieve this is not trivial, and we discuss some of the possible approaches. We will focus on the high-resolution, interactive visualization of large networks of interacting proteins. Our goal is to understand cancer at the molecular level in order to develop early detection methods, accurate prognoses and effective therapies. We can increase our understanding of disease origin and tumorigenesis by integrating existing large-scale genomic and proteomic data sets. This requires new analysis methods to combine, consolidate and interpret heterogeneous data. No single database or algorithm will be successful at solving these complex analytical problems.


Thodoros Topaloglou (U of Toronto, Canada)

TITLE: Managing Data in High Throughput Laboratories: An Experience Report

DATE: May 15, 12:40pm-1:00pm (Session 7)

ABSTRACT: Scientific laboratories are rich in data management challenges. This talk describes an end-to-end information management infrastructure for a high-throughput industrial proteomics laboratory. A unique feature of the platform is a data and applications integration framework that is used to integrate heterogeneous data, applications and processes across the entire laboratory production workflow. We also define a reference architecture for implementing similar solutions, organized according to the phases of the laboratory data lifecycle. Each phase is modeled by a set of workflows that integrate programs and databases in sequences of steps, with their associated communication and data transfers. We discuss the issues associated with each phase and describe how these issues were approached in the proteomics implementation.


Wladyslaw Skarbek (Warsaw U of Technology, Poland)

TITLE: Face Image Recognition

DATE: May 15, 2:30pm-2:50pm (Session 10)

ABSTRACT: In this presentation, besides the classical chain of functional modules used in face recognition systems, a novel view of the discriminant models used for biometric verification is presented. Various linear discriminant algorithms based on Fisher-like class separation measures are incorporated into a discriminant analysis diagram (DAD). This new methodology can be used to design a special class of pattern recognition systems, namely those in which verification, identification, and indexing of patterns are based on intra-class errors because the pattern classes used at training time differ from the classes recognized when the system is in use. This is the typical case in biometric identity verification. The point is illustrated by an analysis of recent advances in the development of face recognition algorithms.


I. Burhan Turksen (U of Toronto, Canada / TOBB ETU, Turkey)

TITLE: A Review of Essential Fuzzy System Modeling Approaches

DATE: May 16, 11:00am-11:20am (Session 13)

ABSTRACT: We present a review of fuzzy system modeling approaches, from "fuzzy rule bases" to "fuzzy functions". The well-known basic fuzzy rule bases are: (1) Zadeh (Sugeno-Yasukawa) type, and (2) Takagi-Sugeno type. Historically, fuzzy rule bases have been formed with information extracted from experts. Fuzzy functions were initially introduced by Turksen and further developed by Celikyilmaz and Turksen. They are suitable for cases where there is a fuzzy database and one is able to extract fuzzy membership values with a fuzzy data mining technique such as fuzzy c-means (FCM). They differ from the Hathaway and Bezdek or Tanaka et al. types: Turksen type "fuzzy functions" take as their arguments membership values and/or their transformations in addition to the original input variables, whereas the Hathaway and Bezdek or Tanaka et al. types introduce fuzzy coefficients for the original input variables. It is found that Turksen type fuzzy functions generally produce better results.
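The idea of feeding membership values into a function alongside the original inputs can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed cluster centers, and the choice of the squared membership as the "transformation" are all assumptions made for the example. The membership formula is the standard fuzzy c-means one with fuzzifier m.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership of point x in each cluster.

    The cluster centers are assumed to be given (e.g. from a prior FCM run).
    """
    dists = [math.dist(x, c) for c in centers]
    # If x coincides with one or more centers, share full membership among them.
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
        for d_i in dists
    ]


def fuzzy_function_arguments(x, membership):
    """Hypothetical argument vector for one cluster's fuzzy function.

    Intercept, original inputs, the membership value, and one illustrative
    transformation of it (its square; this choice is an assumption).
    """
    return [1.0, *x, membership, membership ** 2]


if __name__ == "__main__":
    centers = [(0.0,), (4.0,)]
    x = (1.0,)
    memberships = fcm_memberships(x, centers)
    for k, u in enumerate(memberships):
        print(k, fuzzy_function_arguments(x, u))
```

Each cluster would then get its own regression fitted on these augmented argument vectors, in contrast to approaches that keep the original inputs and make the coefficients themselves fuzzy.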


Victoria Eastwood and Dominik Slezak (Infobright Inc., Canada)

TITLE: BrightHouse - A New Database Engine Based on Rough Sets

DATE: May 16, 12:00pm-1:pm (Session 15)

ABSTRACT: Performing complex queries against vast amounts of information in a data warehouse can be impossible. This presentation will discuss how BrightHouse, the database engine developed by Infobright Inc., achieves 10:1 compression of raw data and uses knowledge about the data to execute analytic queries efficiently. It will show how the principles of rough set theory allow BrightHouse to use that knowledge to query large amounts of compressed data with only a limited need for decompression. The talk will also outline how BrightHouse takes advantage of integration with the MySQL framework for data storage engines.
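One way to picture how rough-set-style reasoning can limit decompression is sketched below. It is an illustration under stated assumptions, not a description of BrightHouse internals: the abstract does not specify what knowledge is stored, so the per-block min/max statistics, the `PackStats` class, and every function name here are hypothetical. Blocks that certainly satisfy the query play the role of a lower approximation, blocks that certainly do not are excluded, and only the uncertain boundary blocks are decompressed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class PackStats:
    """Hypothetical summary kept for one compressed block of a column."""
    min_value: float
    max_value: float
    row_count: int


def classify_pack(stats: PackStats, threshold: float) -> str:
    """Classify a block for the condition `value > threshold` using stats only."""
    if stats.max_value <= threshold:
        return "irrelevant"  # no row can match
    if stats.min_value > threshold:
        return "relevant"    # every row matches
    return "suspect"         # boundary region: must inspect the rows


Pack = Tuple[PackStats, Callable[[], Sequence[float]]]


def count_greater_than(packs: List[Pack], threshold: float) -> Tuple[int, int]:
    """Return (matching row count, number of blocks decompressed)."""
    total = 0
    decompressed = 0
    for stats, load_values in packs:
        status = classify_pack(stats, threshold)
        if status == "relevant":
            total += stats.row_count  # answered from the summary alone
        elif status == "suspect":
            decompressed += 1
            total += sum(1 for v in load_values() if v > threshold)
    return total, decompressed


def make_pack(values: Sequence[float]) -> Pack:
    """Build a block whose loader stands in for real decompression."""
    stats = PackStats(min(values), max(values), len(values))
    return stats, lambda: list(values)


if __name__ == "__main__":
    packs = [make_pack([1, 2, 3]), make_pack([5, 6, 7]), make_pack([3, 4, 5])]
    print(count_greater_than(packs, 4))
```

In this toy run only the third block straddles the threshold, so it is the only one whose rows are read; the other two are resolved from their summaries.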