Sushmita Mitra (Indian Statistical Institute, India)
TITLE: Hybrid Rough Set Methods in Data Mining
DATE: May 15, 9:50am-10:40am
ABSTRACT: The talk will focus on the use of soft computing for data mining, particularly rough sets and their hybridization with other paradigms such as fuzzy sets, neural networks, and genetic algorithms. The tasks considered include feature selection and classification. Applications to gene expression data analysis and face recognition will be presented.
Xindong Wu (U of Vermont, USA)
TITLE: Top 10 Algorithms in Data Mining
DATE: May 16, 3:00pm-3:50pm
ABSTRACT: This talk discusses the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM, http://www.cs.uvm.edu/~icdm/) in 2006: C4.5, K-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naïve Bayes, and CART. For each algorithm, we provide a description, discuss its impact, and review current and further research on it. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.
Domenico Talia (U of Calabria, Italy)
TITLE: From Parallel Data Mining to Grid-enabled Distributed Knowledge Discovery
DATE: May 16, 3:50pm-4:40pm
ABSTRACT: Data mining is often a compute-intensive and time-consuming process. For this reason, several data mining systems have been implemented on parallel computing platforms to achieve high performance in the analysis of large data sets. Moreover, when large data repositories are coupled with geographical distribution of data, users, and systems, more sophisticated technologies are needed to implement high-performance distributed KDD systems. Recently, computational Grids have emerged as privileged platforms for distributed computing, and a growing number of Grid-based KDD systems have been designed. In this talk we first discuss different ways to exploit parallelism in the main data mining techniques and algorithms. Then we discuss Grid-based KDD systems. Finally, we introduce the Knowledge Grid, an environment that uses standard Grid middleware to support the development of parallel and distributed knowledge discovery applications.

