High Performance Data Mining
Workshop on High Performance Data Mining
May 27-30, 2007, Beijing, China

Held in conjunction with
International Conference on Computational Science, Beijing, China, 2007


Workshop Description

Motivation
With data being collected at an unprecedented rate in almost all fields of human endeavor, there is an emerging economic and scientific need to extract useful information from it. Many data warehouses are filling up with huge amounts of data. Data mining, also known as knowledge discovery, attempts to develop automatic procedures that search these enormous data sets to obtain useful information that would otherwise remain undiscovered. Such knowledge can take the form of patterns, rules, clusters, or anomalies that exist in the massive datasets. These discoveries could be of great significance to scientific or business organizations. Given the enormous size and dimensionality of the datasets, high performance (parallel, distributed, grid-based) algorithms are crucial to any successful data mining solution.

Recently, major processor manufacturers have announced a dramatic shift in how they plan to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, they are clearly moving towards multi-core systems. Together with supercomputers, large-scale distributed computing infrastructures, and grid-based computing environments, this shift provides new opportunities for high performance data mining. Research on the corresponding algorithms must therefore stay at the forefront of this fast-evolving field, so that data mining applications can keep pushing the performance envelope and meet their requirements.

Goals
The goal of this workshop is to bring researchers and practitioners together in a setting where they can discuss the design, implementation, and deployment of large-scale, parallel, distributed, or grid-based data mining systems, which can manipulate data obtained from very large enterprise or scientific databases, regardless of whether the data are located centrally or are globally distributed.

Workshop Topics
  • Scalable parallel and distributed data mining algorithms
  • Grid-based data mining systems (middleware and system design)
  • Streaming data mining algorithms and systems
  • Data mining with heterogeneous data sources (e.g. the Web, sequence data)
  • Novel data mining applications and algorithms
  • Incremental and interactive data mining
  • Frameworks for parallel or distributed data mining
  • Agent-based approaches for high performance data mining
  • Parallel data mining workflow management
  • Special-purpose architectures for high performance data mining
  • Memory management techniques for mining very large data sets
  • Security in large distributed data mining systems
  • Performance analysis for large data mining systems
  • Impact of processor-level architectures on data mining

Program
Not available yet.

Submission Guidelines
All submissions must be made electronically by the submission deadline of December 31, 2006. Please submit your paper through the conference paper submission system. Submissions must be in PDF or PostScript format, must not exceed 8 pages, and must follow the LNCS formatting rules (for formatting information see Information for LNCS Authors). All accepted papers will be published in the Springer-Verlag Lecture Notes in Computer Science (LNCS) series. In addition, accepted papers will be scheduled for oral presentation.

Submitted papers will be reviewed by members of the program committee. Authors will be notified of the acceptance or rejection of their papers by February 5, 2007. Camera-ready versions of the papers are due February 19, 2007.

Please do not hesitate to email the workshop contact if you have any questions.

Important Dates

December 31, 2006 Deadline for electronic submission of full papers
February 5, 2007 Notification of accepted papers
February 19, 2007 Camera Ready Copies
February 19, 2007 Early Registration
March 30, 2007 Late Registration

Workshop Co-Chairs

   Note: for inquiries please send email to yingliu@gucas.ac.cn
Alok Choudhary Northwestern University, USA
Ying Liu Graduate University of Chinese Academy of Sciences, Research Center on Data Technology and Knowledge Economy of Chinese Academy of Sciences
Steve Chiu Idaho State University, USA

Program Committee

Jayaprakash Pisharath Intel Corporation, USA
Wei-keng Liao Northwestern University, USA
Zhiling Lan Illinois Institute of Technology, USA
Xingquan Zhu Florida Atlantic University, USA
Yingjie Tian Research Center on Data Technology and Knowledge Economy of Chinese Academy of Sciences
C.D. Schou Idaho State University, USA
Joseph Zambreno Iowa State University, USA
Jun Xu Microsoft Research Asia
Jianwei Li