ACM Thirteenth Conference on
Information and Knowledge Management (CIKM)
CIKM and Workshops 2004
Keynote Address

The EDAM Project: Mining Mass Spectra and More

Dr. Raghu Ramakrishnan
Professor of Computer Sciences
University of Wisconsin-Madison

Abstract

The EDAM project is a collaborative effort between computer scientists and environmental chemists at Carleton College and UW-Madison. The goal is to develop data mining techniques that advance the state of the art in analyzing atmospheric aerosol datasets. The traditional approach to particle measurement, collecting bulk samples of particulates on filters, is not adequate for studying particle dynamics and real-time correlations. This has led to the development of a new generation of real-time instruments that provide continuous or semi-continuous streams of data about certain aerosol properties. However, these instruments have added a significant level of complexity to atmospheric aerosol data and dramatically increased the amount of data to be collected, managed, and analyzed. We are investigating techniques for automatically labeling mass spectra from different kinds of aerosol mass spectrometers, and then analyzing and exploring the rich spatiotemporal information collected from multiple geographically distributed instruments. In this talk, I will present an overview of some novel data mining problems, describe some of the techniques we are developing to address them, and discuss the broader applicability of these techniques to problems from other domains.

Biography

Raghu Ramakrishnan received his B.Tech. from IIT Madras in 1983 and his Ph.D. from the University of Texas at Austin in 1987. He has been a member of the Database Systems Group in the Computer Sciences Department at the University of Wisconsin-Madison since 1987. In 1999, he founded QUIQ, a company that developed innovative collaborative customer support solutions used by companies such as Business Objects, Compaq, Informatica, National Instruments, and Sun Microsystems. He served as its Chairman and CTO until 2003, when QUIQ was acquired by Kanisa.

His research is in the area of database systems, with a focus on data retrieval, analysis, and mining. He and his group have developed scalable algorithms for clustering, decision-tree construction, and itemset counting, and were among the first to investigate mining of continuously evolving and streaming data. His work on query optimization has found its way into several commercial database systems, and his work on extending SQL to deal with queries over sequences has influenced the design of window functions in SQL:1999. None of this would have been possible without a great group of former students; of all his contributions, he is proudest of this list.

Dr. Ramakrishnan is a Fellow of the Association for Computing Machinery (ACM) and has received several awards, including a David and Lucile Packard Foundation Fellowship in Science and Engineering, an NSF Presidential Young Investigator Award, faculty awards from IBM and Microsoft, and an ACM SIGMOD Contributions Award. He has authored over 100 technical papers and written the widely used text Database Management Systems (WCB/McGraw-Hill), now in its third edition (with J. Gehrke). He is on the Board of Trustees of the VLDB Endowment, is editor-in-chief of the Journal of Data Mining and Knowledge Discovery, and has maintained the dbworld mailing list since creating it in 1987.