The EDAM Project: Mining Mass Spectra and More
Dr. Raghu Ramakrishnan
Professor of Computer Sciences
University of Wisconsin-Madison
Abstract
The EDAM project is a collaborative effort between computer scientists
and environmental chemists at Carleton College and UW-Madison. The goal
is to develop data mining techniques for advancing the state of the art
in analyzing atmospheric aerosol datasets. The traditional approach to
particle measurement, the collection of bulk samples of
particulates on filters, is not adequate for studying particle dynamics
and real-time correlations. This has led to the development of a new
generation of real-time instruments that provide continuous or
semi-continuous streams of data about certain aerosol properties.
However, these instruments have added a significant level of complexity
to atmospheric aerosol data, and dramatically increased the amount of
data to be collected, managed, and analyzed. We are investigating
techniques for automatically labeling mass spectra from different kinds
of aerosol mass spectrometers, and then analyzing and exploring the rich
spatiotemporal information collected from multiple geographically
distributed instruments. In this talk, I will present an overview of
some novel data mining problems, describe some of the techniques we are
developing to address them, and discuss the broader applicability of
these techniques to problems from other domains.
Biography
Raghu Ramakrishnan received his B.Tech. from IIT Madras in 1983 and his
Ph.D. from the University of Texas at Austin in 1987. He has been a
member of the Database Systems Group in the Computer Sciences
Department at the University of Wisconsin-Madison since 1987. In 1999,
he founded QUIQ, a company that developed innovative collaborative
customer support solutions used by companies such as Business Objects,
Compaq, Informatica, National Instruments, and Sun Microsystems. He
served as its Chairman and CTO until 2003, when QUIQ was acquired by
Kanisa.
His research is in the area of database systems, with a focus on data
retrieval, analysis, and mining. He and his group have developed
scalable algorithms for clustering, decision-tree construction, and
itemset counting, and were among the first to investigate mining of
continuously evolving and streaming data. His work on query
optimization has found its way into several commercial database
systems, and his work on extending SQL to deal with queries over
sequences has influenced the design of window functions in
SQL:1999. None of this would have been possible without a great group
of former students; of all his contributions, he is proudest of this
list.
Dr. Ramakrishnan is a Fellow of the Association for Computing
Machinery (ACM), and has received several awards, including a David
and Lucile Packard Foundation Fellowship in Science and Engineering,
an NSF Presidential Young Investigator Award, Faculty awards from IBM
and Microsoft, and an ACM SIGMOD Contributions Award. He has authored
over 100 technical papers and written the widely used text Database
Management Systems (WCB/McGraw-Hill), now in its third edition (with
J. Gehrke). He is on the Board of Trustees of the VLDB Endowment,
editor-in-chief of the Journal of Data Mining and Knowledge Discovery,
and has maintained the dbworld mailing list since creating it in 1987.