Clustering in High Dimension: Algorithms and Applications

Project Participants

Project Summary

This project grew out of ongoing interdisciplinary collaboration among the team members. It focuses on specific clustering problems in two application domains: Earth observation science and post-genomic biology.

Earth observation science. The project focuses on identifying and discriminating phytoplankton functional types for biogeochemistry applications, and on determining aerosol types for radiation budget studies and for remote sensing of aerosol and surface properties.

Post-genomic biology. The project focuses on the detection of cancer subtypes from tumor profiling, as well as on the detection of groups of genes forming functional pathways from the clustering of gene expression time series.

Although these applications originate from different scientific domains, both give rise to large amounts of complex, high-dimensional data and raise similar methodological questions. The objectives of the project are to address these questions in a unified framework, to develop improved clustering algorithms for this type of complex data, and to study their mathematical properties.