Chair of Computer Science 9 (Lehrstuhl für Informatik 9)
Data Management and Exploration
Univ.-Prof. Dr. rer. nat. Thomas Seidl

Subspace Clustering

Efficient and Effective Subspace Cluster Search in High-Dimensional Databases

Increasingly large data resources in the life sciences, mobile information and communication, e-commerce, and other application domains require automatic techniques for gaining knowledge. One of the major knowledge discovery tasks is clustering, which groups data so that objects within a group are similar while objects in different groups are dissimilar. In scenarios with many attributes or with noise, clusters are often hidden in subspaces of the data and do not show up in the full-dimensional space. For these applications, subspace clustering methods aim at detecting clusters in any subspace.
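The effect described above can be illustrated with a minimal Python sketch on synthetic data. It is only an illustration under stated assumptions, not one of the models or algorithms from the publications listed on this page: the function names (`make_data`, `contrast`), the 10-dimensional unit cube, and the choice of dimensions 0 and 1 as the relevant subspace are all hypothetical. The sketch compares how well the cluster members are separated from the remaining points, once using all dimensions and once using only the relevant subspace.

```python
import math
import random


def make_data(n_cluster=60, n_noise=60, dims=10, relevant=(0, 1), seed=0):
    """Generate points in [0, 1]^dims.

    The first n_cluster points are tightly grouped only in the `relevant`
    dimensions (hypothetical choice); all other coordinates are uniform noise.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for i in range(n_cluster + n_noise):
        p = [rng.random() for _ in range(dims)]
        is_member = i < n_cluster
        if is_member:
            for d in relevant:
                p[d] = 0.5 + rng.uniform(-0.02, 0.02)
        points.append(p)
        labels.append(is_member)
    return points, labels


def dist(a, b, dims):
    """Euclidean distance restricted to the given dimensions."""
    return math.sqrt(sum((a[d] - b[d]) ** 2 for d in dims))


def contrast(points, labels, dims):
    """Mean member-to-nonmember distance divided by mean member-to-member distance.

    Values near 1 mean the cluster is indistinguishable from noise;
    larger values mean the cluster stands out.
    """
    members = [p for p, m in zip(points, labels) if m]
    others = [p for p, m in zip(points, labels) if not m]
    within = [dist(a, b, dims)
              for i, a in enumerate(members) for b in members[i + 1:]]
    across = [dist(a, b, dims) for a in members for b in others]
    return (sum(across) / len(across)) / (sum(within) / len(within))


points, labels = make_data()
full = contrast(points, labels, range(10))   # all 10 dimensions
sub = contrast(points, labels, (0, 1))       # only the relevant subspace
print(f"contrast in full space: {full:.2f}")
print(f"contrast in subspace:   {sub:.2f}")
```

In this toy setting the contrast stays close to 1 in the full space, because the eight noise dimensions dominate every distance, and becomes much larger in the two relevant dimensions. Real subspace clustering methods additionally have to search for the relevant subspaces, which this sketch assumes to be known.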

We propose new subspace clustering models that remove redundant information and ensure the comparability of different clusters, which enhances the quality and interpretability of the clustering results. At the same time, we develop new algorithms that ensure the efficiency of the clustering process. In addition, our research covers the evaluation and visualization of patterns, so that knowledge generation can benefit from human cognitive abilities.

Participating Staff

Seidl T., Assent I., Krieger R., Müller E., Günnemann S.

Publications

  1. EN Assent I., Krieger R., Welter P., Herbers J., Seidl T.: (2009)
    Data Mining For Robust Flight Scheduling
    In: Longbing Cao, Philip S. Yu, Chengqi Zhang, Huaifeng Zhang (eds.): Data Mining for Business Applications. Springer 2009. 267-282
    [Springer Link]

  2. EN Müller E., Günnemann S., Assent I., Seidl T.: (2009)
    Evaluating Clustering in Subspace Projections of High Dimensional Data
    Proc. 35th International Conference on Very Large Data Bases (VLDB 2009), Lyon, France, PVLDB Journal, Vol. 2, No. 1, 1270-1281 (Experiments and Analyses track, acceptance rate 23.1%)
    [VLDB 2009] [Buy]

  3. EN Müller E., Assent I., Günnemann S., Krieger R., Seidl T.: (2009)
    Relevant Subspace Clustering: Mining the Most Interesting Non-Redundant Concepts in High Dimensional Data
    Proc. IEEE International Conference on Data Mining (ICDM 2009), Miami, USA (full paper acceptance rate 8.9%)
    [ICDM 2009]

  4. EN Müller E., Assent I., Krieger R., Günnemann S., Seidl T.: (2009)
    DensEst: Density Estimation for Data Mining in High Dimensional Spaces
    Proc. SIAM International Conference on Data Mining (SDM 2009), Sparks, Nevada, USA. 173-184 (full paper acceptance rate 15.6%)
    [SDM 2009]

  5. EN Günnemann S., Müller E., Färber I., Seidl T.: (2009)
    Detection of Orthogonal Concepts in Subspaces of High Dimensional Data
    Proc. 18th ACM Conference on Information and Knowledge Management (CIKM 2009), Hong Kong, China (full paper acceptance rate 14.5%)
    [CIKM 2009]

  6. EN Müller E., Assent I., Seidl T.: (2009)
    HSM: Heterogeneous Subspace Mining in High Dimensional Data
    Proc. 21st International Conference on Scientific and Statistical Database Management (SSDBM 2009), New Orleans, Louisiana, USA 497-516
    [SSDBM 2009]

  7. EN Müller E., Assent I., Günnemann S., Jansen T., Seidl T.: (2009)
    OpenSubspace: An Open Source Framework for Evaluation and Exploration of Subspace Clustering Algorithms in WEKA
    Proc. 1st Open Source in Data Mining Workshop (OSDM 2009) in conjunction with 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2009), Bangkok, Thailand 2-13
    [OpenSubspace Project]

  8. EN Schiffer M., Müller E., Seidl T.: (2009)
    SubRank: Ranking Local Outliers in Projections of High-Dimensional Spaces
    Datenbank-Spektrum Vol. 9 Issue 29 53-55 (BTW student program)
    [DB Spektrum]

  9. EN Matthias Schiffer: (2009)
    SubRank: Ranking local outliers in projections of high-dimensional spaces
    Student program at the 13th GI Conference on Databases, Technology and Web (BTW 2009), Münster, Germany
    [BTW 2009 Student Program]

  10. EN Assent I., Krieger R., Glavic B., Seidl T.: (Jul 2008)
    Clustering Multidimensional Sequences in Spatial and Temporal Databases
    In: International Journal on Knowledge and Information Systems (KAIS) Vol. 16, Issue 1 29-51
    [KAIS]

  11. EN Assent I., Krieger R., Müller E., Seidl T.: (2008)
    INSCY: Indexing Subspace Clusters with In-Process-Removal of Redundancy
    Proc. IEEE International Conference on Data Mining (ICDM 2008), Pisa, Italy 719-724 (acceptance rate 20%)
    [ICDM 2008]

  12. EN Assent I., Krieger R., Müller E., Seidl T.: (2008)
    EDSC: Efficient Density-Based Subspace Clustering
    Proc. ACM 17th Conference on Information and Knowledge Management (CIKM 2008), Napa Valley, USA 1093-1102 (full paper acceptance rate 17%)
    [CIKM 2008]

  13. EN Assent I., Krieger R., Welter P., Herbers J., Seidl T.: (2008)
    SubClass: Classification of Multidimensional Noisy Data Using Subspace Clusters
    Proc. 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2008), Springer LNCS/LNAI, Osaka, Japan
    [PAKDD 2008]

  14. DE Ines Färber: (2009)
    Mining orthogonaler Konzepte in hochdimensionalen Datenbanken [Mining Orthogonal Concepts in High-Dimensional Databases]
    GI Informatiktage, March 27-28, 2009, Bonn
    [Informatiktage]

  15. EN Müller E., Assent I., Steinhausen U., Seidl T.: (2008)
    OutRank: ranking outliers in high dimensional data
    Proc. 2nd International Workshop on Ranking in Databases (DBRank 2008) in conjunction with IEEE 24th International Conference on Data Engineering (ICDE 2008), Cancun, Mexico 600-603
    [ICDE 2008 Workshops]

  16. EN Müller E., Assent I., Krieger R., Jansen T., Seidl T.: (2008)
    Morpheus: Interactive Exploration of Subspace Clustering
    Proc. 14th ACM SIGKDD International Conference on Knowledge Discovery in Databases (KDD 2008), Las Vegas, USA 1089-1092 (Demo)
    [KDD 2008]

  17. EN Assent I., Müller E., Krieger R., Jansen T., Seidl T.: (2008)
    Pleiades: Subspace Clustering and Evaluation
    Proc. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2008), Antwerp, Belgium, Springer LNCS 5212. 666-671 (Demo)
    [ECML PKDD 2008]

  18. EN Assent I., Krieger R., Müller E., Seidl T.: (Dec 2007)
    VISA: Visual Subspace Clustering Analysis
    ACM SIGKDD Explorations Special Issue on Visual Analytics, Vol. 9, Issue 2 5-12
    [SIGKDD Explorations]

  19. EN Assent I., Krieger R., Müller E., Seidl T.: (2007)
    DUSC: Dimensionality Unbiased Subspace Clustering
    Proc. IEEE International Conference on Data Mining (ICDM 2007), Omaha, Nebraska, USA 409-414 (acceptance rate 19%)
    [ICDM 2007] [Full Text PDF]

  20. EN Seidl T., Müller E., Assent I., Steinhausen U.: (2008)
    Outlier detection and ranking based on subspace clustering
    Dagstuhl Seminar 08421 on Uncertainty Management in Information Systems.
    [Dagstuhl seminar 08421]

  21. EN Assent I., Krieger R., Glavic B., Seidl T.: (Dec 2006)
    Spatial Multidimensional Sequence Clustering
    Proc. 1st International Workshop on Spatial and Spatio-temporal Data Mining (SSTDM 2006) In conjunction with ICDM 2006, Hong Kong
    [SSTDM 2006] [PDF]

  22. EN Assent I., Krieger R., Müller E., Seidl T.: (2007)
    Subspace outlier mining in large multimedia databases
    In: M. Berthold, K. Morik, A. Siebes (eds.): Parallel Universes and Local Patterns, Dagstuhl Seminar 07181
    [Dagstuhl seminar homepage]

  23. DE Emmanuel Müller: (2007)
    Subspace Clustering für die Analyse von CGH Daten [Subspace Clustering for the Analysis of CGH Data]
    Student program at the 12th GI Conference on Databases, Technology and Web (BTW 2007), Aachen, Germany: 31-33
    [BTW 2007]

Diploma and Master's Theses

Erkennung von Fehlerursachen in hochdimensionalen Produktionsdatenbanken [Detecting Causes of Faults in High-Dimensional Production Databases]
with AUCOS GmbH (Egbert König)
Student: Thomas Ramm; Supervisor: Müller E.

Indexierung hochdimensionaler Daten mittels hierarchischem Subspace Clustering [Indexing High-Dimensional Data Using Hierarchical Subspace Clustering]
Student: Dominik Lenhard; Supervisors: Günnemann S., Kremer H.

Mining orthogonaler Konzepte in hochdimensionalen Datenbanken [Mining Orthogonal Concepts in High-Dimensional Databases]
with the UMIC Cluster of Excellence
Student: Ines Färber; Supervisors: Müller E., Günnemann S.

Outlier Mining mittels lokaler Dichteschätzung in statistisch relevanten Projektionen [Outlier Mining Using Local Density Estimation in Statistically Relevant Projections]
with the UMIC Cluster of Excellence
Student: Matthias Schiffer; Supervisor: Müller E.

Approximations for efficient subspace clustering in high-dimensional databases
with the UMIC Cluster of Excellence
Student: Stephan Günnemann; Supervisors: Müller E., Assent I., Krieger R.

Outlier Detection in heterogenen Sensordaten [Outlier Detection in Heterogeneous Sensor Data]
with National Instruments (Stefan Romainczyk)
Student: Uwe Steinhausen; Supervisors: Müller E., Assent I., Krieger R.

Effizientes dichte-basiertes Subspace Clustering [Efficient Density-Based Subspace Clustering]
Student: Emmanuel Müller; Supervisors: Ralph Krieger, Ira Assent

Lokal selektive Klassifikation mit Subspace-Clustern [Locally Selective Classification with Subspace Clusters]
with INFORM GmbH (Dr. Herbers, Dr. Dorndorf)
Student: Petra Welter; Supervisors: Ira Assent, Ralph Krieger

Subspace Clustering for Sequences of Ordered Categorical Data
with Lehr- und Forschungsgebiet Ingenieurhydrologie (Prof. Dr.-Ing. H. Nacken, Dr.-Ing. H. Sewilam, S. Bartusseck)
Student: Boris Glavic; Supervisors: Ralph Krieger, Ira Assent
