Survey of the state of the art in mixed-data clustering

Partitional clustering

Model-based

Hierarchical clustering

Algorithms

Gower's similarity Matrix
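
As an illustration of this node, here is a minimal sketch of a Gower similarity matrix in plain Python. It assumes unit weights for every feature and no missing values. The function name `gower_similarity` and its arguments are hypothetical names chosen for this example. Numeric features contribute 1 - |x_i - x_j| / range, and categorical features contribute 1 on a match and 0 otherwise.

```python
def gower_similarity(records, numeric_idx, categorical_idx):
    """Return the n x n Gower similarity matrix as a list of lists.

    records: list of tuples, one per object.
    numeric_idx / categorical_idx: column positions of each feature type.
    Assumes equal feature weights and no missing values.
    """
    # Range of each numeric feature; a constant column gets range 1.0
    # so that the division below is always defined.
    ranges = {}
    for k in numeric_idx:
        column = [r[k] for r in records]
        ranges[k] = (max(column) - min(column)) or 1.0

    n = len(records)
    n_features = len(numeric_idx) + len(categorical_idx)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            score = 0.0
            for k in numeric_idx:
                # Range-normalised absolute difference, turned into a similarity.
                score += 1.0 - abs(records[i][k] - records[j][k]) / ranges[k]
            for k in categorical_idx:
                # Simple matching for categorical features.
                score += 1.0 if records[i][k] == records[j][k] else 0.0
            sim[i][j] = score / n_features
    return sim


records = [(0.0, "a"), (5.0, "a"), (10.0, "b")]
sim = gower_similarity(records, numeric_idx=[0], categorical_idx=[1])
# sim[0][1] is (0.5 + 1) / 2 = 0.75
```

The resulting matrix (or 1 minus it, as a dissimilarity) can be passed to any clustering method that accepts precomputed pairwise values, such as hierarchical clustering.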

K-prototypes

"Clustering large data sets with mixed numeric and categorical values"

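
The k-prototypes idea can be sketched as follows. This is a minimal illustration, not a full implementation: it shows only the mixed dissimilarity (squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches) and the prototype update (mean for numeric features, mode for categorical ones). The function names and the default `gamma=1.0` are assumptions made for this example.

```python
from collections import Counter


def kprototypes_distance(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Mixed dissimilarity between one object and one cluster prototype."""
    # Squared Euclidean distance on the numeric part.
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    # Number of mismatching categories on the categorical part.
    categorical_part = sum(a != b for a, b in zip(x_cat, proto_cat))
    # gamma balances the two parts against each other.
    return numeric_part + gamma * categorical_part


def update_prototype(members_num, members_cat):
    """Recompute a prototype from the objects assigned to one cluster."""
    n = len(members_num)
    # Numeric features: column-wise mean.
    proto_num = [sum(col) / n for col in zip(*members_num)]
    # Categorical features: column-wise most frequent value.
    proto_cat = [Counter(col).most_common(1)[0][0] for col in zip(*members_cat)]
    return proto_num, proto_cat


d = kprototypes_distance([1.0, 2.0], ["a", "x"], [0.0, 0.0], ["a", "y"], gamma=0.5)
# d is (1 + 4) + 0.5 * 1 = 5.5
```

A complete algorithm would alternate between assigning each object to its nearest prototype and calling the update step until assignments stop changing, as in k-means.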
New cost function and distance measure

"A k-mean clustering algorithm for mixed numeric and categorical data"

Similarity / Weights of numeric features

W-K-prototypes

Frequency of feature values

"Automated variable weighting in k-means type clustering"

"K-centers algorithm for clustering mixed type data"

Data clustering algorithm for data streams with mixed numeric and categorical data

"A fast density-based data stream clustering algorithm with cluster centers self-determined for mixed data"

New distance

"An equi-biased k-prototypes algorithm for clustering mixed-type data"

FGKA (fast genetic k-means algorithm) for mixed data

"Genetic k-means clustering algorithm for mixed numeric and categorical data sets"

Lower running time

Grid-based technique

"An efficient grid-based k-prototypes algorithm for sustainable decision-making on spatial objects"

Ahmad and Dey

Huang et al.

Zhao et al.

Roy and Sharma

Jang et al.

Sangam and Om

Chen and He

"Extensions to the K-means algorithm for clustering large data sets with categorical values"

Converting mixed data

Uses polar or spherical coordinates

"Geometrical codification for clustering mixed categorical and numerical databases"

Barcelo-Rico and Diez

"Mixed data cluster analysis: An illustration using Cypriot hooked-tang weapons"

New similarity measure

Philip and Ottaway

"A general coefficient of similarity and some of its properties"

Gower

"A robust and scalable clustering algorithm for mixed type attributes in large database environment"

BIRCH

Fang et al.

"BIRCH: A new data clustering algorithm and its applications"

Zhang et al.

SBAC (Goodall similarity measure)

"Unsupervised learning with mixed numeric and nominal data"

Li and Biswas

"A new similarity index based on probability"

Goodall

Distance hierarchy built from a concept hierarchy

"Hierarchical clustering of mixed data based on distance hierarchy"

Hsu et al.

AUTOCLASS

"Bayesian classification (AutoClass): Theory and results"

Cheeseman and Stutz

Bayesian methods

Everitt

"A finite mixture model for the clustering of mixed-mode data", 1988

Use of thresholds for categorical features

Krzanowski

Extension of the homogeneous conditional model

"Mixture separation for mixed-mode data"

Moustaki

Latent class mixture model

"Latent class models for mixed variables with applications in archaeometry", 2005

Browne

"Model-based clustering, classification, and discriminant analysis of data with mixed type", 2012

Latent variable model + expectation-maximization (EM) framework

Andreopoulos

Pseudo-Bayesian process that uses the clustering of the categorical data to guide the numerical clustering

"Bi-level clustering of mixed categorical and numerical biomedical data", 2006

Hunt and Jorgensen

"Mixture model clustering of data sets with categorical and continuous variables", 1996

A finite mixture of multivariate distributions is fitted to the data

Can be applied to mixed data with missing values

ClustMD method

"Model based clustering for mixed data: clustMD"

McParland et al.

McParland and Gormley

"Clustering high-dimensional mixed data to uncover sub-phenotypes: Joint analysis of phenotypic and genotypic data", 2017

Latent variable model

Bayesian finite mixture model

Good results

Saadaoui et al.

"A dimensionally reduced clustering methodology for heterogeneous occupational medicine data mining", 2015

Projection of categorical features

Rajan

Gaussian mixture copula

"Dependency clustering of mixed data with Gaussian mixture copulas", 2016

Reported to outperform competing algorithms

Tekumalla et al.

"Vine copulas for mixed data: Multi-view clustering for mixed data beyond meta-Gaussian dependencies", 2017

Vine copulas and Dirichlet process of vines

Marbac et al.

"Model-based clustering of Gaussian copulas for mixed data", 2017

Mixture models of Gaussian copulas

KAMILA

"A semiparametric method for clustering mixed data", 2016

K-means algorithm + Gaussian-multinomial mixture model

Doring et al.

"Fuzzy clustering of quantitative and qualitative data", 2004

Fuzzy algorithm for mixed data

Doring

Chatzis

"A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional", 2011

FCM-type clustering algorithm for mixed data

Pathak and Pal

"Clustering of mixed data by integrating fuzzy, probabilistic, and collaborative clustering framework", 2016

Fuzzy + probabilistic + collaborative clustering

Neural networks

Hsu and Lin

"Visualized analysis of multivariate mixed-type data via an extended self-organizing map", 2006

Hsu and Lin

Fixed SOM

Growing SOM

"Visualized analysis of mixed numeric and categorical data via extended self-organizing map", 2012

Tai and Hsu

Generalized SOM + growing SOM

"Growing self-organizing map with cross insert for mixed-type data clustering", 2012

Chen and Marques

SOM-based + Hamming distance

"An extension of self-organizing maps to categorical data"

Problem: gives more weight to categorical features

Coso et al.

"Mixing numerical and categorical data in a self-organizing map by means of frequency neurons"

Modified distance measure

Noorbehbahani et al.

"An incremental mixed data clustering method using a new distance measure"

Incremental mixed-data clustering using SOM

Lam et al.

Fuzzy ART + k-means clustering

"Clustering data of mixed categorical and numerical type with unsupervised feature learning"

Other

Spectral clustering

SpectralCAT

"Clustering mixed data based on evidence accumulation"

Niu et al.

"A coupled user clustering algorithm based on mixed data for Web-based learning systems"

Combination of similarity matrices
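
One simple way to combine similarity matrices is a convex combination of a numeric-feature matrix and a categorical-feature matrix. The sketch below is only an assumption about what such a combination could look like, not the scheme of any paper listed here. The function name `combine_similarities` and the mixing weight `alpha` are hypothetical.

```python
def combine_similarities(sim_numeric, sim_categorical, alpha=0.5):
    """Blend two n x n similarity matrices entry by entry.

    alpha is the weight of the numeric matrix; (1 - alpha) goes to the
    categorical one. Both matrices are assumed to share the same scale,
    for example values in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(sim_numeric) != len(sim_categorical):
        raise ValueError("matrices must have the same shape")
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(row_num, row_cat)]
        for row_num, row_cat in zip(sim_numeric, sim_categorical)
    ]


sim_num = [[1.0, 0.2], [0.2, 1.0]]
sim_cat = [[1.0, 0.8], [0.8, 1.0]]
combined = combine_similarities(sim_num, sim_cat, alpha=0.25)
```

The combined matrix can then serve as the affinity input of a spectral clustering step. Choosing `alpha` is itself a modelling decision, since it controls how much each feature type drives the grouping.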