
CKMeans Class Reference


Detailed Description

KMeans clustering partitions the data into k clusters, where k is specified a priori.

It minimizes

\[ \sum_{i=1}^k\sum_{j\in S_i} \|x_j-\mu_i\|^2 \]

where $\mu_i$ are the cluster centers and $S_i,\;i=1,\dots,k$ are the index sets of the clusters.
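As a rough illustration of the quantity being minimized, the sketch below evaluates the objective for a fixed assignment of points to clusters. The function name `kmeans_objective`, the nested-vector data layout, and the use of the squared Euclidean distance are assumptions made for this example; none of them is taken from CKMeans itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CKMeans: evaluates the k-means objective.
// points:  n vectors of equal dimension
// centers: k vectors of the same dimension (the mu_i)
// assign:  assign[j] is the index i of the cluster S_i that point j belongs to
double kmeans_objective(const std::vector<std::vector<double>>& points,
                        const std::vector<std::vector<double>>& centers,
                        const std::vector<std::size_t>& assign)
{
    assert(points.size() == assign.size());
    double total = 0.0;
    for (std::size_t j = 0; j < points.size(); ++j) {
        assert(assign[j] < centers.size());
        const std::vector<double>& mu = centers[assign[j]];
        assert(mu.size() == points[j].size());
        for (std::size_t d = 0; d < mu.size(); ++d) {
            const double diff = points[j][d] - mu[d];
            total += diff * diff;  // accumulate the squared distance per coordinate
        }
    }
    return total;
}
```

For the four points (0,0), (0,1), (4,0), (4,1) with centers (0,0.5) and (4,0.5), each point lies at distance 0.5 from its center, so the objective is 4 * 0.25 = 1.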

Beware that this algorithm is only guaranteed to find a local optimum, so the result depends on the initial centers.
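The local-optimum caveat can be made concrete with a small sketch of Lloyd's iteration, the textbook k-means procedure. This is an illustrative stand-in and not the CKMeans implementation: the names `Point`, `squared_distance` and `lloyd` are hypothetical, the distance is assumed to be Euclidean, and an empty cluster simply keeps its previous center.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b)
{
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

// Hypothetical sketch of Lloyd's iteration (not the CKMeans code).
// Updates `centers` in place, stops after max_iter rounds or when no
// center moves, and returns the final objective value.
double lloyd(const std::vector<Point>& points, std::vector<Point>& centers, int max_iter)
{
    assert(!points.empty() && !centers.empty());
    const double inf = std::numeric_limits<double>::infinity();
    const std::size_t dim = points[0].size();
    std::vector<std::size_t> assign(points.size(), 0);

    for (int iter = 0; iter < max_iter; ++iter) {
        // Assignment step: attach each point to its nearest center.
        for (std::size_t j = 0; j < points.size(); ++j) {
            double best = inf;
            for (std::size_t i = 0; i < centers.size(); ++i) {
                const double dist = squared_distance(points[j], centers[i]);
                if (dist < best) { best = dist; assign[j] = i; }
            }
        }
        // Update step: move each center to the mean of its assigned points.
        std::vector<Point> sums(centers.size(), Point(dim, 0.0));
        std::vector<std::size_t> counts(centers.size(), 0);
        for (std::size_t j = 0; j < points.size(); ++j) {
            ++counts[assign[j]];
            for (std::size_t d = 0; d < dim; ++d) sums[assign[j]][d] += points[j][d];
        }
        bool changed = false;
        for (std::size_t i = 0; i < centers.size(); ++i) {
            if (counts[i] == 0) continue;  // empty cluster: keep the old center
            for (std::size_t d = 0; d < dim; ++d) {
                const double mean = sums[i][d] / static_cast<double>(counts[i]);
                if (mean != centers[i][d]) { centers[i][d] = mean; changed = true; }
            }
        }
        if (!changed) break;  // converged: a fixed point of the iteration
    }

    // Objective: sum of squared distances from each point to its nearest center.
    double objective = 0.0;
    for (const Point& p : points) {
        double best = inf;
        for (const Point& c : centers) best = std::min(best, squared_distance(p, c));
        objective += best;
    }
    return objective;
}
```

On the points (0,0), (0,1), (4,0), (4,1) with k = 2, starting from centers (2,0) and (2,1) the iteration stops immediately with objective 16, whereas starting from (0,0) and (4,0) it reaches the centers (0,0.5) and (4,0.5) with objective 1. Both are fixed points, but only the second is the global minimum.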

cf. http://en.wikipedia.org/wiki/K-means_algorithm

Definition at line 39 of file KMeans.h.


Public Member Functions

 CKMeans ()
 CKMeans (int32_t k, CDistance *d)
virtual ~CKMeans ()
virtual EClassifierType get_classifier_type ()
virtual bool train (CFeatures *data=NULL)
virtual bool load (FILE *srcfile)
virtual bool save (FILE *dstfile)
void set_k (int32_t p_k)
int32_t get_k ()
void set_max_iter (int32_t iter)
float64_t get_max_iter ()
void get_radi (float64_t *&radi, int32_t &num)
void get_centers (float64_t *&centers, int32_t &dim, int32_t &num)
void get_radiuses (float64_t **radii, int32_t *num)
void get_cluster_centers (float64_t **centers, int32_t *dim, int32_t *num)
int32_t get_dimensions ()

Protected Member Functions

void sqdist (float64_t *x, CSimpleFeatures< float64_t > *y, float64_t *z, int32_t n1, int32_t offs, int32_t n2, int32_t m)
void clustknb (bool use_old_mus, float64_t *mus_start)
virtual CLabels * classify ()
virtual CLabels * classify (CFeatures *data)
virtual const char * get_name () const

Protected Attributes

int32_t max_iter
 maximum number of iterations
int32_t k
 the k parameter in KMeans
int32_t dimensions
 number of dimensions
float64_t * R
 radii of the clusters (size k)
float64_t * mus
 centers of the clusters (size dimensions x k)

Constructor & Destructor Documentation

CKMeans (  )

default constructor

Definition at line 29 of file KMeans.cpp.

CKMeans ( int32_t  k,
CDistance *  d 
)

constructor

Parameters:
k    number of clusters (the parameter k)
d    distance

Definition at line 35 of file KMeans.cpp.

~CKMeans (  ) [virtual]

Definition at line 42 of file KMeans.cpp.


Member Function Documentation

virtual CLabels* classify (  ) [protected, virtual]

classify objects using the currently set features

Returns:
classified labels

Implements CClassifier.

Definition at line 213 of file KMeans.h.

virtual CLabels* classify ( CFeatures *  data ) [protected, virtual]

classify objects

Parameters:
data    (test) data to be classified
Returns:
classified labels

Implements CClassifier.

Definition at line 224 of file KMeans.h.
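A natural reading of classification with a trained k-means model is to label each vector with the index of its nearest center. The sketch below shows that rule under two assumptions that this page does not confirm: the distance is squared Euclidean (CKMeans itself takes an arbitrary CDistance), and the centers are stored contiguously, one center after another. The function `nearest_center_label` is hypothetical.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper, not the CKMeans implementation.
// x:   one vector of length dim
// mus: k centers, ASSUMED contiguous: center i occupies mus[i*dim .. i*dim+dim-1]
// Returns the index of the center with the smallest squared Euclidean distance to x.
int nearest_center_label(const double* x, const double* mus, int dim, int k)
{
    assert(x != nullptr && mus != nullptr && dim > 0 && k > 0);
    int best_label = 0;
    double best_dist = std::numeric_limits<double>::infinity();
    for (int i = 0; i < k; ++i) {
        double dist = 0.0;
        for (int d = 0; d < dim; ++d) {
            const double diff = x[d] - mus[i * dim + d];
            dist += diff * diff;
        }
        if (dist < best_dist) {  // strict comparison: ties go to the lower index
            best_dist = dist;
            best_label = i;
        }
    }
    return best_label;
}
```

With centers (0,0) and (10,10), the vector (9,8) receives label 1 and the vector (1,-1) receives label 0.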

void clustknb ( bool  use_old_mus,
float64_t *  mus_start 
) [protected]

clustknb

Parameters:
use_old_mus    whether the old mus (cluster centers) shall be used
mus_start    starting values for mus

replace rhs feature vectors

set rhs to mus_start

update rhs

sqdist(mus, lhs, dists, k, Pat, 1, dimensions);

Definition at line 177 of file KMeans.cpp.

void get_centers ( float64_t *&  centers,
int32_t &  dim,
int32_t &  num 
)

get centers

Parameters:
centers    current centers are stored in here
dim    number of dimensions is stored in here
num    number of centers is stored in here

Definition at line 138 of file KMeans.h.
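The three outputs of get_centers describe a flat buffer holding num centers of dim values each. The sketch below copies such a buffer into one vector per center. It assumes that center i occupies the contiguous range starting at index i*dim, which matches the "dimensions x k" description of mus but is not spelled out on this page; `unpack_centers` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies a flat buffer of `num` centers, each `dim` values
// long, into one std::vector per center.
// ASSUMPTION: center i is stored at centers[i*dim .. i*dim+dim-1].
std::vector<std::vector<double>> unpack_centers(const double* centers, int dim, int num)
{
    assert(centers != nullptr && dim > 0 && num >= 0);
    std::vector<std::vector<double>> out;
    out.reserve(static_cast<std::size_t>(num));
    for (int i = 0; i < num; ++i) {
        // Iterator-pair constructor copies exactly dim values for center i.
        out.emplace_back(centers + i * dim, centers + (i + 1) * dim);
    }
    return out;
}
```

Copying into owned vectors also sidesteps the question of who owns the buffer returned by the getter, which this page does not state.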

virtual EClassifierType get_classifier_type (  ) [virtual]

get classifier type

Returns:
classifier type KMEANS

Reimplemented from CClassifier.

Definition at line 57 of file KMeans.h.

void get_cluster_centers ( float64_t **  centers,
int32_t *  dim,
int32_t *  num 
)

get cluster centers (SWIG compatible)

Parameters:
centers    current cluster centers are stored in here
dim    number of dimensions is stored in here
num    number of centers is stored in here

Definition at line 166 of file KMeans.h.

int32_t get_dimensions (  )

get dimensions

Returns:
number of dimensions

Definition at line 182 of file KMeans.h.

int32_t get_k (  )

get k

Returns:
the parameter k

Definition at line 97 of file KMeans.h.

float64_t get_max_iter (  )

get maximum number of iterations

Returns:
maximum number of iterations

Definition at line 116 of file KMeans.h.

virtual const char* get_name (  ) const [protected, virtual]
Returns:
object name

Implements CSGObject.

Definition at line 233 of file KMeans.h.

void get_radi ( float64_t *&  radi,
int32_t &  num 
)

get radii

Parameters:
radi    current radii are stored in here
num    number of radii is stored in here

Definition at line 126 of file KMeans.h.

void get_radiuses ( float64_t **  radii,
int32_t *  num 
)

get radii (SWIG compatible)

Parameters:
radii    current radii are stored in here
num    number of radii is stored in here

Definition at line 150 of file KMeans.h.
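get_radi and get_radiuses return the same information through two calling conventions: references to the outputs, and pointers to the outputs (the form SWIG wrappers can handle). The sketch below mirrors only the two signatures on a hypothetical stand-in type, `ToyClusterer`. In this toy both getters hand out a pointer to internal storage; whether CKMeans returns internal storage or a freshly allocated copy is not stated on this page and should be checked in the source before freeing anything.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in that mirrors only the two getter signatures; NOT CKMeans.
// ASSUMPTION for illustration: both getters expose the internal buffer.
struct ToyClusterer {
    std::vector<double> R{0.5, 1.5, 2.5};  // one radius per cluster

    // Reference style, as in get_radi(float64_t *&radi, int32_t &num).
    void get_radi(double*& radi, std::int32_t& num)
    {
        radi = R.data();
        num = static_cast<std::int32_t>(R.size());
    }

    // Pointer style, as in get_radiuses(float64_t **radii, int32_t *num).
    void get_radiuses(double** radii, std::int32_t* num)
    {
        assert(radii != nullptr && num != nullptr);
        *radii = R.data();
        *num = static_cast<std::int32_t>(R.size());
    }
};
```

A caller of the reference style passes the variables directly, as in `t.get_radi(ptr, n)`, while the pointer style takes their addresses, as in `t.get_radiuses(&ptr, &n)`.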

bool load ( FILE *  srcfile ) [virtual]

load distance machine from file

Parameters:
srcfile    file to load from
Returns:
whether loading was successful

Reimplemented from CClassifier.

Definition at line 72 of file KMeans.cpp.

bool save ( FILE *  dstfile ) [virtual]

save distance machine to file

Parameters:
dstfile    file to save to
Returns:
whether saving was successful

Reimplemented from CClassifier.

Definition at line 77 of file KMeans.cpp.

void set_k ( int32_t  p_k )

set k

Parameters:
p_k    new k

Definition at line 87 of file KMeans.h.

void set_max_iter ( int32_t  iter )

set maximum number of iterations

Parameters:
iter    the new maximum

Definition at line 106 of file KMeans.h.

void sqdist ( float64_t *  x,
CSimpleFeatures< float64_t > *  y,
float64_t *  z,
int32_t  n1,
int32_t  offs,
int32_t  n2,
int32_t  m 
) [protected]

sqdist

Parameters:
x    x
y    y
z    z
n1    n1
offs    offset
n2    n2
m    m

Definition at line 129 of file KMeans.cpp.
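The parameter descriptions above carry little information, but the commented call `sqdist(mus, lhs, dists, k, Pat, 1, dimensions)` in the clustknb notes suggests one reading: x holds n1 vectors of dimension m, y holds the feature vectors, offs is the index of the first feature vector to use, n2 is how many of them to process, and z receives the squared distances. The sketch below implements that reading with y as a plain flat array rather than a CSimpleFeatures object. The function `sqdist_sketch`, the contiguous per-vector layout, and the output ordering z[i + j*n1] are all assumptions, not the library's documented behaviour.

```cpp
#include <cassert>

// Hypothetical reading of sqdist, with y as a flat array instead of CSimpleFeatures.
// x: n1 vectors of dimension m, vector i at x[i*m ..]          (ASSUMED layout)
// y: feature vectors of dimension m, vector j at y[j*m ..]     (ASSUMED layout)
// z: output, z[i + j*n1] = squared Euclidean distance between x_i and y_(offs+j)
void sqdist_sketch(const double* x, const double* y, double* z,
                   int n1, int offs, int n2, int m)
{
    assert(x != nullptr && y != nullptr && z != nullptr);
    assert(n1 >= 0 && n2 >= 0 && offs >= 0 && m > 0);
    for (int j = 0; j < n2; ++j) {
        const double* yj = y + (offs + j) * m;  // skip the first `offs` vectors of y
        for (int i = 0; i < n1; ++i) {
            const double* xi = x + i * m;
            double s = 0.0;
            for (int d = 0; d < m; ++d) {
                const double diff = xi[d] - yj[d];
                s += diff * diff;
            }
            z[i + j * n1] = s;
        }
    }
}
```

Under this reading, the call quoted in the clustknb notes would compute the squared distances from all k centers to the single feature vector with index Pat.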

bool train ( CFeatures *  data = NULL ) [virtual]

train k-means

Parameters:
data    training data (may be omitted if a distance- or kernel-based classifier is used and the distance or kernel has already been initialized with the training data)
Returns:
whether training was successful

Reimplemented from CClassifier.

Definition at line 48 of file KMeans.cpp.


Member Data Documentation

int32_t dimensions [protected]

number of dimensions

Definition at line 243 of file KMeans.h.

int32_t k [protected]

the k parameter in KMeans

Definition at line 240 of file KMeans.h.

int32_t max_iter [protected]

maximum number of iterations

Definition at line 237 of file KMeans.h.

float64_t* mus [protected]

centers of the clusters (size dimensions x k)

Definition at line 249 of file KMeans.h.

float64_t* R [protected]

radii of the clusters (size k)

Definition at line 246 of file KMeans.h.


The documentation for this class was generated from the following files:
 KMeans.h
 KMeans.cpp

SHOGUN Machine Learning Toolbox - Documentation