CCommUlongStringKernel Class Reference


Detailed Description

The CommUlongString kernel computes the spectrum kernel for strings that have been mapped into unsigned 64-bit integers.

These 64-bit integers correspond to k-mers. To be usable with this kernel they must be sorted (e.g. via the SortUlongString pre-processor).
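To make the expected input concrete, here is a minimal sketch of packing DNA k-mers into 64-bit integers and sorting them. The helper names `dna_code` and `sorted_kmers` and the 2-bit encoding (A=0, C=1, G=2, T=3) are assumptions made for illustration; they are not part of SHOGUN, whose own feature mapping and SortUlongString pre-processor may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: map one DNA letter to a 2-bit code (A=0, C=1, G=2, T=3).
inline uint64_t dna_code(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:
            assert(false && "unexpected symbol");
            return 0;
    }
}

// Hypothetical helper: pack every k-mer of `seq` into one uint64_t and
// return the packed k-mers in ascending order.
inline std::vector<uint64_t> sorted_kmers(const std::string& seq, int k)
{
    assert(k >= 1 && k <= 32);  // 2 bits per symbol, 64 bits available
    std::vector<uint64_t> out;
    const std::size_t order = static_cast<std::size_t>(k);
    if (seq.size() < order)
        return out;

    // The mask keeps only the lowest 2*k bits; k == 32 uses the full word.
    const uint64_t mask =
        (k == 32) ? ~uint64_t(0) : ((uint64_t(1) << (2 * k)) - 1);

    uint64_t word = 0;
    for (std::size_t i = 0; i < seq.size(); ++i)
    {
        // Shift in the next symbol and drop the one that left the window.
        word = ((word << 2) | dna_code(seq[i])) & mask;
        if (i + 1 >= order)
            out.push_back(word);
    }
    std::sort(out.begin(), out.end());
    return out;
}
```

Under this encoding, `sorted_kmers("ACGT", 2)` yields the packed values for AC, CG and GT in ascending order.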

It essentially uses the algorithm of the Unix "comm" command (hence the name) to compute:

\[ k({\bf x},{\bf x'})= \Phi_k({\bf x})\cdot \Phi_k({\bf x'}) \]

where $\Phi_k$ maps a sequence ${\bf x}$ that consists of letters in $\Sigma$ to a feature vector of size $|\Sigma|^k$. In this feature vector each entry denotes how often the corresponding k-mer appears in ${\bf x}$.
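The comm-style computation can be sketched as a single pass over two sorted k-mer lists. This is an illustrative sketch, not the library's code: the function name `comm_spectrum_kernel` is hypothetical, no normalization is applied, and the reading of `use_sign` (each shared k-mer contributes 1 regardless of its counts) is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: dot product of the k-mer count vectors of two sequences,
// computed from their SORTED k-mer lists by walking both in lockstep.
// Assumption: with use_sign == true a shared k-mer contributes 1
// instead of count_a * count_b.
inline double comm_spectrum_kernel(const std::vector<uint64_t>& a,
                                   const std::vector<uint64_t>& b,
                                   bool use_sign = false)
{
    double result = 0.0;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())
    {
        if (a[i] < b[j])
            ++i;  // k-mer only in a
        else if (a[i] > b[j])
            ++j;  // k-mer only in b
        else
        {
            // k-mer in both: count the run of duplicates on each side.
            const uint64_t sym = a[i];
            std::size_t count_a = 0, count_b = 0;
            while (i < a.size() && a[i] == sym) { ++i; ++count_a; }
            while (j < b.size() && b[j] == sym) { ++j; ++count_b; }
            result += use_sign
                ? 1.0
                : static_cast<double>(count_a) * static_cast<double>(count_b);
        }
    }
    return result;
}
```

Because both lists are sorted, the loop touches each element once, so the cost is linear in the combined length and the $|\Sigma|^k$-dimensional feature vector is never materialized.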

Note that this representation supports spectrum kernels of up to order 8 for 8-bit alphabets (like binaries) and up to order 32 for 2-bit alphabets like DNA.
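As a check on these figures, here is a sketch of the arithmetic, with $b$ introduced as the number of bits per symbol (a symbol not used in the original text). The largest order $k_{\max}$ that fits into one 64-bit integer is

\[ k_{\max} = \left\lfloor \frac{64}{b} \right\rfloor, \qquad b = \lceil \log_2 |\Sigma| \rceil \]

which gives $k_{\max} = 8$ for $b = 8$ and $k_{\max} = 32$ for $b = 2$.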

The linadd speedups are implemented for this kernel using sorted lists, though there is room for improvement when a whole set of sequences is ADDed.
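The idea behind the sorted-list linadd speedup can be sketched as follows. The struct `SortedDictionary` is a hypothetical stand-in for the kernel's `dictionary` and `dictionary_weights` members. Its two methods borrow the names `add_to_normal` and `compute_optimized` from this class but take k-mer lists directly instead of feature indices, and they do not reproduce the library's actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for dictionary / dictionary_weights:
// distinct k-mers kept sorted, one accumulated weight per k-mer.
struct SortedDictionary
{
    std::vector<uint64_t> kmers;
    std::vector<double> weights;

    // Merge the sorted k-mers of one sequence into the dictionary,
    // adding weight * count for each distinct k-mer.
    void add_to_normal(const std::vector<uint64_t>& vec, double weight)
    {
        std::vector<uint64_t> new_kmers;
        std::vector<double> new_weights;
        std::size_t d = 0, v = 0;
        while (d < kmers.size() || v < vec.size())
        {
            if (v == vec.size() || (d < kmers.size() && kmers[d] < vec[v]))
            {
                // k-mer only in the dictionary: copy it unchanged.
                new_kmers.push_back(kmers[d]);
                new_weights.push_back(weights[d]);
                ++d;
                continue;
            }
            // k-mer occurs in vec: count its run of duplicates.
            const uint64_t sym = vec[v];
            double count = 0.0;
            while (v < vec.size() && vec[v] == sym) { ++v; count += 1.0; }
            double w = weight * count;
            // Fold in the existing weight if the dictionary has this k-mer.
            if (d < kmers.size() && kmers[d] == sym) { w += weights[d]; ++d; }
            new_kmers.push_back(sym);
            new_weights.push_back(w);
        }
        kmers.swap(new_kmers);
        weights.swap(new_weights);
    }

    // Sum the dictionary weight of every k-mer occurrence in a sorted sequence.
    double compute_optimized(const std::vector<uint64_t>& vec) const
    {
        double result = 0.0;
        std::size_t d = 0, v = 0;
        while (d < kmers.size() && v < vec.size())
        {
            if (kmers[d] < vec[v]) ++d;
            else if (kmers[d] > vec[v]) ++v;
            else { result += weights[d]; ++v; }  // stay on d for duplicates
        }
        return result;
    }
};
```

After all weighted sequences have been added, evaluating a new sequence costs one pass over the dictionary and that sequence, rather than one kernel evaluation per added sequence. In this sketch every call to `add_to_normal` rebuilds the whole dictionary, so adding a whole set of sequences one at a time repeats that work for each sequence.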

Definition at line 43 of file CommUlongStringKernel.h.


Public Member Functions

 CCommUlongStringKernel (int32_t size=10, bool use_sign=false)
 CCommUlongStringKernel (CStringFeatures< uint64_t > *l, CStringFeatures< uint64_t > *r, bool use_sign=false, int32_t size=10)
virtual ~CCommUlongStringKernel ()
virtual bool init (CFeatures *l, CFeatures *r)
virtual void cleanup ()
bool load_init (FILE *src)
bool save_init (FILE *dest)
virtual EKernelType get_kernel_type ()
virtual const char * get_name () const
virtual bool init_optimization (int32_t count, int32_t *IDX, float64_t *weights)
virtual bool delete_optimization ()
virtual float64_t compute_optimized (int32_t idx)
void merge_dictionaries (int32_t &t, int32_t j, int32_t &k, uint64_t *vec, uint64_t *dic, float64_t *dic_weights, float64_t weight, int32_t vec_idx)
virtual void add_to_normal (int32_t idx, float64_t weight)
virtual void clear_normal ()
virtual void remove_lhs ()
virtual void remove_rhs ()
virtual EFeatureType get_feature_type ()
void get_dictionary (int32_t &dsize, uint64_t *&dict, float64_t *&dweights)

Protected Member Functions

float64_t compute (int32_t idx_a, int32_t idx_b)

Protected Attributes

CDynamicArray< uint64_t > dictionary
CDynamicArray< float64_t > dictionary_weights
bool use_sign

Constructor & Destructor Documentation

CCommUlongStringKernel::CCommUlongStringKernel ( int32_t  size = 10,
bool  use_sign = false 
)

constructor

Parameters:
size cache size
use_sign if sign shall be used

Definition at line 17 of file CommUlongStringKernel.cpp.

CCommUlongStringKernel::CCommUlongStringKernel ( CStringFeatures< uint64_t > *  l,
CStringFeatures< uint64_t > *  r,
bool  use_sign = false,
int32_t  size = 10 
)

constructor

Parameters:
l features of left-hand side
r features of right-hand side
use_sign if sign shall be used
size cache size

Definition at line 26 of file CommUlongStringKernel.cpp.

CCommUlongStringKernel::~CCommUlongStringKernel (  )  [virtual]

Definition at line 37 of file CommUlongStringKernel.cpp.


Member Function Documentation

void CCommUlongStringKernel::add_to_normal ( int32_t  idx,
float64_t  weight 
) [virtual]

add to normal

Parameters:
idx where to add
weight what to add

Reimplemented from CKernel.

Definition at line 150 of file CommUlongStringKernel.cpp.

void CCommUlongStringKernel::cleanup (  )  [virtual]

clean up kernel

Reimplemented from CKernel.

Definition at line 71 of file CommUlongStringKernel.cpp.

void CCommUlongStringKernel::clear_normal (  )  [virtual]

clear normal

Reimplemented from CKernel.

Definition at line 216 of file CommUlongStringKernel.cpp.

float64_t CCommUlongStringKernel::compute ( int32_t  idx_a,
int32_t  idx_b 
) [protected, virtual]

compute kernel function for features a and b; idx_{a,b} denote the indices of the feature vectors in the corresponding feature objects

Parameters:
idx_a index a
idx_b index b
Returns:
computed kernel function at indices a,b

Implements CKernel.

Definition at line 88 of file CommUlongStringKernel.cpp.

float64_t CCommUlongStringKernel::compute_optimized ( int32_t  idx  )  [virtual]

compute optimized

Parameters:
idx index to compute
Returns:
optimized value at given index

Reimplemented from CKernel.

Definition at line 260 of file CommUlongStringKernel.cpp.

bool CCommUlongStringKernel::delete_optimization (  )  [virtual]

delete optimization

Returns:
if deleting was successful

Reimplemented from CKernel.

Definition at line 251 of file CommUlongStringKernel.cpp.

void CCommUlongStringKernel::get_dictionary ( int32_t &  dsize,
uint64_t *&  dict,
float64_t *&  dweights 
)

get dictionary

Parameters:
dsize dictionary size will be stored in here
dict dictionary will be stored in here
dweights dictionary weights will be stored in here

Definition at line 192 of file CommUlongStringKernel.h.

virtual EFeatureType CCommUlongStringKernel::get_feature_type (  )  [virtual]

return feature type the kernel can deal with

Returns:
feature type ULONG

Reimplemented from CStringKernel< uint64_t >.

Definition at line 184 of file CommUlongStringKernel.h.

virtual EKernelType CCommUlongStringKernel::get_kernel_type (  )  [virtual]

return what type of kernel we are

Returns:
kernel type COMMULONGSTRING

Implements CKernel.

Definition at line 96 of file CommUlongStringKernel.h.

virtual const char* CCommUlongStringKernel::get_name (  )  const [virtual]

return the kernel's name

Returns:
name CommUlongString

Implements CSGObject.

Definition at line 102 of file CommUlongStringKernel.h.

bool CCommUlongStringKernel::init ( CFeatures *  l,
CFeatures *  r 
) [virtual]

initialize kernel

Parameters:
l features of left-hand side
r features of right-hand side
Returns:
if initializing was successful

Reimplemented from CStringKernel< uint64_t >.

Definition at line 65 of file CommUlongStringKernel.cpp.

bool CCommUlongStringKernel::init_optimization ( int32_t  count,
int32_t *  IDX,
float64_t *  weights 
) [virtual]

initialize optimization

Parameters:
count count
IDX index
weights weights
Returns:
if initializing was successful

Reimplemented from CKernel.

Definition at line 223 of file CommUlongStringKernel.cpp.

bool CCommUlongStringKernel::load_init ( FILE *  src  )  [virtual]

load kernel init_data

Parameters:
src file to load from
Returns:
if loading was successful

Implements CKernel.

Definition at line 78 of file CommUlongStringKernel.cpp.

void CCommUlongStringKernel::merge_dictionaries ( int32_t &  t,
int32_t  j,
int32_t &  k,
uint64_t *  vec,
uint64_t *  dic,
float64_t *  dic_weights,
float64_t  weight,
int32_t  vec_idx 
)

merge dictionaries

Parameters:
t t
j j
k k
vec vector
dic dictionary
dic_weights dictionary weights
weight weight
vec_idx vector index

Definition at line 138 of file CommUlongStringKernel.h.

void CCommUlongStringKernel::remove_lhs (  )  [virtual]

remove lhs from kernel

Reimplemented from CKernel.

Definition at line 42 of file CommUlongStringKernel.cpp.

void CCommUlongStringKernel::remove_rhs (  )  [virtual]

remove rhs from kernel

Reimplemented from CKernel.

Definition at line 55 of file CommUlongStringKernel.cpp.

bool CCommUlongStringKernel::save_init ( FILE *  dest  )  [virtual]

save kernel init_data

Parameters:
dest file to save to
Returns:
if saving was successful

Implements CKernel.

Definition at line 83 of file CommUlongStringKernel.cpp.


Member Data Documentation

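CDynamicArray< uint64_t > CCommUlongStringKernel::dictionary  [protected]

dictionary

Definition at line 213 of file CommUlongStringKernel.h.

CDynamicArray< float64_t > CCommUlongStringKernel::dictionary_weights  [protected]

dictionary weights

Definition at line 215 of file CommUlongStringKernel.h.

bool CCommUlongStringKernel::use_sign  [protected]

if sign shall be used

Definition at line 218 of file CommUlongStringKernel.h.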


The documentation for this class was generated from the following files:

CommUlongStringKernel.h
CommUlongStringKernel.cpp

SHOGUN Machine Learning Toolbox - Documentation