CHMM Class Reference


Detailed Description

Hidden Markov Model.

Structure and function collection. This class implements a Hidden Markov Model. For a tutorial on HMMs, see Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", 1989.

Several functions are supplied for tasks such as training, reading/writing models, reading observations, and calculating derivatives.

Definition at line 361 of file HMM.h.


Public Member Functions

bool alloc_state_dependend_arrays ()
 allocates memory that depends on N
void free_state_dependend_arrays ()
 free memory that depends on N
bool linear_train (bool right_align=false)
 estimates linear model from observations.
bool permutation_entropy (int32_t window_width, int32_t sequence_number)
 compute permutation entropy
virtual const char * get_name () const
Constructor/Destructor and helper function

Train definitions. Encapsulates model parameters that are constant or shall be learned. Consists of structures and access functions for learning only defined transitions and constants.



 CHMM (int32_t N, int32_t M, CModel *model, float64_t PSEUDO)
 CHMM (CStringFeatures< uint16_t > *obs, int32_t N, int32_t M, float64_t PSEUDO)
 CHMM (int32_t N, float64_t *p, float64_t *q, float64_t *a)
 CHMM (int32_t N, float64_t *p, float64_t *q, int32_t num_trans, float64_t *a_trans)
 CHMM (FILE *model_file, float64_t PSEUDO)
 CHMM (CHMM *h)
 Constructor - Clone model h.
virtual ~CHMM ()
 Destructor - Cleanup.
virtual bool train ()
virtual int32_t get_num_model_parameters ()
virtual float64_t get_log_model_parameter (int32_t num_param)
virtual float64_t get_log_derivative (int32_t num_param, int32_t num_example)
virtual float64_t get_log_likelihood_example (int32_t num_example)
bool initialize (CModel *model, float64_t PSEUDO, FILE *model_file=NULL)
probability functions.

forward/backward/viterbi algorithm



float64_t forward_comp (int32_t time, int32_t state, int32_t dimension)
float64_t forward_comp_old (int32_t time, int32_t state, int32_t dimension)
float64_t backward_comp (int32_t time, int32_t state, int32_t dimension)
float64_t backward_comp_old (int32_t time, int32_t state, int32_t dimension)
float64_t best_path (int32_t dimension)
uint16_t get_best_path_state (int32_t dim, int32_t t)
float64_t model_probability_comp ()
float64_t model_probability (int32_t dimension=-1)
 inline proxy for model probability.
float64_t linear_model_probability (int32_t dimension)
convergence criteria



bool set_iterations (int32_t num)
int32_t get_iterations ()
bool set_epsilon (float64_t eps)
float64_t get_epsilon ()
bool baum_welch_viterbi_train (BaumWelchViterbiType type)
model training



void estimate_model_baum_welch (CHMM *train)
void estimate_model_baum_welch_trans (CHMM *train)
void estimate_model_baum_welch_old (CHMM *train)
void estimate_model_baum_welch_defined (CHMM *train)
void estimate_model_viterbi (CHMM *train)
void estimate_model_viterbi_defined (CHMM *train)
output functions.



void output_model (bool verbose=false)
void output_model_defined (bool verbose=false)
 performs output_model only for the defined transitions etc
model helper functions.



void normalize (bool keep_dead_states=false)
 normalize the model to satisfy stochasticity
void add_states (int32_t num_states, float64_t default_val=0)
bool append_model (CHMM *append_model, float64_t *cur_out, float64_t *app_out)
bool append_model (CHMM *append_model)
void chop (float64_t value)
 set any model parameter with probability smaller than value to ZERO
void convert_to_log ()
 convert model to log probabilities
void init_model_random ()
 init model with random values
void init_model_defined ()
void clear_model ()
 initializes model with log(PSEUDO)
void clear_model_defined ()
 initializes only parameters in learn_x with log(PSEUDO)
void copy_model (CHMM *l)
 copies the model parameters from l
void invalidate_model ()
bool get_status () const
float64_t get_pseudo () const
 returns current pseudo value
void set_pseudo (float64_t pseudo)
 sets current pseudo value



void set_observations (CStringFeatures< uint16_t > *obs, CHMM *hmm=NULL)
void set_observation_nocache (CStringFeatures< uint16_t > *obs)
CStringFeatures< uint16_t > * get_observations ()
 return observation pointer
load/save functions.

for observations/model/traindefinitions



bool load_definitions (FILE *file, bool verbose, bool initialize=true)
bool load_model (FILE *file)
bool save_model (FILE *file)
bool save_model_derivatives (FILE *file)
bool save_model_derivatives_bin (FILE *file)
bool save_model_bin (FILE *file)
bool check_model_derivatives ()
 numerically check whether derivatives were calculated right
bool check_model_derivatives_combined ()
T_STATES * get_path (int32_t dim, float64_t &prob)
bool save_path (FILE *file)
bool save_path_derivatives (FILE *file)
bool save_path_derivatives_bin (FILE *file)
bool save_likelihood_bin (FILE *file)
bool save_likelihood (FILE *file)
access functions for model parameters

for all the arrays a,b,p,q,A,B,psi and scalar model parameters like N,M



T_STATES get_N () const
 access function for number of states N
int32_t get_M () const
 access function for number of observations M
void set_q (T_STATES offset, float64_t value)
void set_p (T_STATES offset, float64_t value)
void set_A (T_STATES line_, T_STATES column, float64_t value)
void set_a (T_STATES line_, T_STATES column, float64_t value)
void set_B (T_STATES line_, uint16_t column, float64_t value)
void set_b (T_STATES line_, uint16_t column, float64_t value)
void set_psi (int32_t time, T_STATES state, T_STATES value, int32_t dimension)
float64_t get_q (T_STATES offset) const
float64_t get_p (T_STATES offset) const
float64_t get_A (T_STATES line_, T_STATES column) const
float64_t get_a (T_STATES line_, T_STATES column) const
float64_t get_B (T_STATES line_, uint16_t column) const
float64_t get_b (T_STATES line_, uint16_t column) const
T_STATES get_psi (int32_t time, T_STATES state, int32_t dimension) const
functions for observations

management and access functions for observation matrix



float64_t state_probability (int32_t time, int32_t state, int32_t dimension)
 calculates probability of being in state i at time t for dimension
float64_t transition_probability (int32_t time, int32_t state_i, int32_t state_j, int32_t dimension)
 calculates probability of being in state i at time t and state j at time t+1 for dimension
derivatives of model probabilities.

computes log dp(lambda)/d lambda_i

Parameters:
dimension dimension for which derivatives are calculated
i,j parameter specific


float64_t linear_model_derivative (T_STATES i, uint16_t j, int32_t dimension)
float64_t model_derivative_p (T_STATES i, int32_t dimension)
float64_t model_derivative_q (T_STATES i, int32_t dimension)
float64_t model_derivative_a (T_STATES i, T_STATES j, int32_t dimension)
 computes log dp(lambda)/d a_ij.
float64_t model_derivative_b (T_STATES i, uint16_t j, int32_t dimension)
 computes log dp(lambda)/d b_ij.
derivatives of path probabilities.

computes d log p(lambda,best_path)/d lambda_i

Parameters:
dimension dimension for which derivatives are calculated
i,j parameter specific


float64_t path_derivative_p (T_STATES i, int32_t dimension)
 computes d log p(lambda,best_path)/d p_i
float64_t path_derivative_q (T_STATES i, int32_t dimension)
 computes d log p(lambda,best_path)/d q_i
float64_t path_derivative_a (T_STATES i, T_STATES j, int32_t dimension)
 computes d log p(lambda,best_path)/d a_ij
float64_t path_derivative_b (T_STATES i, uint16_t j, int32_t dimension)
 computes d log p(lambda,best_path)/d b_ij

Protected Member Functions

void prepare_path_derivative (int32_t dim)
 initialization function that is called before path_derivatives are calculated
float64_t forward (int32_t time, int32_t state, int32_t dimension)
 inline proxies for forward pass
float64_t backward (int32_t time, int32_t state, int32_t dimension)
 inline proxies for backward pass
input helper functions.

for reading model/definition/observation files



bool get_numbuffer (FILE *file, char *buffer, int32_t length)
 put a sequence of numbers into the buffer
void open_bracket (FILE *file)
 expect open bracket.
void close_bracket (FILE *file)
 expect closing bracket
bool comma_or_space (FILE *file)
 expect comma or space.
void error (int32_t p_line, const char *str)
 parse error messages

Protected Attributes

float64_t * arrayN1
float64_t * arrayN2
T_ALPHA_BETA alpha_cache
 cache for forward variables; can be very large, O(T*N)
T_ALPHA_BETA beta_cache
 cache for backward variables; can be very large, O(T*N)
T_STATES * states_per_observation_psi
 backtracking table for viterbi; can be very large, O(T*N)
T_STATES * path
 best path (= state sequence) through the model
bool path_prob_updated
 true if path probability is up to date
int32_t path_prob_dimension
 dimension for which path_prob was calculated
model specific variables.

these are p,q,a,b,N,M etc



int32_t M
 number of observation symbols, e.g. ACGT -> 0123
int32_t N
 number of states
float64_t PSEUDO
 define pseudocounts against overfitting
int32_t line
CStringFeatures< uint16_t > * p_observations
 observation matrix
CModel * model
float64_t * transition_matrix_A
 matrix of absolute counts of transitions
float64_t * observation_matrix_B
 matrix of absolute counts of observations within each state
float64_t * transition_matrix_a
 transition matrix
float64_t * initial_state_distribution_p
 initial distribution of states
float64_t * end_state_distribution_q
 distribution of end-states
float64_t * observation_matrix_b
 distribution of observations within each state
int32_t iterations
 convergence criterion iterations
int32_t iteration_count
float64_t epsilon
 convergence criterion epsilon
int32_t conv_it
float64_t all_pat_prob
 probability of best path
float64_t pat_prob
 probability of best path
float64_t mod_prob
 probability of model
bool mod_prob_updated
 true if model probability is up to date
bool all_path_prob_updated
 true if path probability is up to date
int32_t path_deriv_dimension
 dimension for which path_deriv was calculated
bool path_deriv_updated
 true if path derivative is up to date
bool loglikelihood
bool status
bool reused_caches

Static Protected Attributes

static const int32_t GOTN = (1<<1)
static const int32_t GOTM = (1<<2)
static const int32_t GOTO = (1<<3)
static const int32_t GOTa = (1<<4)
static const int32_t GOTb = (1<<5)
static const int32_t GOTp = (1<<6)
static const int32_t GOTq = (1<<7)
static const int32_t GOTlearn_a = (1<<1)
static const int32_t GOTlearn_b = (1<<2)
static const int32_t GOTlearn_p = (1<<3)
static const int32_t GOTlearn_q = (1<<4)
static const int32_t GOTconst_a = (1<<5)
static const int32_t GOTconst_b = (1<<6)
static const int32_t GOTconst_p = (1<<7)
static const int32_t GOTconst_q = (1<<8)
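
The names and values of these constants suggest bit flags that record which sections of a model or definitions file have already been read. That reading is an assumption; the listing itself does not say how the flags are used. A minimal sketch of the pattern, with a hypothetical helper `has_all` and local copies of three of the values:

```cpp
#include <cstdint>

// Local copies of three flag values from the list above.
constexpr int32_t GOTN = (1 << 1);
constexpr int32_t GOTM = (1 << 2);
constexpr int32_t GOTa = (1 << 4);

// Hypothetical helper: true if every bit in `required` is set in `received`.
inline bool has_all(int32_t received, int32_t required) {
    return (received & required) == required;
}
```

A parser could OR a flag into a state word after each section and call `has_all` at the end. Note that the two groups reuse bit positions (for example GOTM and GOTlearn_b are both 1<<2), so they would need separate state words.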

Constructor & Destructor Documentation

CHMM::CHMM ( int32_t  N,
int32_t  M,
CModel * model,
float64_t  PSEUDO 
)

Constructor

Parameters:
N number of states
M number of emissions
model model which holds definitions of states to be learned + consts
PSEUDO Pseudo Value

Definition at line 153 of file HMM.cpp.

CHMM::CHMM ( CStringFeatures< uint16_t > *  obs,
int32_t  N,
int32_t  M,
float64_t  PSEUDO 
)

Definition at line 165 of file HMM.cpp.

CHMM::CHMM ( int32_t  N,
float64_t * p,
float64_t * q,
float64_t * a 
)

Definition at line 180 of file HMM.cpp.

CHMM::CHMM ( int32_t  N,
float64_t * p,
float64_t * q,
int32_t  num_trans,
float64_t * a_trans 
)

Definition at line 232 of file HMM.cpp.

CHMM::CHMM ( FILE *  model_file,
float64_t  PSEUDO 
)

Constructor - Initialization from model file.

Parameters:
model_file Filehandle to a hmm model file (*.mod)
PSEUDO Pseudo Value

Definition at line 344 of file HMM.cpp.

CHMM::CHMM ( CHMM * h  ) 

Constructor - Clone model h.

Definition at line 141 of file HMM.cpp.

CHMM::~CHMM (  )  [virtual]

Destructor - Cleanup.

Definition at line 352 of file HMM.cpp.


Member Function Documentation

void CHMM::add_states ( int32_t  num_states,
float64_t  default_val = 0 
)

increases the number of states by num_states. the new a/b/p/q values are given the value default_val, where 0<=default_val<=1

Definition at line 4992 of file HMM.cpp.

bool CHMM::alloc_state_dependend_arrays (  ) 

allocates memory that depends on N

Definition at line 434 of file HMM.cpp.

bool CHMM::append_model ( CHMM * append_model  ) 

appends the append_model to the current hmm; here no extra states are created. the former q_i are multiplied by q_ji to give the a_ij from the current hmm to the append_model

Definition at line 4792 of file HMM.cpp.

bool CHMM::append_model ( CHMM * append_model,
float64_t * cur_out,
float64_t * app_out 
)

appends the append_model to the current hmm, i.e. two extra states are created. one is the end state of the current hmm with outputs cur_out (of size M) and the other state is the start state of the append_model. transition probability from state 1 to states 1 is 1

Definition at line 4884 of file HMM.cpp.

float64_t CHMM::backward ( int32_t  time,
int32_t  state,
int32_t  dimension 
) [protected]

inline proxies for backward pass

Definition at line 1538 of file HMM.h.

float64_t CHMM::backward_comp ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

backward algorithm. calculates Pr[O_{t+1}, O_{t+2}, ..., O_{T-1} | q_time=S_i, lambda] for 0 <= time <= T-1, and Pr[O|lambda] for time >= T

Parameters:
time t
state i
dimension dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 851 of file HMM.cpp.

float64_t CHMM::backward_comp_old ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

Definition at line 950 of file HMM.cpp.

bool CHMM::baum_welch_viterbi_train ( BaumWelchViterbiType  type  ) 

interface for e.g. GUIHMM to run BaumWelch or Viterbi training

Parameters:
type type of BaumWelch/Viterbi training

Definition at line 5499 of file HMM.cpp.

float64_t CHMM::best_path ( int32_t  dimension  ) 

calculates the probability of the best state sequence s_0,...,s_{T-1} and the path itself using the viterbi algorithm. The path can be found in the array PATH(dimension)[0..T-1] afterwards

Parameters:
dimension dimension of observation for which the most probable path is calculated (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 1082 of file HMM.cpp.
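
A minimal log-domain sketch of the Viterbi computation described above, including a backtracking table in the spirit of psi. The function name `viterbi_log`, the argument layout, and the weighting of the final states by q are assumptions made for illustration; they are not taken from HMM.cpp.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Viterbi in log space. Inputs are log probabilities; obs must be non-empty.
// Returns {log probability of the best path, best state sequence}.
std::pair<double, std::vector<int>> viterbi_log(
    const std::vector<double>& log_p, const std::vector<double>& log_q,
    const Matrix& log_a, const Matrix& log_b, const std::vector<int>& obs)
{
    const std::size_t N = log_p.size(), T = obs.size();
    const double neg_inf = -std::numeric_limits<double>::infinity();
    Matrix delta(T, std::vector<double>(N, neg_inf));
    std::vector<std::vector<int>> psi(T, std::vector<int>(N, 0));  // backtracking table

    for (std::size_t i = 0; i < N; ++i)
        delta[0][i] = log_p[i] + log_b[i][obs[0]];

    for (std::size_t t = 1; t < T; ++t)
        for (std::size_t j = 0; j < N; ++j) {
            for (std::size_t i = 0; i < N; ++i) {
                const double cand = delta[t - 1][i] + log_a[i][j];
                if (cand > delta[t][j]) {
                    delta[t][j] = cand;
                    psi[t][j] = static_cast<int>(i);
                }
            }
            delta[t][j] += log_b[j][obs[t]];
        }

    // Termination: weight final states by the end-state distribution (assumption).
    int best = 0;
    double best_score = neg_inf;
    for (std::size_t i = 0; i < N; ++i) {
        const double score = delta[T - 1][i] + log_q[i];
        if (score > best_score) { best_score = score; best = static_cast<int>(i); }
    }

    // Follow psi backwards to recover the state sequence.
    std::vector<int> path(T);
    path[T - 1] = best;
    for (std::size_t t = T - 1; t > 0; --t)
        path[t - 1] = psi[t][path[t]];
    return {best_score, path};
}
```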

bool CHMM::check_model_derivatives (  ) 

numerically check whether derivatives were calculated right

Definition at line 4549 of file HMM.cpp.
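
A common way to check derivatives numerically is to compare the analytic value with a central finite difference. The sketch below shows that general technique only; whether check_model_derivatives uses this exact scheme, step size, or tolerance is not stated here, and both function names are hypothetical.

```cpp
#include <cmath>
#include <functional>

// Central finite-difference estimate of df/dx at x.
inline double numeric_derivative(const std::function<double(double)>& f,
                                 double x, double h = 1e-6) {
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Hypothetical helper: true if the analytic value agrees with the estimate.
inline bool derivative_matches(const std::function<double(double)>& f, double x,
                               double analytic, double tol = 1e-5) {
    return std::fabs(numeric_derivative(f, x) - analytic) <= tol;
}
```

For a model, `f` would map one parameter value to the (log) model probability with all other parameters held fixed.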

bool CHMM::check_model_derivatives_combined (  ) 

Definition at line 4479 of file HMM.cpp.

void CHMM::chop ( float64_t  value  ) 

set any model parameter with probability smaller than value to ZERO

Definition at line 5052 of file HMM.cpp.
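
As a rough illustration of the thresholding described above, assuming plain probabilities stored in a flat array (the helper name `chop_small` is hypothetical):

```cpp
#include <vector>

// Sets every entry smaller than `threshold` to zero, in place.
// Sketch in plain probabilities; a log-domain model would compare
// against log(threshold) and store log(0) instead (assumption).
inline void chop_small(std::vector<double>& params, double threshold) {
    for (double& v : params)
        if (v < threshold) v = 0.0;
}
```

After chopping, a row generally no longer sums to one, so a renormalization step is usually needed afterwards.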

void CHMM::clear_model (  ) 

initializes model with log(PSEUDO)

Definition at line 2591 of file HMM.cpp.

void CHMM::clear_model_defined (  ) 

initializes only parameters in learn_x with log(PSEUDO)

Definition at line 2607 of file HMM.cpp.

void CHMM::close_bracket ( FILE *  file  )  [protected]

expect closing bracket

Definition at line 2754 of file HMM.cpp.

bool CHMM::comma_or_space ( FILE *  file  )  [protected]

expect comma or space.

Definition at line 2767 of file HMM.cpp.

void CHMM::convert_to_log (  ) 

convert model to log probabilities

Definition at line 2324 of file HMM.cpp.
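
Working with log probabilities avoids underflow on long observation sequences, at the cost of replacing sums by a log-sum-exp. A minimal sketch of both steps follows; the helper names `to_log_domain` and `log_add` are hypothetical and not part of the class.

```cpp
#include <cmath>
#include <vector>

// Replaces each probability by its natural logarithm, in place.
// std::log(0.0) yields -infinity, which then acts as "impossible".
inline void to_log_domain(std::vector<double>& probs) {
    for (double& v : probs) v = std::log(v);
}

// log(exp(x) + exp(y)) computed without leaving the log domain.
inline double log_add(double x, double y) {
    if (std::isinf(x) && x < 0) return y;
    if (std::isinf(y) && y < 0) return x;
    const double hi = x > y ? x : y;
    const double lo = x > y ? y : x;
    return hi + std::log1p(std::exp(lo - hi));
}
```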

void CHMM::copy_model ( CHMM * l  ) 

copies the model parameters from l

Definition at line 2630 of file HMM.cpp.

void CHMM::error ( int32_t  p_line,
const char *  str 
) [protected]

parse error messages

Definition at line 1483 of file HMM.h.

void CHMM::estimate_model_baum_welch ( CHMM * train  ) 

uses baum-welch-algorithm to train a fully connected HMM.

Parameters:
train model from which the new model is estimated

Definition at line 1459 of file HMM.cpp.

void CHMM::estimate_model_baum_welch_defined ( CHMM * train  ) 

uses baum-welch-algorithm to train the defined transitions etc.

Parameters:
train model from which the new model is estimated

Definition at line 1700 of file HMM.cpp.

void CHMM::estimate_model_baum_welch_old ( CHMM * train  ) 

Definition at line 1545 of file HMM.cpp.

void CHMM::estimate_model_baum_welch_trans ( CHMM * train  ) 

Definition at line 1630 of file HMM.cpp.

void CHMM::estimate_model_viterbi ( CHMM * train  ) 

uses viterbi training to train a fully connected HMM

Parameters:
train model from which the new model is estimated

Definition at line 1876 of file HMM.cpp.

void CHMM::estimate_model_viterbi_defined ( CHMM * train  ) 

uses viterbi training to train the defined transitions etc.

Parameters:
train model from which the new model is estimated

Definition at line 2003 of file HMM.cpp.

float64_t CHMM::forward ( int32_t  time,
int32_t  state,
int32_t  dimension 
) [protected]

inline proxies for forward pass

Definition at line 1521 of file HMM.h.

float64_t CHMM::forward_comp ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

forward algorithm. calculates Pr[O_0, O_1, ..., O_t, q_time=S_i | lambda] for 0 <= time <= T-1, and Pr[O|lambda] for time > T

Parameters:
time t
state i
dimension dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 615 of file HMM.cpp.

float64_t CHMM::forward_comp_old ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

Definition at line 719 of file HMM.cpp.

void CHMM::free_state_dependend_arrays (  ) 

free memory that depends on N

Definition at line 491 of file HMM.cpp.

float64_t CHMM::get_a ( T_STATES  line_,
T_STATES  column 
) const

access function for matrix a

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
Returns:
value at position (line_, column)

Definition at line 1108 of file HMM.h.

float64_t CHMM::get_A ( T_STATES  line_,
T_STATES  column 
) const

access function for matrix A

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
Returns:
value at position (line_, column)

Definition at line 1094 of file HMM.h.

float64_t CHMM::get_b ( T_STATES  line_,
uint16_t  column 
) const

access function for matrix b

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
Returns:
value at position (line_, column)

Definition at line 1136 of file HMM.h.

float64_t CHMM::get_B ( T_STATES  line_,
uint16_t  column 
) const

access function for matrix B

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
Returns:
value at position (line_, column)

Definition at line 1122 of file HMM.h.

uint16_t CHMM::get_best_path_state ( int32_t  dim,
int32_t  t 
)

Definition at line 544 of file HMM.h.

float64_t CHMM::get_epsilon (  ) 

Definition at line 607 of file HMM.h.

int32_t CHMM::get_iterations (  ) 

Definition at line 605 of file HMM.h.

float64_t CHMM::get_log_derivative ( int32_t  num_param,
int32_t  num_example 
) [virtual]

get partial derivative of likelihood function (logarithmic)

abstract base method

Parameters:
num_param derivative against which param
num_example which example
Returns:
derivative of likelihood (logarithmic)

Implements CDistribution.

Definition at line 5432 of file HMM.cpp.

virtual float64_t CHMM::get_log_likelihood_example ( int32_t  num_example  )  [virtual]

compute log likelihood for example

abstract base method

Parameters:
num_example which example
Returns:
log likelihood for example

Implements CDistribution.

Definition at line 494 of file HMM.h.

float64_t CHMM::get_log_model_parameter ( int32_t  num_param  )  [virtual]

get model parameter (logarithmic)

abstract base method

Returns:
model parameter (logarithmic)

Implements CDistribution.

Definition at line 5457 of file HMM.cpp.

int32_t CHMM::get_M (  )  const

access function for number of observations M

Definition at line 963 of file HMM.h.

T_STATES CHMM::get_N (  )  const

access function for number of states N

Definition at line 960 of file HMM.h.

virtual const char* CHMM::get_name (  )  const [virtual]
Returns:
object name

Implements CSGObject.

Definition at line 1165 of file HMM.h.

virtual int32_t CHMM::get_num_model_parameters (  )  [virtual]

get number of parameters in model

abstract base method

Returns:
number of parameters in model

Implements CDistribution.

Definition at line 491 of file HMM.h.

bool CHMM::get_numbuffer ( FILE *  file,
char *  buffer,
int32_t  length 
) [protected]

put a sequence of numbers into the buffer

Definition at line 2794 of file HMM.cpp.

CStringFeatures<uint16_t>* CHMM::get_observations (  ) 

return observation pointer

Definition at line 778 of file HMM.h.

float64_t CHMM::get_p ( T_STATES  offset  )  const

access function for probability of initial states

Parameters:
offset index 0...N-1
Returns:
value at offset

Definition at line 1080 of file HMM.h.

T_STATES * CHMM::get_path ( int32_t  dim,
float64_t & prob 
)

get viterbi path and path probability

Parameters:
dim dimension for which to obtain best path
prob likelihood of path
Returns:
viterbi path

Definition at line 4002 of file HMM.cpp.

float64_t CHMM::get_pseudo (  )  const

returns current pseudo value

Definition at line 731 of file HMM.h.

T_STATES CHMM::get_psi ( int32_t  time,
T_STATES  state,
int32_t  dimension 
) const

access function for backtracking table psi

Parameters:
time time 0...T-1
state state 0...N-1
dimension dimension of observations 0...DIMENSION-1
Returns:
state at specified time and position

Definition at line 1152 of file HMM.h.

float64_t CHMM::get_q ( T_STATES  offset  )  const

access function for probability of end states

Parameters:
offset index 0...N-1
Returns:
value at offset

Definition at line 1067 of file HMM.h.

bool CHMM::get_status (  )  const

get status

Returns:
true if everything is ok, else false

Definition at line 725 of file HMM.h.

void CHMM::init_model_defined (  ) 

init model according to const_x, learn_x. first the model is initialized with 0 for all parameters, then the parameters in learn_x are initialized with random values, and finally the const_x parameters are set and the model is normalized.

Definition at line 2437 of file HMM.cpp.

void CHMM::init_model_random (  ) 

init model with random values

Definition at line 2371 of file HMM.cpp.

bool CHMM::initialize ( CModel * model,
float64_t  PSEUDO,
FILE *  model_file = NULL 
)

initialization function - gets called by constructors.

Parameters:
model model which holds definitions of states to be learned + consts
PSEUDO Pseudo Value
model_file Filehandle to a hmm model file (*.mod)

Definition at line 526 of file HMM.cpp.

void CHMM::invalidate_model (  ) 

invalidates all caches. this function has to be called when direct changes to the model have been made, so that the forward/backward/viterbi algorithms do not work with old tables

Definition at line 2646 of file HMM.cpp.
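
The class pairs cached quantities with "up to date" flags (for example mod_prob with mod_prob_updated). The sketch below shows that general pattern in isolation and why a stale cache is a risk when invalidation is skipped. `CachedValue` is a hypothetical stand-in, not a type from the library.

```cpp
#include <functional>
#include <utility>

// Hypothetical miniature of a cached value with an "updated" flag.
class CachedValue {
public:
    explicit CachedValue(std::function<double()> compute)
        : compute_(std::move(compute)) {}

    // Recomputes only when the cache has been invalidated.
    double get() {
        if (!updated_) {
            value_ = compute_();
            updated_ = true;
        }
        return value_;
    }

    // Call after any direct change to the underlying parameters.
    void invalidate() { updated_ = false; }

private:
    std::function<double()> compute_;
    double value_ = 0.0;
    bool updated_ = false;
};
```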

float64_t CHMM::linear_model_derivative ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes log dp(lambda)/d b_ij for linear model

Definition at line 1371 of file HMM.h.

float64_t CHMM::linear_model_probability ( int32_t  dimension  ) 

calculates likelihood for linear model on observations in MEMORY

Parameters:
dimension dimension for which probability is calculated
Returns:
model probability

Definition at line 574 of file HMM.h.

bool CHMM::linear_train ( bool  right_align = false  ) 

estimates linear model from observations.

Definition at line 5080 of file HMM.cpp.

bool CHMM::load_definitions ( FILE *  file,
bool  verbose,
bool  initialize = true 
)

read definitions file (learn_x,const_x) used for training.

format specs: definition_file (train.def)

% HMM-TRAIN - specification
% learn_a - elements in state_transition_matrix to be learned
% learn_b - elements in observation_per_state_matrix to be learned
%           note: each line stands for
%           state, observation(0), observation(1)...observation(NOW)
% learn_p - elements in initial distribution to be learned
% learn_q - elements in the end-state distribution to be learned
%
% const_x - specifies initial values of elements
%           rest is assumed to be 0.0
%
% NOTE: IMPLICIT DEFINES:
%   define A 0
%   define C 1
%   define G 2
%   define T 3

learn_a=[ [int32_t,int32_t];
          [int32_t,int32_t];
          [int32_t,int32_t];
          ........
          [int32_t,int32_t];
          [-1,-1];
        ];

learn_b=[ [int32_t,int32_t,int32_t,...,int32_t];
          [int32_t,int32_t,int32_t,...,int32_t];
          [int32_t,int32_t,int32_t,...,int32_t];
          ........
          [int32_t,int32_t,int32_t,...,int32_t];
          [-1,-1];
        ];

learn_p= [ int32_t, ... , int32_t, -1 ];

learn_q= [ int32_t, ... , int32_t, -1 ];

const_a=[ [int32_t,int32_t,float64_t];
          [int32_t,int32_t,float64_t];
          [int32_t,int32_t,float64_t];
          ........
          [int32_t,int32_t,float64_t];
          [-1,-1,-1];
        ];

const_b=[ [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          ........
          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          [-1,-1,-1];
        ];

const_p[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];
const_q[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];

Parameters:
file filehandle to definitions file
verbose true for verbose messages
initialize true to initialize to underlying HMM

Definition at line 3201 of file HMM.cpp.

bool CHMM::load_model ( FILE *  file  ) 

read model from file.

format specs: model_file (model.hmm)

% HMM - specification
% N - number of states
% M - number of observation_tokens
% a is state_transition_matrix
%   size(a)= [N,N]
%
% b is observation_per_state_matrix
%   size(b)= [N,M]
%
% p is initial distribution
%   size(p)= [1, N]

N=int32_t;
M=int32_t;

p=[float64_t,float64_t...float64_t];
q=[float64_t,float64_t...float64_t];

a=[ [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
  ];

b=[ [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
  ];

Parameters:
file filehandle to model file

Definition at line 2903 of file HMM.cpp.
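
To make the layout above concrete, the following sketch prints a small model in the same bracket-and-semicolon shape. It is only an illustration of the format as described here: `format_model` and its helpers are hypothetical, and whether the real parser accepts exactly this whitespace, or expects plain rather than log probabilities, is not stated in this reference.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Writes one row as [v0,v1,...].
static void write_row(std::ostream& out, const std::vector<double>& row) {
    out << '[';
    for (std::size_t i = 0; i < row.size(); ++i)
        out << (i ? "," : "") << row[i];
    out << ']';
}

// Writes name=[ [row]; [row]; ];
static void write_matrix(std::ostream& out, const char* name, const Matrix& m) {
    out << name << "=[ ";
    for (const auto& row : m) {
        write_row(out, row);
        out << "; ";
    }
    out << "];\n";
}

// Hypothetical formatter following the N, M, p, q, a, b order shown above.
std::string format_model(const std::vector<double>& p, const std::vector<double>& q,
                         const Matrix& a, const Matrix& b) {
    std::ostringstream out;
    out << "N=" << a.size() << ";\n";
    out << "M=" << (b.empty() ? 0 : b[0].size()) << ";\n";
    out << "p="; write_row(out, p); out << ";\n";
    out << "q="; write_row(out, q); out << ";\n";
    write_matrix(out, "a", a);
    write_matrix(out, "b", b);
    return out.str();
}
```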

float64_t CHMM::model_derivative_a ( T_STATES  i,
T_STATES  j,
int32_t  dimension 
)

computes log dp(lambda)/d a_ij.

Definition at line 1402 of file HMM.h.

float64_t CHMM::model_derivative_b ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes log dp(lambda)/d b_ij.

Definition at line 1413 of file HMM.h.

float64_t CHMM::model_derivative_p ( T_STATES  i,
int32_t  dimension 
)

computes log dp(lambda)/d p_i. backward path down to time 0 multiplied by observing first symbol in path at state i

Definition at line 1388 of file HMM.h.

float64_t CHMM::model_derivative_q ( T_STATES  i,
int32_t  dimension 
)

computes log dp(lambda)/d q_i. forward path up to time T-1

Definition at line 1396 of file HMM.h.

float64_t CHMM::model_probability ( int32_t  dimension = -1  ) 

inline proxy for model probability.

Definition at line 555 of file HMM.h.

float64_t CHMM::model_probability_comp (  ) 

calculates probability that observations were generated by the model using forward algorithm.

Definition at line 1211 of file HMM.cpp.

void CHMM::normalize ( bool  keep_dead_states = false  ) 

normalize the model to satisfy stochasticity

Definition at line 4757 of file HMM.cpp.
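
Stochasticity means that each probability distribution in the model sums to one. A minimal sketch of that idea for a single row in plain probabilities follows; `normalize_row` is a hypothetical helper. How the real method groups parameters into distributions, how it treats rows that sum to zero, and what keep_dead_states changes are not described here.

```cpp
#include <vector>

// Scales a non-negative row so that its entries sum to one.
// Rows that sum to zero are left untouched (assumption for this sketch).
inline void normalize_row(std::vector<double>& row) {
    double sum = 0.0;
    for (double v : row) sum += v;
    if (sum <= 0.0) return;
    for (double& v : row) v /= sum;
}
```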

void CHMM::open_bracket ( FILE *  file  )  [protected]

expect open bracket.

Definition at line 2733 of file HMM.cpp.

void CHMM::output_model ( bool  verbose = false  ) 

prints the model parameters on screen.

Parameters:
verbose when false, only the model probability is printed; when true, the whole model is printed additionally

Definition at line 2185 of file HMM.cpp.

void CHMM::output_model_defined ( bool  verbose = false  ) 

performs output_model only for the defined transitions etc

Definition at line 2269 of file HMM.cpp.

float64_t CHMM::path_derivative_a ( T_STATES  i,
T_STATES  j,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d a_ij

Definition at line 1449 of file HMM.h.

float64_t CHMM::path_derivative_b ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d b_ij

Definition at line 1456 of file HMM.h.

float64_t CHMM::path_derivative_p ( T_STATES  i,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d p_i

Definition at line 1435 of file HMM.h.

float64_t CHMM::path_derivative_q ( T_STATES  i,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d q_i

Definition at line 1442 of file HMM.h.

bool CHMM::permutation_entropy ( int32_t  window_width,
int32_t  sequence_number 
)

compute permutation entropy

Definition at line 5376 of file HMM.cpp.

void CHMM::prepare_path_derivative ( int32_t  dim  )  [protected]

initialization function that is called before path_derivatives are calculated

Definition at line 1493 of file HMM.h.

bool CHMM::save_likelihood ( FILE *  file  ) 

save model probability in ascii format

Parameters:
file filehandle

Definition at line 4056 of file HMM.cpp.

bool CHMM::save_likelihood_bin ( FILE *  file  ) 

save model probability in binary format

Parameters:
file filehandle

Definition at line 4039 of file HMM.cpp.

bool CHMM::save_model ( FILE *  file  ) 

save model to file.

Parameters:
file filehandle to model file

Definition at line 3906 of file HMM.cpp.

bool CHMM::save_model_bin ( FILE *  file  ) 

save model in binary format.

Parameters:
file filehandle

Definition at line 4077 of file HMM.cpp.

bool CHMM::save_model_derivatives ( FILE *  file  ) 

save model derivatives to file in ascii format.

Parameters:
file filehandle

Definition at line 4431 of file HMM.cpp.

bool CHMM::save_model_derivatives_bin ( FILE *  file  ) 

save model derivatives to file in binary format.

Parameters:
file filehandle

Definition at line 4310 of file HMM.cpp.

bool CHMM::save_path ( FILE *  file  ) 

save viterbi path in ascii format

Parameters:
file filehandle

Definition at line 4015 of file HMM.cpp.

bool CHMM::save_path_derivatives ( FILE *  file  ) 

save viterbi path derivatives in ascii format

Parameters:
file filehandle

Definition at line 4179 of file HMM.cpp.

bool CHMM::save_path_derivatives_bin ( FILE *  file  ) 

save viterbi path derivatives in binary format

Parameters:
file filehandle

Definition at line 4227 of file HMM.cpp.

void CHMM::set_a ( T_STATES  line_,
T_STATES  column,
float64_t  value 
)

access function for matrix a

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
value value to be set

Definition at line 1010 of file HMM.h.

void CHMM::set_A ( T_STATES  line_,
T_STATES  column,
float64_t  value 
)

access function for matrix A

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
value value to be set

Definition at line 996 of file HMM.h.

void CHMM::set_b ( T_STATES  line_,
uint16_t  column,
float64_t  value 
)

access function for matrix b

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
value value to be set

Definition at line 1038 of file HMM.h.

void CHMM::set_B ( T_STATES  line_,
uint16_t  column,
float64_t  value 
)

access function for matrix B

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
value value to be set

Definition at line 1024 of file HMM.h.

bool CHMM::set_epsilon ( float64_t  eps  ) 

Definition at line 606 of file HMM.h.

bool CHMM::set_iterations ( int32_t  num  ) 

Definition at line 604 of file HMM.h.

void CHMM::set_observation_nocache ( CStringFeatures< uint16_t > *  obs  ) 

set new observations. only sets the observation pointer and drops caches if there were any

Definition at line 5191 of file HMM.cpp.

void CHMM::set_observations ( CStringFeatures< uint16_t > *  obs,
CHMM * hmm = NULL 
)

observation functions: set/get observation matrix.

set new observations. sets the observation pointer and initializes observation-dependent caches. if hmm is given, then the caches of the model hmm are used

Definition at line 5233 of file HMM.cpp.

void CHMM::set_p ( T_STATES  offset,
float64_t  value 
)

access function for probability of first state

Parameters:
offset index 0...N-1
value value to be set

Definition at line 982 of file HMM.h.

void CHMM::set_pseudo ( float64_t  pseudo  ) 

sets current pseudo value

Definition at line 737 of file HMM.h.

void CHMM::set_psi ( int32_t  time,
T_STATES  state,
T_STATES  value,
int32_t  dimension 
)

access function for backtracking table psi

Parameters:
time time 0...T-1
state state 0...N-1
value value to be set
dimension dimension of observations 0...DIMENSION-1

Definition at line 1053 of file HMM.h.
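
As a rough illustration of what a backtracking table is for: in the textbook Viterbi algorithm, psi stores for each time step and state the best predecessor state, and the best path is read off by walking the table backwards from the final state. The sketch below is standalone and rests on assumptions this page does not confirm. The function name backtrack_path and the std::vector layout are hypothetical and do not reflect how CHMM stores psi internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CHMM.
// psi[t][j] holds the best predecessor of state j at time t (row 0 is unused).
// last_state is the state with the highest score at the final time step.
std::vector<int> backtrack_path(const std::vector<std::vector<int>>& psi,
                                int last_state)
{
    assert(!psi.empty());
    const std::size_t T = psi.size();
    std::vector<int> path(T);
    path[T - 1] = last_state;

    // Walk backwards: the predecessor of path[t] is stored in psi[t].
    for (std::size_t t = T - 1; t > 0; --t)
    {
        const int current = path[t];
        assert(current >= 0 &&
               static_cast<std::size_t>(current) < psi[t].size());
        path[t - 1] = psi[t][current];
    }
    return path;
}
```

With T = 3 and N = 2, a table whose rows are {0, 0}, {1, 0}, {0, 1} and a final state of 1 yields the state sequence 0, 1, 1.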

void CHMM::set_q ( T_STATES  offset,
float64_t  value 
)

access function for probability of end states

Parameters:
offset index 0...N-1
value value to be set

Definition at line 969 of file HMM.h.

float64_t CHMM::state_probability ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

calculates probability of being in state i at time t for dimension

Definition at line 1347 of file HMM.h.

virtual bool CHMM::train (  )  [virtual]

train distribution

abstract base method

Returns:
true if training was successful

Implements CDistribution.

Definition at line 490 of file HMM.h.

float64_t CHMM::transition_probability ( int32_t  time,
int32_t  state_i,
int32_t  state_j,
int32_t  dimension 
)

calculates probability of being in state i at time t and state j at time t+1 for dimension

Definition at line 1354 of file HMM.h.
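
For orientation, the two quantities described by state_probability and transition_probability correspond to the state and transition posteriors (often written gamma and xi) in Rabiner's tutorial. A minimal sketch in log space follows. It assumes that the forward variable, backward variable, transition, emission and model probabilities are all available as natural logarithms. The function and parameter names are hypothetical and are not the CHMM implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of CHMM.
// log gamma_t(i) = log alpha_t(i) + log beta_t(i) - log P(O | model)
double log_state_posterior(double log_alpha, double log_beta,
                           double log_model_prob)
{
    assert(std::isfinite(log_model_prob));
    return log_alpha + log_beta - log_model_prob;
}

// Hypothetical helper, not part of CHMM.
// log xi_t(i,j) = log alpha_t(i) + log a_ij + log b_j(O_{t+1})
//               + log beta_{t+1}(j) - log P(O | model)
double log_transition_posterior(double log_alpha_i, double log_a_ij,
                                double log_b_j_next, double log_beta_j_next,
                                double log_model_prob)
{
    assert(std::isfinite(log_model_prob));
    return log_alpha_i + log_a_ij + log_b_j_next + log_beta_j_next
           - log_model_prob;
}
```

Working with sums of logarithms instead of products of probabilities avoids numerical underflow on long observation sequences.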


Member Data Documentation

probability of best path

Definition at line 1216 of file HMM.h.

bool CHMM::all_path_prob_updated [protected]

true if path probability is up to date

Definition at line 1228 of file HMM.h.

T_ALPHA_BETA CHMM::alpha_cache [protected]

cache for forward variables; can be terribly huge, O(T*N)

Definition at line 1289 of file HMM.h.

float64_t* CHMM::arrayN1 [protected]

array of size N for temporary calculations

Definition at line 1253 of file HMM.h.

float64_t* CHMM::arrayN2 [protected]

array of size N for temporary calculations

Definition at line 1255 of file HMM.h.

T_ALPHA_BETA CHMM::beta_cache [protected]

cache for backward variables; can be terribly huge, O(T*N)

Definition at line 1291 of file HMM.h.

int32_t CHMM::conv_it [protected]

Definition at line 1213 of file HMM.h.

distribution of end-states

Definition at line 1202 of file HMM.h.

float64_t CHMM::epsilon [protected]

convergence criterion epsilon

Definition at line 1212 of file HMM.h.

const int32_t CHMM::GOTa = (1<<4) [static, protected]

GOTa

Definition at line 1315 of file HMM.h.

const int32_t CHMM::GOTb = (1<<5) [static, protected]

GOTb

Definition at line 1317 of file HMM.h.

const int32_t CHMM::GOTconst_a = (1<<5) [static, protected]

GOTconst_a

Definition at line 1332 of file HMM.h.

const int32_t CHMM::GOTconst_b = (1<<6) [static, protected]

GOTconst_b

Definition at line 1334 of file HMM.h.

const int32_t CHMM::GOTconst_p = (1<<7) [static, protected]

GOTconst_p

Definition at line 1336 of file HMM.h.

const int32_t CHMM::GOTconst_q = (1<<8) [static, protected]

GOTconst_q

Definition at line 1338 of file HMM.h.

const int32_t CHMM::GOTlearn_a = (1<<1) [static, protected]

GOTlearn_a

Definition at line 1324 of file HMM.h.

const int32_t CHMM::GOTlearn_b = (1<<2) [static, protected]

GOTlearn_b

Definition at line 1326 of file HMM.h.

const int32_t CHMM::GOTlearn_p = (1<<3) [static, protected]

GOTlearn_p

Definition at line 1328 of file HMM.h.

const int32_t CHMM::GOTlearn_q = (1<<4) [static, protected]

GOTlearn_q

Definition at line 1330 of file HMM.h.

const int32_t CHMM::GOTM = (1<<2) [static, protected]

GOTM

Definition at line 1311 of file HMM.h.

const int32_t CHMM::GOTN = (1<<1) [static, protected]

GOTN

Definition at line 1309 of file HMM.h.

const int32_t CHMM::GOTO = (1<<3) [static, protected]

GOTO

Definition at line 1313 of file HMM.h.

const int32_t CHMM::GOTp = (1<<6) [static, protected]

GOTp

Definition at line 1319 of file HMM.h.

const int32_t CHMM::GOTq = (1<<7) [static, protected]

GOTq

Definition at line 1321 of file HMM.h.
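
The GOT* constants are single-bit masks, so several of them can be combined in one integer and tested together. Note that the values of the GOTN...GOTq group overlap with those of the GOTlearn_*/GOTconst_* group (for example GOTb and GOTconst_a are both 1<<5), which suggests, as an assumption, that the two groups are tracked in separate status words. The sketch below copies four values from the listing and shows the usual set and test idiom. The namespace flag_sketch and the helpers mark and has_all are hypothetical and do not appear in HMM.h.

```cpp
#include <cassert>
#include <cstdint>

namespace flag_sketch
{
// Values copied from the member documentation; redeclared here only so
// that the sketch compiles without HMM.h.
const std::int32_t GOTN = (1 << 1);
const std::int32_t GOTM = (1 << 2);
const std::int32_t GOTO = (1 << 3);
const std::int32_t GOTa = (1 << 4);

// Hypothetical helper: record that one item has been seen.
inline std::int32_t mark(std::int32_t received, std::int32_t flag)
{
    assert(flag != 0);
    return received | flag;
}

// Hypothetical helper: true only if every bit in 'required' is set.
inline bool has_all(std::int32_t received, std::int32_t required)
{
    return (received & required) == required;
}
} // namespace flag_sketch
```

After marking GOTN and GOTM, has_all reports true for GOTN | GOTM and false for any mask that also includes GOTa.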

initial distribution of states

Definition at line 1199 of file HMM.h.

int32_t CHMM::iteration_count [protected]

Definition at line 1209 of file HMM.h.

int32_t CHMM::iterations [protected]

convergence criterion iterations

Definition at line 1208 of file HMM.h.
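
The members epsilon and iterations are both described as convergence criteria. One plausible reading, sketched below, is that training stops either when the iteration cap is reached or when the log-likelihood changes by less than epsilon between two passes. This is an assumption: the exact rule used by CHMM (including the role of conv_it) is not documented on this page, and the function should_stop is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stopping rule, not taken from HMM.cpp.
// Stops when the iteration cap is hit or the likelihood change is tiny.
bool should_stop(double previous_loglik, double current_loglik,
                 int iteration_count, int max_iterations, double epsilon)
{
    assert(epsilon >= 0.0 && max_iterations >= 0);
    if (iteration_count >= max_iterations)
        return true;
    return std::fabs(current_loglik - previous_loglik) < epsilon;
}
```

A caller would evaluate this after each training pass, passing the log-likelihoods of the previous and current model.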

int32_t CHMM::line [protected]

Definition at line 1181 of file HMM.h.

bool CHMM::loglikelihood [protected]

Definition at line 1237 of file HMM.h.

int32_t CHMM::M [protected]

number of observation symbols, e.g. ACGT -> 0123

Definition at line 1172 of file HMM.h.

probability of model

Definition at line 1222 of file HMM.h.

bool CHMM::mod_prob_updated [protected]

true if model probability is up to date

Definition at line 1225 of file HMM.h.

CModel* CHMM::model [protected]

Definition at line 1187 of file HMM.h.

int32_t CHMM::N [protected]

number of states

Definition at line 1175 of file HMM.h.

distribution of observations within each state

Definition at line 1205 of file HMM.h.

matrix of absolute counts of observations within each state

Definition at line 1193 of file HMM.h.

CStringFeatures<uint16_t>* CHMM::p_observations [protected]

observation matrix

Definition at line 1184 of file HMM.h.

probability of best path

Definition at line 1219 of file HMM.h.

T_STATES* CHMM::path [protected]

best path (=state sequence) through model

Definition at line 1297 of file HMM.h.

int32_t CHMM::path_deriv_dimension [protected]

dimension for which path_deriv was calculated

Definition at line 1231 of file HMM.h.

bool CHMM::path_deriv_updated [protected]

true if path derivative is up to date

Definition at line 1234 of file HMM.h.

int32_t CHMM::path_prob_dimension [protected]

dimension for which path_prob was calculated

Definition at line 1303 of file HMM.h.

bool CHMM::path_prob_updated [protected]

true if path probability is up to date

Definition at line 1300 of file HMM.h.

float64_t CHMM::PSEUDO [protected]

define pseudocounts against overfitting

Definition at line 1178 of file HMM.h.
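
To make the role of a pseudocount concrete: a common way to turn a row of absolute counts into a probability distribution is to add a small constant to every count before normalizing, so that unseen transitions or symbols do not receive probability zero. The sketch below shows this additive smoothing. Whether CHMM applies PSEUDO in exactly this way is not stated on this page, and the function smooth_row is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of CHMM.
// Adds 'pseudo' to every count in one row, then normalizes the row to sum to 1.
std::vector<double> smooth_row(const std::vector<double>& counts, double pseudo)
{
    assert(!counts.empty() && pseudo >= 0.0);

    double total = 0.0;
    for (double c : counts)
        total += c + pseudo;
    assert(total > 0.0);

    std::vector<double> probs;
    probs.reserve(counts.size());
    for (double c : counts)
        probs.push_back((c + pseudo) / total);
    return probs;
}
```

For the counts 3, 0, 1, 0 and a pseudocount of 1, the smoothed row is 0.5, 0.125, 0.25, 0.125, so the two unseen entries keep a nonzero probability.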

bool CHMM::reused_caches [protected]

Definition at line 1243 of file HMM.h.

backtracking table for viterbi; can be terribly huge, O(T*N)

Definition at line 1294 of file HMM.h.

bool CHMM::status [protected]

Definition at line 1240 of file HMM.h.

transition matrix

Definition at line 1196 of file HMM.h.

matrix of absolute counts of transitions

Definition at line 1190 of file HMM.h.


The documentation for this class was generated from the following files:

SHOGUN Machine Learning Toolbox - Documentation