
CHMM Class Reference


Detailed Description

Hidden Markov Model.

Structure and function collection. This class implements a Hidden Markov Model. For a tutorial on HMMs see Rabiner, A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition, 1989.

Several functions are supplied for tasks such as training, reading/writing models, reading observations, and calculation of derivatives.

Definition at line 365 of file HMM.h.

Inheritance diagram for CHMM

List of all members.

Public Member Functions

bool alloc_state_dependend_arrays ()
 allocates memory that depends on N
void free_state_dependend_arrays ()
 free memory that depends on N
bool linear_train (bool right_align=false)
 estimates linear model from observations.
bool permutation_entropy (int32_t window_width, int32_t sequence_number)
 compute permutation entropy
virtual const char * get_name () const
Constructor/Destructor and helper function

Train definitions. Encapsulates model parameters that are constant or shall be learned. Consists of structures and access functions for learning only defined transitions and constants.

 CHMM (int32_t N, int32_t M, CModel *model, float64_t PSEUDO)
 CHMM (CStringFeatures< uint16_t > *obs, int32_t N, int32_t M, float64_t PSEUDO)
 CHMM (int32_t N, float64_t *p, float64_t *q, float64_t *a)
 CHMM (int32_t N, float64_t *p, float64_t *q, int32_t num_trans, float64_t *a_trans)
 CHMM (FILE *model_file, float64_t PSEUDO)
 CHMM (CHMM *h)
 Constructor - Clone model h.
virtual ~CHMM ()
 Destructor - Cleanup.
virtual bool train (CFeatures *data=NULL)
virtual int32_t get_num_model_parameters ()
virtual float64_t get_log_model_parameter (int32_t num_param)
virtual float64_t get_log_derivative (int32_t num_param, int32_t num_example)
virtual float64_t get_log_likelihood_example (int32_t num_example)
bool initialize (CModel *model, float64_t PSEUDO, FILE *model_file=NULL)
probability functions.

forward/backward/viterbi algorithm

float64_t forward_comp (int32_t time, int32_t state, int32_t dimension)
float64_t forward_comp_old (int32_t time, int32_t state, int32_t dimension)
float64_t backward_comp (int32_t time, int32_t state, int32_t dimension)
float64_t backward_comp_old (int32_t time, int32_t state, int32_t dimension)
float64_t best_path (int32_t dimension)
uint16_t get_best_path_state (int32_t dim, int32_t t)
float64_t model_probability_comp ()
float64_t model_probability (int32_t dimension=-1)
 inline proxy for model probability.
float64_t linear_model_probability (int32_t dimension)
convergence criteria

bool set_iterations (int32_t num)
int32_t get_iterations ()
bool set_epsilon (float64_t eps)
float64_t get_epsilon ()
bool baum_welch_viterbi_train (BaumWelchViterbiType type)
model training

void estimate_model_baum_welch (CHMM *train)
void estimate_model_baum_welch_trans (CHMM *train)
void estimate_model_baum_welch_old (CHMM *train)
void estimate_model_baum_welch_defined (CHMM *train)
void estimate_model_viterbi (CHMM *train)
void estimate_model_viterbi_defined (CHMM *train)
output functions.

void output_model (bool verbose=false)
void output_model_defined (bool verbose=false)
 performs output_model only for the defined transitions etc
model helper functions.

void normalize (bool keep_dead_states=false)
 normalize the model to satisfy stochasticity
void add_states (int32_t num_states, float64_t default_val=0)
bool append_model (CHMM *append_model, float64_t *cur_out, float64_t *app_out)
bool append_model (CHMM *append_model)
void chop (float64_t value)
 set any model parameter with probability smaller than value to ZERO
void convert_to_log ()
 convert model to log probabilities
void init_model_random ()
 init model with random values
void init_model_defined ()
void clear_model ()
 initializes model with log(PSEUDO)
void clear_model_defined ()
 initializes only parameters in learn_x with log(PSEUDO)
void copy_model (CHMM *l)
 copies the model parameters from l
void invalidate_model ()
bool get_status () const
float64_t get_pseudo () const
 returns current pseudo value
void set_pseudo (float64_t pseudo)
 sets current pseudo value

void set_observations (CStringFeatures< uint16_t > *obs, CHMM *hmm=NULL)
void set_observation_nocache (CStringFeatures< uint16_t > *obs)
CStringFeatures< uint16_t > * get_observations ()
 return observation pointer
load/save functions.

for observations/model/traindefinitions

bool load_definitions (FILE *file, bool verbose, bool initialize=true)
bool load_model (FILE *file)
bool save_model (FILE *file)
bool save_model_derivatives (FILE *file)
bool save_model_derivatives_bin (FILE *file)
bool save_model_bin (FILE *file)
bool check_model_derivatives ()
 numerically check whether derivatives were calculated correctly
bool check_model_derivatives_combined ()
T_STATES * get_path (int32_t dim, float64_t &prob)
bool save_path (FILE *file)
bool save_path_derivatives (FILE *file)
bool save_path_derivatives_bin (FILE *file)
bool save_likelihood_bin (FILE *file)
bool save_likelihood (FILE *file)
access functions for model parameters

for all the arrays a,b,p,q,A,B,psi and scalar model parameters like N,M

T_STATES get_N () const
 access function for number of states N
int32_t get_M () const
 access function for number of observations M
void set_q (T_STATES offset, float64_t value)
void set_p (T_STATES offset, float64_t value)
void set_A (T_STATES line_, T_STATES column, float64_t value)
void set_a (T_STATES line_, T_STATES column, float64_t value)
void set_B (T_STATES line_, uint16_t column, float64_t value)
void set_b (T_STATES line_, uint16_t column, float64_t value)
void set_psi (int32_t time, T_STATES state, T_STATES value, int32_t dimension)
float64_t get_q (T_STATES offset) const
float64_t get_p (T_STATES offset) const
float64_t get_A (T_STATES line_, T_STATES column) const
float64_t get_a (T_STATES line_, T_STATES column) const
float64_t get_B (T_STATES line_, uint16_t column) const
float64_t get_b (T_STATES line_, uint16_t column) const
T_STATES get_psi (int32_t time, T_STATES state, int32_t dimension) const
functions for observations

management and access functions for observation matrix

float64_t state_probability (int32_t time, int32_t state, int32_t dimension)
 calculates probability of being in state i at time t for dimension
float64_t transition_probability (int32_t time, int32_t state_i, int32_t state_j, int32_t dimension)
 calculates probability of being in state i at time t and state j at time t+1 for dimension
derivatives of model probabilities.

computes log dp(lambda)/d lambda_i

Parameters:
dimension dimension for which derivatives are calculated
i,j parameter specific
float64_t linear_model_derivative (T_STATES i, uint16_t j, int32_t dimension)
float64_t model_derivative_p (T_STATES i, int32_t dimension)
float64_t model_derivative_q (T_STATES i, int32_t dimension)
float64_t model_derivative_a (T_STATES i, T_STATES j, int32_t dimension)
 computes log dp(lambda)/d a_ij.
float64_t model_derivative_b (T_STATES i, uint16_t j, int32_t dimension)
 computes log dp(lambda)/d b_ij.
derivatives of path probabilities.

computes d log p(lambda,best_path)/d lambda_i

Parameters:
dimension dimension for which derivatives are calculated
i,j parameter specific
float64_t path_derivative_p (T_STATES i, int32_t dimension)
 computes d log p(lambda,best_path)/d p_i
float64_t path_derivative_q (T_STATES i, int32_t dimension)
 computes d log p(lambda,best_path)/d q_i
float64_t path_derivative_a (T_STATES i, T_STATES j, int32_t dimension)
 computes d log p(lambda,best_path)/d a_ij
float64_t path_derivative_b (T_STATES i, uint16_t j, int32_t dimension)
 computes d log p(lambda,best_path)/d b_ij

Protected Member Functions

void prepare_path_derivative (int32_t dim)
 initialization function that is called before path_derivatives are calculated
float64_t forward (int32_t time, int32_t state, int32_t dimension)
 inline proxies for forward pass
float64_t backward (int32_t time, int32_t state, int32_t dimension)
 inline proxies for backward pass
input helper functions.

for reading model/definition/observation files

bool get_numbuffer (FILE *file, char *buffer, int32_t length)
 put a sequence of numbers into the buffer
void open_bracket (FILE *file)
 expect open bracket.
void close_bracket (FILE *file)
 expect closing bracket
bool comma_or_space (FILE *file)
 expect comma or space.
void error (int32_t p_line, const char *str)
 parse error messages

Protected Attributes

float64_t * arrayN1
float64_t * arrayN2
T_ALPHA_BETA alpha_cache
 cache for forward variables; can be terribly HUGE, O(T*N)
T_ALPHA_BETA beta_cache
 cache for backward variables; can be terribly HUGE, O(T*N)
T_STATES * states_per_observation_psi
 backtracking table for viterbi; can be terribly HUGE, O(T*N)
T_STATES * path
 best path (=state sequence) through model
bool path_prob_updated
 true if path probability is up to date
int32_t path_prob_dimension
 dimension for which path_prob was calculated
model specific variables.

these are p,q,a,b,N,M etc

int32_t M
 number of observation symbols, e.g. ACGT -> 0123
int32_t N
 number of states
float64_t PSEUDO
 define pseudocounts against overfitting
int32_t line
CStringFeatures< uint16_t > * p_observations
 observation matrix
CModel * model
float64_t * transition_matrix_A
 matrix of absolute counts of transitions
float64_t * observation_matrix_B
 matrix of absolute counts of observations within each state
float64_t * transition_matrix_a
 transition matrix
float64_t * initial_state_distribution_p
 initial distribution of states
float64_t * end_state_distribution_q
 distribution of end-states
float64_t * observation_matrix_b
 distribution of observations within each state
int32_t iterations
 convergence criterion iterations
int32_t iteration_count
float64_t epsilon
 convergence criterion epsilon
int32_t conv_it
float64_t all_pat_prob
 probability of best path
float64_t pat_prob
 probability of best path
float64_t mod_prob
 probability of model
bool mod_prob_updated
 true if model probability is up to date
bool all_path_prob_updated
 true if path probability is up to date
int32_t path_deriv_dimension
 dimension for which path_deriv was calculated
bool path_deriv_updated
 true if path derivative is up to date
bool loglikelihood
bool status
bool reused_caches

Static Protected Attributes

static const int32_t GOTN = (1<<1)
static const int32_t GOTM = (1<<2)
static const int32_t GOTO = (1<<3)
static const int32_t GOTa = (1<<4)
static const int32_t GOTb = (1<<5)
static const int32_t GOTp = (1<<6)
static const int32_t GOTq = (1<<7)
static const int32_t GOTlearn_a = (1<<1)
static const int32_t GOTlearn_b = (1<<2)
static const int32_t GOTlearn_p = (1<<3)
static const int32_t GOTlearn_q = (1<<4)
static const int32_t GOTconst_a = (1<<5)
static const int32_t GOTconst_b = (1<<6)
static const int32_t GOTconst_p = (1<<7)
static const int32_t GOTconst_q = (1<<8)
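
The GOT* constants are single-bit values, so they can be combined in one integer with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that pattern, assuming the flags record which sections of a model file have been seen; the helper model_sections_complete and its choice of required sections are hypothetical and not part of CHMM. Note that the two families overlap in value (GOTlearn_a and GOTN are both 1<<1), so under this reading each family would need its own mask variable.

```cpp
#include <cstdint>

using std::int32_t;

// Values copied from the model-related flags listed above.
const int32_t GOTN = (1 << 1);
const int32_t GOTM = (1 << 2);
const int32_t GOTO = (1 << 3);
const int32_t GOTa = (1 << 4);
const int32_t GOTb = (1 << 5);
const int32_t GOTp = (1 << 6);
const int32_t GOTq = (1 << 7);

// Hypothetical helper: true once every section in `required` has been seen.
// Which sections are actually mandatory is an assumption of this sketch.
bool model_sections_complete(int32_t received) {
    const int32_t required = GOTN | GOTM | GOTa | GOTb | GOTp | GOTq;
    return (received & required) == required;
}
```

A parser following this pattern would OR the matching flag into its mask after reading each section, for example `received |= GOTN;`.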

Constructor & Destructor Documentation

CHMM ( int32_t  N,
int32_t  M,
CModel * model,
float64_t  PSEUDO 
)

Constructor

Parameters:
N number of states
M number of emissions
model model which holds definitions of states to be learned + consts
PSEUDO Pseudo Value

Definition at line 155 of file HMM.cpp.

CHMM ( CStringFeatures< uint16_t > *  obs,
int32_t  N,
int32_t  M,
float64_t  PSEUDO 
)

Definition at line 167 of file HMM.cpp.

CHMM ( int32_t  N,
float64_t * p,
float64_t * q,
float64_t * a 
)

Definition at line 182 of file HMM.cpp.

CHMM ( int32_t  N,
float64_t * p,
float64_t * q,
int32_t  num_trans,
float64_t * a_trans 
)

Definition at line 234 of file HMM.cpp.

CHMM ( FILE *  model_file,
float64_t  PSEUDO 
)

Constructor - Initialization from model file.

Parameters:
model_file Filehandle to an HMM model file (*.mod)
PSEUDO Pseudo Value

Definition at line 346 of file HMM.cpp.

CHMM ( CHMM * h  ) 

Constructor - Clone model h.

Definition at line 143 of file HMM.cpp.

~CHMM (  )  [virtual]

Destructor - Cleanup.

Definition at line 354 of file HMM.cpp.


Member Function Documentation

void add_states ( int32_t  num_states,
float64_t  default_val = 0 
)

increases the number of states by num_states. The new a/b/p/q values are given the value default_val, where 0 <= default_val <= 1.

Definition at line 5007 of file HMM.cpp.

bool alloc_state_dependend_arrays (  ) 

allocates memory that depends on N

Definition at line 450 of file HMM.cpp.

bool append_model ( CHMM * append_model,
float64_t * cur_out,
float64_t * app_out 
)

appends the append_model to the current HMM, i.e. two extra states are created. One is the end state of the current HMM with outputs cur_out (of size M) and the other is the start state of the append_model. transition probability from state 1 to states 1 is 1

Definition at line 4899 of file HMM.cpp.

bool append_model ( CHMM * append_model  ) 

appends the append_model to the current HMM; here no extra states are created. The former q_i are multiplied by q_ji to give the a_ij from the current HMM to the append_model.

Definition at line 4807 of file HMM.cpp.

float64_t backward ( int32_t  time,
int32_t  state,
int32_t  dimension 
) [protected]

inline proxies for backward pass

Definition at line 1552 of file HMM.h.

float64_t backward_comp ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

backward algorithm. Calculates Pr[O_t+1, O_t+2, ..., O_T-1 | q_time=S_i, lambda] for 0 <= time <= T-1, and Pr[O | lambda] for time >= T.

Parameters:
time t
state i
dimension dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 867 of file HMM.cpp.

float64_t backward_comp_old ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

Definition at line 966 of file HMM.cpp.

bool baum_welch_viterbi_train ( BaumWelchViterbiType  type  ) 

interface for e.g. GUIHMM to run BaumWelch or Viterbi training

Parameters:
type type of BaumWelch/Viterbi training

Definition at line 5522 of file HMM.cpp.
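
How the iteration limit and epsilon interact during training is not spelled out on this page. A common pattern, shown here only as a hedged sketch, is to repeat re-estimation until either the iteration budget is used up or the log-likelihood changes by less than epsilon. The function train_until_converged and the step callback are hypothetical names; the actual stopping rule of CHMM may differ.

```cpp
#include <cmath>

// step() is assumed to perform one re-estimation pass (Baum-Welch or Viterbi)
// and to return the log-likelihood of the updated model.
// Returns the number of passes that were executed.
template <typename Step>
int train_until_converged(Step step, int max_iterations, double epsilon) {
    if (max_iterations <= 0) return 0;
    double previous = step();
    int done = 1;
    while (done < max_iterations) {
        const double current = step();
        ++done;
        // Stop as soon as the improvement falls below the threshold.
        if (std::fabs(current - previous) < epsilon) break;
        previous = current;
    }
    return done;
}
```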

float64_t best_path ( int32_t  dimension  ) 

calculates the probability of the best state sequence s_0,...,s_T-1 AND the path itself using the Viterbi algorithm. The path can be found in the array PATH(dimension)[0..T-1] afterwards.

Parameters:
dimension dimension of observation for which the most probable path is calculated (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 1098 of file HMM.cpp.

bool check_model_derivatives (  ) 

numerically check whether derivatives were calculated correctly

Definition at line 4564 of file HMM.cpp.
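
A numerical derivative check usually compares an analytically computed derivative with a finite-difference estimate. The sketch below shows that idea with a central difference on a scalar function; derivative_matches, the step size h, and the tolerance are illustrative assumptions and do not describe what check_model_derivatives does internally.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

// Compares an analytic derivative of f at x with the central difference
// (f(x+h) - f(x-h)) / (2h), using a tolerance relative to the analytic value.
bool derivative_matches(const std::function<double(double)>& f,
                        double analytic, double x,
                        double h = 1e-6, double tol = 1e-4) {
    const double numeric = (f(x + h) - f(x - h)) / (2.0 * h);
    const double scale = std::max(1.0, std::fabs(analytic));
    return std::fabs(numeric - analytic) <= tol * scale;
}
```

For example, with f(x) = log(x) the analytic derivative at x = 0.5 is 2, which the check accepts, while a wrong value such as 1 is rejected.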

bool check_model_derivatives_combined (  ) 

Definition at line 4494 of file HMM.cpp.

void chop ( float64_t  value  ) 

set any model parameter with probability smaller than value to ZERO

Definition at line 5067 of file HMM.cpp.

void clear_model (  ) 

initializes model with log(PSEUDO)

Definition at line 2606 of file HMM.cpp.

void clear_model_defined (  ) 

initializes only parameters in learn_x with log(PSEUDO)

Definition at line 2622 of file HMM.cpp.

void close_bracket ( FILE *  file  )  [protected]

expect closing bracket

Definition at line 2769 of file HMM.cpp.

bool comma_or_space ( FILE *  file  )  [protected]

expect comma or space.

Definition at line 2782 of file HMM.cpp.

void convert_to_log (  ) 

convert model to log probabilities

Definition at line 2339 of file HMM.cpp.
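
Storing log probabilities turns products into sums and avoids underflow on long sequences, but adding two probabilities then requires a log-sum-exp step. The following sketch shows both operations on plain vectors; to_log, log_add, and the convention that probability 0 maps to negative infinity are assumptions of the sketch rather than documented CHMM behaviour.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

const double kLogZero = -std::numeric_limits<double>::infinity();

// Replaces each probability by its logarithm; 0 becomes -infinity.
void to_log(std::vector<double>& params) {
    for (double& x : params) x = (x > 0.0) ? std::log(x) : kLogZero;
}

// Returns log(exp(x) + exp(y)) without leaving log space.
double log_add(double x, double y) {
    if (x == kLogZero) return y;
    if (y == kLogZero) return x;
    const double hi = std::max(x, y);
    const double lo = std::min(x, y);
    return hi + std::log1p(std::exp(lo - hi));
}
```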

void copy_model ( CHMM * l  ) 

copies the model parameters from l

Definition at line 2645 of file HMM.cpp.

void error ( int32_t  p_line,
const char *  str 
) [protected]

parse error messages

Definition at line 1497 of file HMM.h.

void estimate_model_baum_welch ( CHMM * train  ) 

uses the Baum-Welch algorithm to train a fully connected HMM.

Parameters:
train model from which the new model is estimated

Definition at line 1474 of file HMM.cpp.

void estimate_model_baum_welch_defined ( CHMM * train  ) 

uses the Baum-Welch algorithm to train the defined transitions etc.

Parameters:
train model from which the new model is estimated

Definition at line 1715 of file HMM.cpp.

void estimate_model_baum_welch_old ( CHMM * train  ) 

Definition at line 1560 of file HMM.cpp.

void estimate_model_baum_welch_trans ( CHMM * train  ) 

Definition at line 1645 of file HMM.cpp.

void estimate_model_viterbi ( CHMM * train  ) 

uses Viterbi training to train a fully connected HMM

Parameters:
train model from which the new model is estimated

Definition at line 1891 of file HMM.cpp.

void estimate_model_viterbi_defined ( CHMM * train  ) 

uses Viterbi training to train the defined transitions etc.

Parameters:
train model from which the new model is estimated

Definition at line 2018 of file HMM.cpp.

float64_t forward ( int32_t  time,
int32_t  state,
int32_t  dimension 
) [protected]

inline proxies for forward pass

Definition at line 1535 of file HMM.h.

float64_t forward_comp ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

forward algorithm. Calculates Pr[O_0, O_1, ..., O_t, q_time=S_i | lambda] for 0 <= time <= T-1, and Pr[O | lambda] for time > T.

Parameters:
time t
state i
dimension dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 631 of file HMM.cpp.

float64_t forward_comp_old ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

Definition at line 735 of file HMM.cpp.

void free_state_dependend_arrays (  ) 

free memory that depends on N

Definition at line 507 of file HMM.cpp.

float64_t get_A ( T_STATES  line_,
T_STATES  column 
) const

access function for matrix A

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
Returns:
value at position (line_, column)

Definition at line 1108 of file HMM.h.

float64_t get_a ( T_STATES  line_,
T_STATES  column 
) const

access function for matrix a

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
Returns:
value at position (line_, column)

Definition at line 1122 of file HMM.h.

float64_t get_B ( T_STATES  line_,
uint16_t  column 
) const

access function for matrix B

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
Returns:
value at position (line_, column)

Definition at line 1136 of file HMM.h.

float64_t get_b ( T_STATES  line_,
uint16_t  column 
) const

access function for matrix b

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
Returns:
value at position (line_, column)

Definition at line 1150 of file HMM.h.

uint16_t get_best_path_state ( int32_t  dim,
int32_t  t 
)

Definition at line 556 of file HMM.h.

float64_t get_epsilon (  ) 

Definition at line 621 of file HMM.h.

int32_t get_iterations (  ) 

Definition at line 619 of file HMM.h.

float64_t get_log_derivative ( int32_t  num_param,
int32_t  num_example 
) [virtual]

get partial derivative of likelihood function (logarithmic)

abstract base method

Parameters:
num_param index of the parameter with respect to which the derivative is taken
num_example which example
Returns:
derivative of likelihood (logarithmic)

Implements CDistribution.

Definition at line 5455 of file HMM.cpp.

virtual float64_t get_log_likelihood_example ( int32_t  num_example  )  [virtual]

compute log likelihood for example

abstract base method

Parameters:
num_example which example
Returns:
log likelihood for example

Implements CDistribution.

Definition at line 506 of file HMM.h.

float64_t get_log_model_parameter ( int32_t  num_param  )  [virtual]

get model parameter (logarithmic)

abstract base method

Returns:
model parameter (logarithmic)

Implements CDistribution.

Definition at line 5480 of file HMM.cpp.

int32_t get_M (  )  const

access function for number of observations M

Definition at line 977 of file HMM.h.

T_STATES get_N (  )  const

access function for number of states N

Definition at line 974 of file HMM.h.

virtual const char* get_name (  )  const [virtual]
Returns:
object name

Implements CSGObject.

Definition at line 1179 of file HMM.h.

virtual int32_t get_num_model_parameters (  )  [virtual]

get number of parameters in model

abstract base method

Returns:
number of parameters in model

Implements CDistribution.

Definition at line 503 of file HMM.h.

bool get_numbuffer ( FILE *  file,
char *  buffer,
int32_t  length 
) [protected]

put a sequence of numbers into the buffer

Definition at line 2809 of file HMM.cpp.

CStringFeatures<uint16_t>* get_observations (  ) 

return observation pointer

Definition at line 792 of file HMM.h.

float64_t get_p ( T_STATES  offset  )  const

access function for probability of initial states

Parameters:
offset index 0...N-1
Returns:
value at offset

Definition at line 1094 of file HMM.h.

T_STATES * get_path ( int32_t  dim,
float64_t & prob 
)

get viterbi path and path probability

Parameters:
dim dimension for which to obtain best path
prob likelihood of path
Returns:
viterbi path

Definition at line 4017 of file HMM.cpp.

float64_t get_pseudo (  )  const

returns current pseudo value

Definition at line 745 of file HMM.h.

T_STATES get_psi ( int32_t  time,
T_STATES  state,
int32_t  dimension 
) const

access function for backtracking table psi

Parameters:
time time 0...T-1
state state 0...N-1
dimension dimension of observations 0...DIMENSION-1
Returns:
state at specified time and position

Definition at line 1166 of file HMM.h.

float64_t get_q ( T_STATES  offset  )  const

access function for probability of end states

Parameters:
offset index 0...N-1
Returns:
value at offset

Definition at line 1081 of file HMM.h.

bool get_status (  )  const

get status

Returns:
true if everything is ok, else false

Definition at line 739 of file HMM.h.

void init_model_defined (  ) 

init model according to const_x, learn_x. First the model is initialized with 0 for all parameters, then the parameters in learn_x are initialized with random values; finally the const_x parameters are set and the model is normalized.

Definition at line 2452 of file HMM.cpp.

void init_model_random (  ) 

init model with random values

Definition at line 2386 of file HMM.cpp.

bool initialize ( CModel * model,
float64_t  PSEUDO,
FILE *  model_file = NULL 
)

initialization function - gets called by constructors.

Parameters:
model model which holds definitions of states to be learned + consts
PSEUDO Pseudo Value
model_file Filehandle to an HMM model file (*.mod)

Definition at line 542 of file HMM.cpp.

void invalidate_model (  ) 

invalidates all caches. This function has to be called when direct changes to the model have been made. This is necessary so that the forward/backward/viterbi algorithms do not work with old tables.

Definition at line 2661 of file HMM.cpp.

float64_t linear_model_derivative ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes log dp(lambda)/d b_ij for linear model

Definition at line 1385 of file HMM.h.

float64_t linear_model_probability ( int32_t  dimension  ) 

calculates likelihood for linear model on observations in MEMORY

Parameters:
dimension dimension for which probability is calculated
Returns:
model probability

Definition at line 586 of file HMM.h.

bool linear_train ( bool  right_align = false  ) 

estimates linear model from observations.

Definition at line 5095 of file HMM.cpp.

bool load_definitions ( FILE *  file,
bool  verbose,
bool  initialize = true 
)

read definitions file (learn_x, const_x) used for training.

-format specs: definition_file (train.def)

% HMM-TRAIN - specification
% learn_a - elements in state_transition_matrix to be learned
% learn_b - elements in observation_per_state_matrix to be learned
% note: each line stands for
% state, observation(0), observation(1)...observation(NOW)
% learn_p - elements in initial distribution to be learned
% learn_q - elements in the end-state distribution to be learned
%
% const_x - specifies initial values of elements
% rest is assumed to be 0.0
%
% NOTE: IMPLICIT DEFINES:
% define A 0
% define C 1
% define G 2
% define T 3

learn_a=[ [int32_t,int32_t];
[int32_t,int32_t];
[int32_t,int32_t];
........
[int32_t,int32_t];
[-1,-1];
];

learn_b=[ [int32_t,int32_t,int32_t,...,int32_t];
[int32_t,int32_t,int32_t,...,int32_t];
[int32_t,int32_t,int32_t,...,int32_t];
........
[int32_t,int32_t,int32_t,...,int32_t];
[-1,-1];
];

learn_p= [ int32_t, ... , int32_t, -1 ];

learn_q= [ int32_t, ... , int32_t, -1 ];

const_a=[ [int32_t,int32_t,float64_t];
[int32_t,int32_t,float64_t];
[int32_t,int32_t,float64_t];
........
[int32_t,int32_t,float64_t];
[-1,-1,-1];
];

const_b=[ [int32_t,int32_t,int32_t,...,int32_t,float64_t];
[int32_t,int32_t,int32_t,...,int32_t,float64_t];
[int32_t,int32_t,int32_t,...,int32_t,float64_t];
........
[int32_t,int32_t,int32_t,...,int32_t,float64_t];
[-1,-1,-1];
];

const_p[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];
const_q[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];

Parameters:
file filehandle to definitions file
verbose true for verbose messages
initialize true to initialize to underlying HMM

Definition at line 3216 of file HMM.cpp.

bool load_model ( FILE *  file  ) 

read model from file.

-format specs: model_file (model.hmm)

% HMM - specification
% N - number of states
% M - number of observation_tokens
% a is state_transition_matrix
% size(a)= [N,N]
%
% b is observation_per_state_matrix
% size(b)= [N,M]
%
% p is initial distribution
% size(p)= [1, N]

N=int32_t;
M=int32_t;

p=[float64_t,float64_t...float64_t];
q=[float64_t,float64_t...float64_t];

a=[ [float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
];

b=[ [float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
[float64_t,float64_t...float64_t];
];

Parameters:
file filehandle to model file

Definition at line 2918 of file HMM.cpp.

float64_t model_derivative_a ( T_STATES  i,
T_STATES  j,
int32_t  dimension 
)

computes log dp(lambda)/d a_ij.

Definition at line 1416 of file HMM.h.

float64_t model_derivative_b ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes log dp(lambda)/d b_ij.

Definition at line 1427 of file HMM.h.

float64_t model_derivative_p ( T_STATES  i,
int32_t  dimension 
)

computes log dp(lambda)/d p_i. backward path down to time 0 multiplied by observing first symbol in path at state i

Definition at line 1402 of file HMM.h.

float64_t model_derivative_q ( T_STATES  i,
int32_t  dimension 
)

computes log dp(lambda)/d q_i. forward path up to time T-1

Definition at line 1410 of file HMM.h.

float64_t model_probability ( int32_t  dimension = -1  ) 

inline proxy for model probability.

Definition at line 567 of file HMM.h.

float64_t model_probability_comp (  ) 

calculates probability that observations were generated by the model using forward algorithm.

Definition at line 1226 of file HMM.cpp.

void normalize ( bool  keep_dead_states = false  ) 

normalize the model to satisfy stochasticity

Definition at line 4772 of file HMM.cpp.

void open_bracket ( FILE *  file  )  [protected]

expect open bracket.

Definition at line 2748 of file HMM.cpp.

void output_model ( bool  verbose = false  ) 

prints the model parameters on screen.

Parameters:
verbose when false, only the model probability will be printed; when true, the whole model will be printed additionally

Definition at line 2200 of file HMM.cpp.

void output_model_defined ( bool  verbose = false  ) 

performs output_model only for the defined transitions etc

Definition at line 2284 of file HMM.cpp.

float64_t path_derivative_a ( T_STATES  i,
T_STATES  j,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d a_ij

Definition at line 1463 of file HMM.h.

float64_t path_derivative_b ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d b_ij

Definition at line 1470 of file HMM.h.

float64_t path_derivative_p ( T_STATES  i,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d p_i

Definition at line 1449 of file HMM.h.

float64_t path_derivative_q ( T_STATES  i,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d q_i

Definition at line 1456 of file HMM.h.

bool permutation_entropy ( int32_t  window_width,
int32_t  sequence_number 
)

compute permutation entropy

Definition at line 5397 of file HMM.cpp.

void prepare_path_derivative ( int32_t  dim  )  [protected]

initialization function that is called before path_derivatives are calculated

Definition at line 1507 of file HMM.h.

bool save_likelihood ( FILE *  file  ) 

save model probability in ascii format

Parameters:
file filehandle

Definition at line 4071 of file HMM.cpp.

bool save_likelihood_bin ( FILE *  file  ) 

save model probability in binary format

Parameters:
file filehandle

Definition at line 4054 of file HMM.cpp.

bool save_model ( FILE *  file  ) 

save model to file.

Parameters:
file filehandle to model file

Definition at line 3921 of file HMM.cpp.

bool save_model_bin ( FILE *  file  ) 

save model in binary format.

Parameters:
file filehandle

Definition at line 4092 of file HMM.cpp.

bool save_model_derivatives ( FILE *  file  ) 

save model derivatives to file in ascii format.

Parameters:
file filehandle

Definition at line 4446 of file HMM.cpp.

bool save_model_derivatives_bin ( FILE *  file  ) 

save model derivatives to file in binary format.

Parameters:
file filehandle

Definition at line 4325 of file HMM.cpp.

bool save_path ( FILE *  file  ) 

save viterbi path in ascii format

Parameters:
file filehandle

Definition at line 4030 of file HMM.cpp.

bool save_path_derivatives ( FILE *  file  ) 

save viterbi path derivatives in ascii format

Parameters:
file filehandle

Definition at line 4194 of file HMM.cpp.

bool save_path_derivatives_bin ( FILE *  file  ) 

save viterbi path derivatives in binary format

Parameters:
file filehandle

Definition at line 4242 of file HMM.cpp.

void set_a ( T_STATES  line_,
T_STATES  column,
float64_t  value 
)

access function for matrix a

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
value value to be set

Definition at line 1024 of file HMM.h.

void set_A ( T_STATES  line_,
T_STATES  column,
float64_t  value 
)

access function for matrix A

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
value value to be set

Definition at line 1010 of file HMM.h.

void set_B ( T_STATES  line_,
uint16_t  column,
float64_t  value 
)

access function for matrix B

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
value value to be set

Definition at line 1038 of file HMM.h.

void set_b ( T_STATES  line_,
uint16_t  column,
float64_t  value 
)

access function for matrix b

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
value value to be set

Definition at line 1052 of file HMM.h.

bool set_epsilon ( float64_t  eps  ) 

Definition at line 620 of file HMM.h.

bool set_iterations ( int32_t  num  ) 

Definition at line 618 of file HMM.h.

void set_observation_nocache ( CStringFeatures< uint16_t > *  obs  ) 

set new observations: only sets the observation pointer and drops caches if there were any

Definition at line 5212 of file HMM.cpp.

void set_observations ( CStringFeatures< uint16_t > *  obs,
CHMM * hmm = NULL 
)

observation functions: set/get observation matrix. Set new observations: sets the observation pointer and initializes observation-dependent caches. If hmm is given, then the caches of the model hmm are used.

Definition at line 5254 of file HMM.cpp.

void set_p ( T_STATES  offset,
float64_t  value 
)

access function for probability of first state

Parameters:
offset index 0...N-1
value value to be set

Definition at line 996 of file HMM.h.

void set_pseudo ( float64_t  pseudo  ) 

sets current pseudo value

Definition at line 751 of file HMM.h.

void set_psi ( int32_t  time,
T_STATES  state,
T_STATES  value,
int32_t  dimension 
)

access function for backtracking table psi

Parameters:
time time 0...T-1
state state 0...N-1
value value to be set
dimension dimension of observations 0...DIMENSION-1

Definition at line 1067 of file HMM.h.

void set_q ( T_STATES  offset,
float64_t  value 
)

access function for probability of end states

Parameters:
offset index 0...N-1
value value to be set

Definition at line 983 of file HMM.h.

float64_t state_probability ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

calculates probability of being in state i at time t for dimension

Definition at line 1361 of file HMM.h.

bool train ( CFeatures * data = NULL  )  [virtual]

learn distribution

Parameters:
data training data (parameter can be avoided if distance or kernel-based classifiers are used and distance/kernels are initialized with train data)
Returns:
whether training was successful

Implements CDistribution.

Definition at line 436 of file HMM.cpp.

float64_t transition_probability ( int32_t  time,
int32_t  state_i,
int32_t  state_j,
int32_t  dimension 
)

calculates probability of being in state i at time t and state j at time t+1 for dimension

Definition at line 1368 of file HMM.h.


Member Data Documentation

float64_t all_pat_prob [protected]

probability of best path

Definition at line 1230 of file HMM.h.

bool all_path_prob_updated [protected]

true if path probability is up to date

Definition at line 1242 of file HMM.h.

T_ALPHA_BETA alpha_cache [protected]

cache for forward variables; can be terribly HUGE, O(T*N)

Definition at line 1303 of file HMM.h.

float64_t* arrayN1 [protected]

array of size N for temporary calculations

Definition at line 1267 of file HMM.h.

float64_t* arrayN2 [protected]

array of size N for temporary calculations

Definition at line 1269 of file HMM.h.

T_ALPHA_BETA beta_cache [protected]

cache for backward variables; can be terribly HUGE, O(T*N)

Definition at line 1305 of file HMM.h.

int32_t conv_it [protected]

Definition at line 1227 of file HMM.h.

float64_t* end_state_distribution_q [protected]

distribution of end-states

Definition at line 1216 of file HMM.h.

float64_t epsilon [protected]

convergence criterion epsilon

Definition at line 1226 of file HMM.h.

const int32_t GOTa = (1<<4) [static, protected]

GOTa

Definition at line 1329 of file HMM.h.

const int32_t GOTb = (1<<5) [static, protected]

GOTb

Definition at line 1331 of file HMM.h.

const int32_t GOTconst_a = (1<<5) [static, protected]

GOTconst_a

Definition at line 1346 of file HMM.h.

const int32_t GOTconst_b = (1<<6) [static, protected]

GOTconst_b

Definition at line 1348 of file HMM.h.

const int32_t GOTconst_p = (1<<7) [static, protected]

GOTconst_p

Definition at line 1350 of file HMM.h.

const int32_t GOTconst_q = (1<<8) [static, protected]

GOTconst_q

Definition at line 1352 of file HMM.h.

const int32_t GOTlearn_a = (1<<1) [static, protected]

GOTlearn_a

Definition at line 1338 of file HMM.h.

const int32_t GOTlearn_b = (1<<2) [static, protected]

GOTlearn_b

Definition at line 1340 of file HMM.h.

const int32_t GOTlearn_p = (1<<3) [static, protected]

GOTlearn_p

Definition at line 1342 of file HMM.h.

const int32_t GOTlearn_q = (1<<4) [static, protected]

GOTlearn_q

Definition at line 1344 of file HMM.h.

const int32_t GOTM = (1<<2) [static, protected]

GOTM

Definition at line 1325 of file HMM.h.

const int32_t GOTN = (1<<1) [static, protected]

GOTN

Definition at line 1323 of file HMM.h.

const int32_t GOTO = (1<<3) [static, protected]

GOTO

Definition at line 1327 of file HMM.h.

const int32_t GOTp = (1<<6) [static, protected]

GOTp

Definition at line 1333 of file HMM.h.

const int32_t GOTq = (1<<7) [static, protected]

GOTq

Definition at line 1335 of file HMM.h.

float64_t* initial_state_distribution_p [protected]

initial distribution of states

Definition at line 1213 of file HMM.h.

int32_t iteration_count [protected]

Definition at line 1223 of file HMM.h.

int32_t iterations [protected]

convergence criterion iterations

Definition at line 1222 of file HMM.h.

int32_t line [protected]

Definition at line 1195 of file HMM.h.

bool loglikelihood [protected]

Definition at line 1251 of file HMM.h.

int32_t M [protected]

number of observation symbols, e.g. ACGT -> 0123

Definition at line 1186 of file HMM.h.

float64_t mod_prob [protected]

probability of model

Definition at line 1236 of file HMM.h.

bool mod_prob_updated [protected]

true if model probability is up to date

Definition at line 1239 of file HMM.h.

CModel* model [protected]

Definition at line 1201 of file HMM.h.

int32_t N [protected]

number of states

Definition at line 1189 of file HMM.h.

float64_t* observation_matrix_B [protected]

matrix of absolute counts of observations within each state

Definition at line 1207 of file HMM.h.

float64_t* observation_matrix_b [protected]

distribution of observations within each state

Definition at line 1219 of file HMM.h.

CStringFeatures<uint16_t>* p_observations [protected]

observation matrix

Definition at line 1198 of file HMM.h.

float64_t pat_prob [protected]

probability of best path

Definition at line 1233 of file HMM.h.

T_STATES* path [protected]

best path (=state sequence) through model

Definition at line 1311 of file HMM.h.

int32_t path_deriv_dimension [protected]

dimension for which path_deriv was calculated

Definition at line 1245 of file HMM.h.

bool path_deriv_updated [protected]

true if path derivative is up to date

Definition at line 1248 of file HMM.h.

int32_t path_prob_dimension [protected]

dimension for which path_prob was calculated

Definition at line 1317 of file HMM.h.

bool path_prob_updated [protected]

true if path probability is up to date

Definition at line 1314 of file HMM.h.

float64_t PSEUDO [protected]

define pseudocounts against overfitting

Definition at line 1192 of file HMM.h.

bool reused_caches [protected]

Definition at line 1257 of file HMM.h.

T_STATES* states_per_observation_psi [protected]

backtracking table for viterbi; can be terribly HUGE, O(T*N)

Definition at line 1308 of file HMM.h.

bool status [protected]

Definition at line 1254 of file HMM.h.

float64_t* transition_matrix_a [protected]

transition matrix

Definition at line 1210 of file HMM.h.

float64_t* transition_matrix_A [protected]

matrix of absolute counts of transitions

Definition at line 1204 of file HMM.h.


The documentation for this class was generated from the following files:
HMM.h
HMM.cpp

SHOGUN Machine Learning Toolbox - Documentation