org.apache.uima.examples.tagger
Class HMMModelTrainer

java.lang.Object
  extended by org.apache.uima.analysis_component.AnalysisComponent_ImplBase
      extended by org.apache.uima.analysis_component.Annotator_ImplBase
          extended by org.apache.uima.analysis_component.JCasAnnotator_ImplBase
              extended by org.apache.uima.examples.tagger.HMMModelTrainer
All Implemented Interfaces:
org.apache.uima.analysis_component.AnalysisComponent

public class HMMModelTrainer
extends org.apache.uima.analysis_component.JCasAnnotator_ImplBase

This analysis engine trains an N-gram model for the HMM tagger from a reference training corpus. The corpus must contain word annotations carrying an attribute that holds the POS value to be learned. The engine is configured through the parameters named by the PARAM_VIEW, PARAM_FILE and PARAM_POSFP fields described below.

BEWARE: this analysis engine does not allow multiple deployment. NB: at the moment, both bigram and trigram statistics are saved in one model file.
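
To illustrate what bigram and trigram statistics over POS tags can look like, here is a minimal, self-contained sketch. It is not the code of HMMModelTrainer: the class name NgramCountSketch, the method countNgrams and the space-joined key format are assumptions made for this example, and the tags are a made-up sample.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; the real trainer may count and store n-grams differently.
class NgramCountSketch {

    /** Counts every contiguous run of n tags in one tag sequence. */
    static Map<String, Integer> countNgrams(List<String> tags, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= tags.size(); i++) {
            // Key format (tags joined by a space) is an assumption of this sketch.
            String key = String.join(" ", tags.subList(i, i + n));
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // POS tags of a made-up five-word sentence.
        List<String> tags = Arrays.asList("DT", "NN", "VBZ", "DT", "NN");
        System.out.println("bigrams:  " + countNgrams(tags, 2));
        System.out.println("trigrams: " + countNgrams(tags, 3));
    }
}
```

Calling countNgrams with n = 2 and n = 3 on the same sequence yields the two sets of counts that a combined model file would need to hold.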


Field Summary
static java.lang.String PARAM_FILE
          Name of the parameter for the model export path
static java.lang.String PARAM_POSFP
          Name of the parameter for the feature path to the POS
static java.lang.String PARAM_VIEW
          Name of the parameter for the view
 
Constructor Summary
HMMModelTrainer()
           
 
Method Summary
 void collectionProcessComplete()
          Called at the end of the processing.
 void initialize(org.apache.uima.UimaContext aContext)
          Initialization of the component
 void process(org.apache.uima.jcas.JCas cas)
          Processing.
 
Methods inherited from class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
getRequiredCasInterface, process
 
Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase
getCasInstancesRequired, hasNext, next
 
Methods inherited from class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
batchProcessComplete, destroy, getContext, getResultSpecification, reconfigure, setResultSpecification
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARAM_VIEW

public static java.lang.String PARAM_VIEW
Name of the parameter for the view


PARAM_FILE

public static java.lang.String PARAM_FILE
Name of the parameter for the model export path


PARAM_POSFP

public static java.lang.String PARAM_POSFP
Name of the parameter for the feature path to the POS

Constructor Detail

HMMModelTrainer

public HMMModelTrainer()
Method Detail

initialize

public void initialize(org.apache.uima.UimaContext aContext)
                throws org.apache.uima.resource.ResourceInitializationException
Initialization of the component

Specified by:
initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides:
initialize in class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
Throws:
org.apache.uima.resource.ResourceInitializationException

process

public void process(org.apache.uima.jcas.JCas cas)
             throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Processing. Browses the annotations of type theTokenTypeName, which must inherit from the type tcas.Annotation, and builds the list of tokens that will be learned by the HMMTagger.

Specified by:
process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws:
org.apache.uima.analysis_engine.AnalysisEngineProcessException

collectionProcessComplete

public void collectionProcessComplete()
                               throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Called at the end of the processing. Once the whole collection has been processed, the model is created from the collected elements.

Specified by:
collectionProcessComplete in interface org.apache.uima.analysis_component.AnalysisComponent
Overrides:
collectionProcessComplete in class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
Throws:
org.apache.uima.analysis_engine.AnalysisEngineProcessException
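
The collect-then-build lifecycle described for process and collectionProcessComplete can be sketched without any UIMA dependency as follows. Everything here is an assumption made for illustration: the class CollectThenBuildSketch, its method signatures, and the tab-separated text format are hypothetical and do not reflect the actual model file written by HMMModelTrainer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of "collect per document, build once at the end".
class CollectThenBuildSketch {

    // One tag sequence per processed document, kept until the collection is complete.
    private final List<List<String>> collected = new ArrayList<>();

    /** Stands in for process(JCas): only records the tags of one document. */
    void process(List<String> documentTags) {
        collected.add(new ArrayList<>(documentTags));
    }

    /** Builds bigram and trigram counts together, sorted by key, one "ngram<TAB>count" line each. */
    List<String> buildModelLines() {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> tags : collected) {
            for (int n = 2; n <= 3; n++) {
                for (int i = 0; i + n <= tags.size(); i++) {
                    counts.merge(String.join(" ", tags.subList(i, i + n)), 1, Integer::sum);
                }
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        return lines;
    }

    /** Stands in for collectionProcessComplete(): writes both statistics to a single file. */
    void collectionProcessComplete(Path modelFile) throws IOException {
        Files.write(modelFile, buildModelLines(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        CollectThenBuildSketch trainer = new CollectThenBuildSketch();
        trainer.process(List.copyOf(List.of("DT", "NN")));
        trainer.process(List.copyOf(List.of("DT", "NN", "VBZ")));
        Path out = Files.createTempFile("hmm-model-sketch", ".txt");
        trainer.collectionProcessComplete(out);
        System.out.println("wrote " + out);
    }
}
```

Because state accumulates in one instance across calls to process, running several independent copies of such a component would each see only part of the corpus, which is one plausible reading of the warning against multiple deployment.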


Copyright © 2011. All Rights Reserved.