org.apache.uima.examples.tagger
Class HMMTagger

java.lang.Object
  extended by org.apache.uima.analysis_component.AnalysisComponent_ImplBase
      extended by org.apache.uima.analysis_component.Annotator_ImplBase
          extended by org.apache.uima.analysis_component.JCasAnnotator_ImplBase
              extended by org.apache.uima.examples.tagger.HMMTagger
All Implemented Interfaces:
org.apache.uima.analysis_component.AnalysisComponent, Tagger

public class HMMTagger
extends org.apache.uima.analysis_component.JCasAnnotator_ImplBase
implements Tagger

UIMA analysis engine that invokes the HMM POS tagger, which generates a part-of-speech tag for every token. This annotator assumes that sentences and tokens have already been annotated in the CAS with Sentence and Token annotations, respectively. It iterates over the sentences, collects the words of the tokens in the current sentence into a list, and invokes the HMM POS tagger on that list. It then sets the posTag field of each Token to the corresponding POS tag. The model file for the HMM POS tagger is specified as a parameter (see PARAM_IMPORT_MODEL_FILE).
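The per-sentence loop described above can be sketched without any UIMA dependency. This is a minimal illustration, not the annotator's actual code: SketchToken, SketchSentence and SentenceLoopSketch are hypothetical stand-ins for the Token and Sentence annotations and for the process method, and the tagger argument stands in for the HMM POS tagger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a Token annotation; the real annotator reads
// tokens from the CAS.
class SketchToken {
    final String text;
    String posTag; // filled in by the tagging loop

    SketchToken(String text) {
        this.text = text;
    }
}

// Hypothetical stand-in for a Sentence annotation covering its tokens.
class SketchSentence {
    final List<SketchToken> tokens = new ArrayList<>();

    SketchSentence(String... words) {
        for (String w : words) {
            tokens.add(new SketchToken(w));
        }
    }
}

class SentenceLoopSketch {
    // Tags every token, one sentence at a time. The tagger maps a list of
    // words to a list of tags of the same length.
    static void tagAll(List<SketchSentence> sentences,
                       Function<List<String>, List<String>> tagger) {
        for (SketchSentence sentence : sentences) {
            // Accumulate the words of the current sentence.
            List<String> words = new ArrayList<>();
            for (SketchToken token : sentence.tokens) {
                words.add(token.text);
            }
            // Tag the whole sentence in one call.
            List<String> tags = tagger.apply(words);
            if (tags.size() != words.size()) {
                throw new IllegalStateException(
                    "expected " + words.size() + " tags, got " + tags.size());
            }
            // Write each tag back to its token.
            for (int i = 0; i < tags.size(); i++) {
                sentence.tokens.get(i).posTag = tags.get(i);
            }
        }
    }
}
```

Tagging a whole sentence per call, rather than one token at a time, lets a sequence model use the neighbouring words when it chooses each tag.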

This analysis engine is configured through several parameters, whose names are given by the PARAM_ constants listed in the Field Summary.


Field Summary
 ModelGeneration my_model
           
 int N
          Order of the n-gram model: N = 2 for a bigram model, N = 3 for a trigram model. N is defined in the parameter file.
static java.lang.String PARAM_IMPORT_MODEL_FILE
          Name of the parameter for the model import path
static java.lang.String PARAM_INPUT_VIEW
          Name of the parameter for the input view
static java.lang.String PARAM_SENTENCE
          Name of the parameter for the annotation type which covers token annotations
static java.lang.String PARAM_TOKEN_FP
          Name of the parameter for the feature path to the token feature to be tagged
 
Constructor Summary
HMMTagger()
           
 
Method Summary
static ModelGeneration get_model(java.lang.String filename)
          Reads a saved ModelGeneration object from a file.
static org.apache.uima.cas.Type getType(org.apache.uima.jcas.JCas aJCas, java.lang.String annotationString)
          Gets the type for a given annotation name and checks that it exists.
 void initialize(org.apache.uima.UimaContext aContext)
          Initialize the Annotator.
 void process(org.apache.uima.jcas.JCas aJCas)
          Process a CAS.
 
Methods inherited from class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
getRequiredCasInterface, process
 
Methods inherited from class org.apache.uima.analysis_component.Annotator_ImplBase
getCasInstancesRequired, hasNext, next
 
Methods inherited from class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
batchProcessComplete, collectionProcessComplete, destroy, getContext, getResultSpecification, reconfigure, setResultSpecification
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PARAM_INPUT_VIEW

public static java.lang.String PARAM_INPUT_VIEW
Name of the parameter for the input view


PARAM_IMPORT_MODEL_FILE

public static java.lang.String PARAM_IMPORT_MODEL_FILE
Name of the parameter for the model import path


PARAM_SENTENCE

public static java.lang.String PARAM_SENTENCE
Name of the parameter for the annotation type which covers token annotations


PARAM_TOKEN_FP

public static java.lang.String PARAM_TOKEN_FP
Name of the parameter for the feature path to the token feature to be tagged


N

public int N
Order of the n-gram model: N = 2 for a bigram model, N = 3 for a trigram model. N is defined in the parameter file.
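
To make the role of N concrete, here is a sketch of decoding for the bigram case (N = 2), where each tag is conditioned only on the previous tag. This page does not say which decoding algorithm the tagger uses; Viterbi decoding is a common choice for HMM taggers and is used here purely as an illustration. The class name BigramViterbiSketch and the array-based model layout are assumptions, not part of this API.

```java
class BigramViterbiSketch {
    // Returns the most probable tag sequence for the observed word ids.
    // All scores are log probabilities (assumed layout, for illustration):
    //   start[s]    = log P(tag s at position 0)
    //   trans[p][s] = log P(tag s | previous tag p)   (the bigram, N = 2)
    //   emit[s][o]  = log P(word o | tag s)
    static int[] decode(int[] obs, double[] start, double[][] trans, double[][] emit) {
        int n = obs.length;
        int k = start.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] best = new double[n][k]; // best score of a path ending in tag s at position t
        int[][] back = new int[n][k];       // previous tag on that best path

        for (int s = 0; s < k; s++) {
            best[0][s] = start[s] + emit[s][obs[0]];
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                int arg = 0;
                double max = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double score = best[t - 1][p] + trans[p][s];
                    if (score > max) {
                        max = score;
                        arg = p;
                    }
                }
                best[t][s] = max + emit[s][obs[t]];
                back[t][s] = arg;
            }
        }

        // Pick the best final tag, then follow the back pointers.
        int last = 0;
        for (int s = 1; s < k; s++) {
            if (best[n - 1][s] > best[n - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }
}
```

A trigram model (N = 3) would condition each tag on the two preceding tags, which in a sketch like this means tracking pairs of tags as states instead of single tags.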


my_model

public ModelGeneration my_model

Constructor Detail

HMMTagger

public HMMTagger()

Method Detail

initialize

public void initialize(org.apache.uima.UimaContext aContext)
                throws org.apache.uima.resource.ResourceInitializationException
Initialize the Annotator.

Specified by:
initialize in interface org.apache.uima.analysis_component.AnalysisComponent
Specified by:
initialize in interface Tagger
Overrides:
initialize in class org.apache.uima.analysis_component.AnalysisComponent_ImplBase
Throws:
org.apache.uima.resource.ResourceInitializationException
See Also:
AnalysisComponent_ImplBase.initialize(UimaContext)

get_model

public static ModelGeneration get_model(java.lang.String filename)
Reads a saved ModelGeneration object from a file.

Parameters:
filename - model file
Returns:
ModelGeneration
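
How the model is stored on disk is not documented here. Assuming it is saved with standard Java object serialization, reading it back could look roughly like the sketch below. SketchModel is a hypothetical stand-in for ModelGeneration, and ModelIoSketch with its save and load methods is illustrative only.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for ModelGeneration; the real class's fields are
// not described on this page.
class SketchModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final int n; // n-gram order, e.g. 2 or 3

    SketchModel(int n) {
        this.n = n;
    }
}

class ModelIoSketch {
    // Writes the model to a file with Java object serialization.
    static void save(SketchModel model, String filename) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(filename))) {
            out.writeObject(model);
        }
    }

    // Reads a previously saved model back from a file.
    static SketchModel load(String filename) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                 new BufferedInputStream(new FileInputStream(filename)))) {
            return (SketchModel) in.readObject();
        }
    }
}
```

Java deserialization executes code paths chosen by the file's contents, so a loader of this kind should only be pointed at model files from a trusted source.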

process

public void process(org.apache.uima.jcas.JCas aJCas)
             throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Process a CAS.

Specified by:
process in interface Tagger
Specified by:
process in class org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Throws:
org.apache.uima.analysis_engine.AnalysisEngineProcessException
See Also:
JCasAnnotator_ImplBase.process(JCas)

getType

public static org.apache.uima.cas.Type getType(org.apache.uima.jcas.JCas aJCas,
                                               java.lang.String annotationString)
                                        throws org.apache.uima.analysis_engine.AnalysisEngineProcessException
Gets the type for a given annotation name and checks that it exists.

Parameters:
aJCas - the JCas in which to look up the type
annotationString - the name of the annotation type
Returns:
annotationType
Throws:
org.apache.uima.analysis_engine.AnalysisEngineProcessException


Copyright © 2011. All Rights Reserved.