uk.ac.cam.ch.wwmm.opsin
Class ParseRules

java.lang.Object
  extended by uk.ac.cam.ch.wwmm.opsin.ParseRules

public class ParseRules
extends java.lang.Object

Instantiate via NameToStructure.getOpsinParser().

Performs finite-state allocation of roles ("annotations") to tokens. The chemical name is broken down into tokens, e.g. ethyl --> eth yl, by applying the chemical grammar in regexes.xml. Each token is associated with a letter, referred to here as an annotation, which identifies the role of the token. These letters are defined in regexes.xml; in this case eth and yl would have the meanings alkaneStem and inlineSuffix respectively. The chemical grammar uses the annotations of the tokens seen so far to decide what may follow: e.g. a chemical name cannot start with yl, and an optional e is valid after an arylGroup.
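The idea of annotating tokens and letting the annotations decide what may follow can be sketched with a small stand-alone toy. Everything in it is an assumption made for illustration: the class ToyAnnotator, the token table, the letters 'a' and 's', and the transition rules are hypothetical and are not taken from regexes.xml or from the OPSIN source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration only: tokens, annotation letters and transition rules
// below are made up and do not come from regexes.xml.
class ToyAnnotator {
    // Hypothetical token -> annotation letter table.
    static final Map<String, Character> TOKENS = new LinkedHashMap<>();
    // Hypothetical grammar: for each state, the annotations allowed next.
    static final Map<Character, String> ALLOWED_NEXT = new LinkedHashMap<>();
    static final char START = '^';
    // Annotations on which a word is allowed to end.
    static final String ACCEPTING = "s";

    static {
        TOKENS.put("eth", 'a');        // stands in for alkaneStem
        TOKENS.put("meth", 'a');
        TOKENS.put("yl", 's');         // stands in for inlineSuffix
        ALLOWED_NEXT.put(START, "a");  // a name may start with a stem, not with yl
        ALLOWED_NEXT.put('a', "s");    // a stem may be followed by a suffix
        ALLOWED_NEXT.put('s', "a");    // a suffix may be followed by another stem
    }

    /** Returns the annotation string for the word, or null if the toy grammar rejects it. */
    static String annotate(String word) {
        return annotate(word, 0, START, "");
    }

    private static String annotate(String word, int pos, char state, String soFar) {
        if (pos == word.length()) {
            // The whole word is consumed: accept only if we stopped in an accepting state.
            return ACCEPTING.indexOf(state) >= 0 ? soFar : null;
        }
        for (Map.Entry<String, Character> entry : TOKENS.entrySet()) {
            String token = entry.getKey();
            char annotation = entry.getValue();
            boolean allowed = ALLOWED_NEXT.getOrDefault(state, "").indexOf(annotation) >= 0;
            if (allowed && word.startsWith(token, pos)) {
                String result = annotate(word, pos + token.length(), annotation, soFar + annotation);
                if (result != null) {
                    return result;
                }
            }
        }
        return null; // no token fits here under the current state
    }

    public static void main(String[] args) {
        System.out.println(annotate("ethyl")); // prints as
        System.out.println(annotate("yl"));    // prints null: yl cannot start a name
        System.out.println(annotate("eth"));   // prints null: a bare stem is incomplete
    }
}
```

The state after each token is simply the annotation of that token, which is what makes the allocation finite-state: the decision about what may follow depends only on the last annotation, not on the token text itself.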


Method Summary
 ParseRulesResults getParses(java.lang.String chemicalWord)
          Determines the possible annotations for a chemical word. Returns a list of parses and how much of the word could not be interpreted.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getParses

public ParseRulesResults getParses(java.lang.String chemicalWord)
                            throws ParsingException
Determines the possible annotations for a chemical word. Returns a list of parses and the part of the word that could not be interpreted. Usually the list will contain only one parse and the string will equal "". For something like ethyloxime, the list will contain the parse for ethyl and the string will equal "oxime", as it was unparsable. For something like eth, no parses would be found and the string will equal "eth".

Parameters:
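chemicalWord - the chemical word to annotate
Returns:
a ParseRulesResults containing the list of parses and the part of the word that could not be interpreted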
Throws:
ParsingException
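
The three outcomes described for getParses (a full parse, a partial parse with a leftover string, and no parse at all) can be reproduced with a stand-alone toy. This is only a sketch under stated assumptions: ToyParseRules, ToyResult, its fields, and the stem-plus-suffix matching are hypothetical stand-ins, and they do not reflect the real ParseRulesResults API or the OPSIN grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: ToyResult and the matching strategy are hypothetical
// stand-ins, not the OPSIN implementation.
class ToyParseRules {
    /** Hypothetical result type: the parses found plus the text left over. */
    static final class ToyResult {
        final List<List<String>> parses;
        final String uninterpretable;

        ToyResult(List<List<String>> parses, String uninterpretable) {
            this.parses = parses;
            this.uninterpretable = uninterpretable;
        }
    }

    static ToyResult getParses(String word) {
        // In this toy grammar only a stem followed by "yl" counts as a parse.
        String[] stems = {"meth", "eth"};
        String suffix = "yl";
        List<List<String>> parses = new ArrayList<>();
        int consumed = 0;
        for (String stem : stems) {
            if (word.startsWith(stem) && word.startsWith(suffix, stem.length())) {
                parses.add(Arrays.asList(stem, suffix));
                consumed = stem.length() + suffix.length();
            }
        }
        // Whatever was not consumed is reported back as uninterpretable.
        return new ToyResult(parses, word.substring(consumed));
    }

    public static void main(String[] args) {
        for (String word : new String[] {"ethyl", "ethyloxime", "eth"}) {
            ToyResult r = getParses(word);
            System.out.println(word + " -> " + r.parses + " \"" + r.uninterpretable + "\"");
        }
        // prints:
        // ethyl -> [[eth, yl]] ""
        // ethyloxime -> [[eth, yl]] "oxime"
        // eth -> [] "eth"
    }
}
```

Returning the leftover text alongside the parses, rather than failing outright, lets a caller tell a fully interpreted word from one that was only partly understood.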