Last modified: November 22, 2010
[object] bbox_merging (int Ex = -1, int Ey = -1)
Operates on: | Image [OneBit] |
---|---|
Returns: | [object] |
Category: | PageSegmentation |
Defined in: | pagesegmentation.py |
Author: | Rene Baston, Karl MacMillan, and Christoph Dalitz |
Segments a page by extending and merging the bounding boxes of the connected components on the page.
How much the segments are extended is controlled by the arguments Ex and Ey. Depending on their values, the returned segments can be lines, paragraphs, or something else.
The return value is a list of 'CCs' where each 'CC' represents a found segment. Note that the input image is changed such that each pixel is set to its segment label.
Arguments:
Ey: How much each CC is extended to the top and bottom before merging. When -1, it is set to twice the average size of all CCs. This will typically segment into paragraphs.
If you want to segment into lines, set Ey to something small like one sixth of the median symbol height.
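The idea of extending and merging bounding boxes can be illustrated with a minimal pure-Python sketch. This is not Gamera's implementation: boxes are plain `(x0, y0, x1, y1)` tuples, and the helper names `extend`, `intersects`, `union` and `merge_boxes` are hypothetical. It only shows how larger extensions turn separate boxes into lines and then into paragraphs.

```python
def extend(box, ex, ey):
    """Grow a box (x0, y0, x1, y1) by ex to the left/right and ey to the top/bottom."""
    x0, y0, x1, y1 = box
    return (x0 - ex, y0 - ey, x1 + ex, y1 + ey)


def intersects(a, b):
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a, b):
    """Smallest box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_boxes(boxes, ex, ey):
    """Extend every box, then merge intersecting boxes until none intersect.

    The returned boxes are the extended ones; a real implementation would
    probably shrink them back to the actual content (assumption).
    """
    segments = [extend(b, ex, ey) for b in boxes]
    merged = True
    while merged:
        merged = False
        result = []
        for box in segments:
            for i, other in enumerate(result):
                if intersects(box, other):
                    result[i] = union(box, other)
                    merged = True
                    break
            else:
                result.append(box)
        segments = result
    return segments


# Two glyphs on one line and a third glyph on a line further down.
boxes = [(0, 0, 4, 4), (6, 0, 10, 4), (0, 20, 4, 24)]
lines = merge_boxes(boxes, 1, 1)       # small Ey: glyphs merge into lines
paragraphs = merge_boxes(boxes, 1, 8)  # larger Ey: lines merge into one block
```

With the small vertical extension the two upper boxes merge while the lower one stays separate; with the larger one all three end up in a single segment.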
[object] projection_cutting (int Tx = 0, int Ty = 0, int noise = 0, Choice [cut|ignore] gap_treatment = cut)
Operates on: | Image [OneBit] |
---|---|
Returns: | [object] |
Category: | PageSegmentation |
Defined in: | pagesegmentation.py |
Author: | Maria Elhachimi and Robert Butz |
Segments a page with the Iterative Projection Profile Cuttings method.
The image is split recursively in the horizontal and vertical directions by looking for 'gaps' in the projection profile. A 'gap' is a contiguous sequence of projection values smaller than noise pixels. A split is made at each gap wider than Tx or higher than Ty. When no further split points are found, the recursion stops.
Whether the resulting segments represent lines, columns or paragraphs depends on the values for Tx and Ty. The return value is a list of 'CCs' where each 'CC' represents a found segment. Note that the input image is changed such that each pixel is set to its CC label.
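The recursion described above can be sketched in pure Python on a list-of-lists image (1 = black, 0 = white). This is an illustration under stated assumptions, not the plugin's code: the function names `content_ranges` and `xy_cut` are hypothetical, and a profile value counts as part of a gap when it is `<= noise` (an assumption, chosen so that `noise = 0` treats empty rows and columns as gaps).

```python
def content_ranges(profile, noise, threshold):
    """Split a projection profile at every gap longer than threshold.

    Returns half-open (start, end) index ranges of the parts between gaps.
    Leading and trailing gaps are always trimmed.
    """
    ranges = []
    start = None  # start of the current content range
    gap = 0       # length of the current run of gap values
    for i, value in enumerate(profile):
        if value <= noise:
            gap += 1
        else:
            if start is None:
                start = i
            elif gap > threshold:
                ranges.append((start, i - gap))
                start = i
            gap = 0
    if start is not None:
        ranges.append((start, len(profile) - gap))
    return ranges


def xy_cut(image, box, tx, ty, noise=0):
    """Recursively cut box = (x0, y0, x1, y1) (half-open) into segments."""
    x0, y0, x1, y1 = box
    # Cut along rows first: gaps higher than ty separate the parts.
    rows = [sum(image[y][x0:x1]) for y in range(y0, y1)]
    parts = [(x0, y0 + a, x1, y0 + b) for a, b in content_ranges(rows, noise, ty)]
    if len(parts) == 1:
        # No row cut found: try cutting along columns (gaps wider than tx).
        x0, y0, x1, y1 = parts[0]
        cols = [sum(image[y][x] for y in range(y0, y1)) for x in range(x0, x1)]
        parts = [(x0 + a, y0, x0 + b, y1) for a, b in content_ranges(cols, noise, tx)]
        if len(parts) == 1:
            return parts  # no further split points: recursion stops
    segments = []
    for part in parts:
        segments.extend(xy_cut(image, part, tx, ty, noise))
    return segments


image = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
fine = xy_cut(image, (0, 0, 6, 5), tx=1, ty=1)
coarse = xy_cut(image, (0, 0, 6, 5), tx=2, ty=1)
```

Raising `tx` from 1 to 2 keeps the two upper blocks together, which mirrors how the thresholds decide whether the segments are columns, lines or larger units.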
[object] runlength_smearing (int Cx = -1, int Cy = -1, int Csm = -1)
Operates on: | Image [OneBit] |
---|---|
Returns: | [object] |
Category: | PageSegmentation |
Defined in: | pagesegmentation.py |
Author: | Christoph Dalitz and Iliya Stoyanov |
Segments a page with the Run Length Smearing algorithm.
The algorithm converts white horizontal and vertical runs shorter than given thresholds Cx and Cy to black pixels (this is the so-called 'run length smearing').
The intersection of both smeared images yields the page segments as black regions. As this typically still contains small white horizontal gaps, gaps narrower than Csm are also filled in a final step.
The return value is a list of 'CCs' where each 'CC' represents a found segment. Note that the input image is changed such that each pixel is set to its CC label.
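The three smearing steps can be sketched in pure Python on a list-of-lists image (1 = black, 0 = white). This is a simplified illustration, not the plugin's code: the function names are hypothetical, and the choice to fill only white runs enclosed by black pixels on both sides (leaving runs that touch the image border untouched) is an assumption of this sketch.

```python
def smear_row(row, threshold):
    """Turn white runs shorter than threshold into black pixels.

    Only runs enclosed by black pixels on both sides are filled
    (assumption of this sketch).
    """
    out = list(row)
    n = len(row)
    i = 0
    while i < n:
        if row[i] == 0:
            j = i
            while j < n and row[j] == 0:
                j += 1
            # i > 0 and j < n: the run does not touch the border
            if i > 0 and j < n and j - i < threshold:
                out[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return out


def smear_rows(image, threshold):
    """Horizontal smearing of every row."""
    return [smear_row(row, threshold) for row in image]


def smear_cols(image, threshold):
    """Vertical smearing: transpose, smear as rows, transpose back."""
    cols = [smear_row(list(col), threshold) for col in zip(*image)]
    return [list(row) for row in zip(*cols)]


def rlsa(image, cx, cy, csm):
    """Smear horizontally and vertically, intersect, then fill small gaps."""
    horizontal = smear_rows(image, cx)
    vertical = smear_cols(image, cy)
    both = [[h & v for h, v in zip(hr, vr)]
            for hr, vr in zip(horizontal, vertical)]
    return smear_rows(both, csm)


image = [
    [1, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 1],
]
smeared = rlsa(image, cx=3, cy=2, csm=2)
```

In this example the one-pixel gaps in the first and last rows survive the intersection and are only closed by the final `csm` pass, while the wide four-pixel gap stays white.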
Arguments:
tuple sub_cc_analysis ([object cclist])
Operates on: | Image [OneBit] |
---|---|
Returns: | tuple |
Category: | PageSegmentation |
Defined in: | pagesegmentation.py |
Author: | Stephan Ruloff and Christoph Dalitz |
Further subsegments the result of a page segmentation algorithm into groups of actual connected components.
The result of a page segmentation plugin is a list of 'CCs' where each 'CC' does not represent a 'connected component', but a page segment (typically a line of text). In a practical OCR application, however, you will need the actual connected components (which should roughly correspond to the glyphs) in groups of lines. That is what this plugin is meant for.
The input image must be an image that has been processed with a page segmentation plugin, i.e. all pixels in the image must be labeled with a segment label. The input parameter cclist is the list of segments returned by the page segmentation algorithm.
The return value is a tuple with two entries:
Note
The groups will be returned in the same order as given in cclist. This means that you can sort the page segments by reading order before passing them to sub_cc_analysis.
Note that the order of the returned CCs within each group is not well defined. Hence you will generally need to sort each subgroup by reading order.
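Sorting each subgroup might look like the following minimal sketch. It assumes horizontal left-to-right text, so that ordering by the left edge is good enough, and it uses a hypothetical stand-in class `Glyph` with an `ul_x` attribute (upper-left x coordinate) instead of real CC objects; the helper name `sort_groups_left_to_right` is also hypothetical.

```python
from collections import namedtuple

# Stand-in for a CC; only the attribute used for sorting is modelled.
Glyph = namedtuple("Glyph", ["name", "ul_x"])


def sort_groups_left_to_right(groups):
    """Return the groups in their given order, each sorted by left edge."""
    return [sorted(group, key=lambda cc: cc.ul_x) for group in groups]


# Two lines whose glyphs arrive in arbitrary order.
groups = [
    [Glyph("c", 30), Glyph("a", 2), Glyph("b", 15)],
    [Glyph("e", 12), Glyph("d", 1)],
]
ordered = sort_groups_left_to_right(groups)
names = [[g.name for g in group] for group in ordered]
```

The order of the groups themselves is left untouched, since it follows the order of the segment list that was passed in.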
[object] textline_reading_order ([object lineccs])
Returns: | [object] |
---|---|
Category: | PageSegmentation |
Defined in: | pagesegmentation.py |
Author: | Christoph Dalitz |
Sorts a list of Images (CCs) representing textlines by reading order and returns the sorted list. Besides textlines, this also works on paragraphs, but not on actual connected components.
The algorithm sorts all lines in topological order, based on the following criteria for the pairwise order of two lines:
In the reference "High Performance Document Analysis" by T.M. Breuel (Symposium on Document Image Understanding, USA, pp. 209-218, 2003), an additional constraint is imposed on the first criterion by demanding that no other segment lies between a and b that overlaps horizontally with both. This constraint, which accounts for multi-column headings that interrupt columns, is replaced in this implementation by an a priori sort of all textlines by y-position. This results in a preference of rows over columns (in case of ambiguity) in the depth-first search used in the topological sorting.
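The combination of an a priori sort by y-position with a depth-first topological sort can be sketched as follows. This is not the plugin's code. The pairwise rules in the hypothetical function `before` are assumptions made for illustration only (a precedes b if a lies entirely to the left of b, or if both overlap horizontally and a is above b), and lines are plain `(x0, y0, x1, y1)` tuples rather than images.

```python
def before(a, b):
    """Assumed pairwise order of two lines given as (x0, y0, x1, y1)."""
    overlap_x = a[0] <= b[2] and b[0] <= a[2]
    if overlap_x:
        return a[1] < b[1]  # same column: the upper line comes first
    return a[2] < b[0]      # otherwise: the line further left comes first


def reading_order(lines):
    """Topological sort by depth-first search over the 'before' relation."""
    # A priori sort by y-position (ties broken by x-position).
    lines = sorted(lines, key=lambda r: (r[1], r[0]))
    visited = set()
    postorder = []

    def visit(i):
        visited.add(i)
        for j in range(len(lines)):
            if j not in visited and before(lines[i], lines[j]):
                visit(j)
        postorder.append(i)

    for i in range(len(lines)):
        if i not in visited:
            visit(i)
    # Reversed postorder of a depth-first search is a topological order.
    return [lines[i] for i in reversed(postorder)]


# Two columns with two lines each, given in scrambled order.
left1, left2 = (0, 0, 10, 2), (0, 4, 10, 6)
right1, right2 = (20, 0, 30, 2), (20, 4, 30, 6)
result = reading_order([right2, left2, right1, left1])
```

For this simple two-column layout the sketch reads the left column from top to bottom before moving on to the right column.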
As this function is not an image method, but a free function, it is not automatically imported with all plugins and you must import it explicitly with
from gamera.plugins.pagesegmentation import textline_reading_order