LiuSamarabanduTextExtractorBasic (OpenIMAJ master project 1.3.10 API)

java.lang.Object
- org.openimaj.image.text.extraction.TextExtractor<FImage>
- - org.openimaj.image.text.extraction.LiuSamarabanduTextExtractorBasic

All Implemented Interfaces:

ImageProcessor<FImage>, Processor<FImage>
```
@Reference(type=Inproceedings,
           author={"Xiaoqing Liu","Samarabandu, J."},
           title="An edge-based text region extraction algorithm for indoor mobile robot navigation",
           year="2005",
           booktitle="Mechatronics and Automation, 2005 IEEE International Conference",
           pages={" 701 "," 706 Vol. 2"},
           month="July-1 Aug.",
           number="",
           volume="2",
           customData={"keywords","edge-based text region extraction; feature extraction; scene text; text localization; vision-based mobile robot navigation; character recognition; edge detection; feature extraction; mobile robots; navigation; path planning; robot vision;","doi","10.1109/ICMA.2005.1626635","ISSN",""})
public class LiuSamarabanduTextExtractorBasic
extends TextExtractor<FImage>
```
A processor that attempts to extract text from an image. It uses a 3-stage process: 1) find possible text regions; 2) filter then extract those regions; 3) OCR.
In the first stage it builds a feature map: an image in which each pixel's intensity is the likelihood of that pixel lying within a text region. It does this with a series of convolutions and morphological operations that find regions containing short edges in multiple directions.
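As a rough illustration of the first stage, the sketch below computes a simplified feature map on a plain float[][] image. It is not the OpenIMAJ implementation: the class name FeatureMapSketch, the four 3x3 directional kernels, and the omission of the morphological step are all assumptions made to keep the example short.

```java
public class FeatureMapSketch {
    // Hypothetical 3x3 line-detection kernels, one per direction.
    static final float[][][] KERNELS = {
        {{-1, -1, -1}, { 2,  2,  2}, {-1, -1, -1}},   // horizontal
        {{-1, -1,  2}, {-1,  2, -1}, { 2, -1, -1}},   // diagonal (bottom-left to top-right)
        {{-1,  2, -1}, {-1,  2, -1}, {-1,  2, -1}},   // vertical
        {{ 2, -1, -1}, {-1,  2, -1}, {-1, -1,  2}}    // diagonal (top-left to bottom-right)
    };

    /** Sums the absolute directional responses, then scales the map to [0, 1]. */
    static float[][] featureMap(float[][] img) {
        int h = img.length, w = img[0].length;
        float[][] out = new float[h][w];
        float max = 0f;
        // Border pixels are left at zero because the kernel does not fit there.
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                float sum = 0f;
                for (float[][] k : KERNELS) {
                    float r = 0f;
                    for (int j = -1; j <= 1; j++)
                        for (int i = -1; i <= 1; i++)
                            r += k[j + 1][i + 1] * img[y + j][x + i];
                    sum += Math.abs(r);
                }
                out[y][x] = sum;
                max = Math.max(max, sum);
            }
        }
        // Normalise so the strongest response becomes 1 (skipped for flat images).
        if (max > 0f)
            for (float[] row : out)
                for (int x = 0; x < w; x++) row[x] /= max;
        return out;
    }
}
```

Pixels that respond strongly in several directions at once score highest, which is the intuition behind treating the map as a text-likelihood image.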
In the second stage, the regions are turned into blobs, and blobs that are too small or inappropriately shaped are removed. The remaining regions are then extracted from the original image as subimages containing text. An expansion multiplier can be applied to each bounding box so that the extracted subimage contains enough surrounding information for the OCR to work. Use setBoundingBoxPaddingPc(float) to set this multiplier; for example, 1.05 expands the bounding box by 5%.
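To make the effect of the multiplier concrete, here is a minimal sketch of expanding a box by a factor. It assumes the expansion is applied about the centre of the box and does no clipping to the image bounds; the class name BoundingBoxPaddingSketch and the {x, y, width, height} array layout are hypothetical and are not taken from the library.

```java
public class BoundingBoxPaddingSketch {
    /**
     * Expands a box {x, y, width, height} by the given multiplier,
     * keeping its centre fixed (an assumption of this sketch).
     */
    static float[] expand(float[] box, float multiplier) {
        float newW = box[2] * multiplier;
        float newH = box[3] * multiplier;
        // Shift the origin by half of the growth in each dimension.
        float newX = box[0] - (newW - box[2]) / 2f;
        float newY = box[1] - (newH - box[3]) / 2f;
        return new float[] { newX, newY, newW, newH };
    }
}
```

Under these assumptions, a multiplier of 1.0 leaves the box unchanged, and a multiplier of 1.05 makes a 100-pixel-wide box roughly 105 pixels wide.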
The third stage simply uses an OCRProcessor to process the subimages and extract textual strings. Use TextExtractor.setOCRProcessor(OCRProcessor) to set the OCRProcessor used to extract text. Note that by default no processor is set; if the extractor is run without an OCRProcessor, the OCR stage is skipped. This part of the implementation has moved into the TextExtractor superclass.
The output of the processor can be retrieved using getTextRegions(), which returns a map whose key is the bounding box of each detected text region and whose value is the extracted subimage. The recognised text is available through the inherited TextExtractor methods getText() and getTextStrings().
From: [paper 01626635.pdf] Xiaoqing Liu and Jagath Samarabandu; An Edge-based Text Region Extraction Algorithm for Indoor Mobile Robot Navigation, Proceedings of the IEEE International Conference on Mechatronics & Automation Niagara Falls, Canada, July 2005

Author:

David Dupplaw (dpd@ecs.soton.ac.uk)

See Also:

"http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1626635"

Created:

29 Jul 2011

Field Summary

Fields
Modifier and Type	Field and Description
`static boolean`	`DEBUG` Whether to debug the text extractor - displaying images as it goes

Constructor Summary

Constructors
Constructor and Description
`LiuSamarabanduTextExtractorBasic()`

Method Summary

Modifier and Type	Method and Description
`List<IndependentPair<Point2d,Point2d>>`	`calculateHomography(Polygon p)` Calculates the point pairing for a given distorted polygon into orthogonal space.
`float`	`getBoundingBoxPaddingPc()` Get the expansion value of the bounding boxes that are generated for the text regions.
`Map<Rectangle,FImage>`	`getTextRegions()` Returns a map of bounding box to extracted subimage.
`void`	`processFeatureMap(FImage fmap, FImage image)` Process a feature map.
`void`	`processImage(FImage image)` Process an image.
`void`	`setBoundingBoxPaddingPc(float boundingBoxPaddingPc)` Set the expansion value for the subimage extraction.
`FImage`	`textRegionDetection(FImage image)` Calculate the feature map that gives the approximate localisation of candidate text regions.
`Map<Rectangle,FImage>`	`textRegionLocalisation(FImage fmap, FImage image)` Extract the regions that probably contain text (as given by the feature map)

Methods inherited from class org.openimaj.image.text.extraction.TextExtractor
getOCRProcessor, getText, getTextStrings, setOCRProcessor

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - DEBUG
```
public static final boolean DEBUG
```
    Whether to debug the text extractor - displaying images as it goes
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - LiuSamarabanduTextExtractorBasic
```
public LiuSamarabanduTextExtractorBasic()
```
- Method Detail
  - processImage
```
public void processImage(FImage image)
```
    Process an image. Implementing classes must alter the image passed in-place or assign the output to the input using Image.internalAssign(Image).
    
    Parameters:
    
    image - The image to process in place.
    
    See Also:
    
    ImageProcessor.processImage(org.openimaj.image.Image)
  - processFeatureMap
```
public void processFeatureMap(FImage fmap,
                              FImage image)
```
    Process a feature map. As a side effect, this method updates the textRegions field of this class. Use getTextRegions() to retrieve the text regions extracted by this method.
    
    Parameters:
    
    fmap - The feature map to process
    
    image - The original image.
  - textRegionDetection
```
public FImage textRegionDetection(FImage image)
```
    Calculate the feature map that gives the approximate localisation of candidate text regions.
    
    Parameters:
    
    image - The image to process.
    
    Returns:
    
    The feature map
  - textRegionLocalisation
```
public Map<Rectangle,FImage> textRegionLocalisation(FImage fmap,
                                                    FImage image)
```
    Extract the regions that probably contain text (as given by the feature map)
    
    Parameters:
    
    fmap - The feature map calculated from textRegionDetection(FImage)
    
    image - The original image
    
    Returns:
    
    A map of bounding box to image for the localised text regions
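
    The removal of blobs that are too small or inappropriately shaped can be sketched as a simple filter over candidate bounding boxes. The thresholds, the class name RegionFilterSketch, and the {x, y, width, height} layout below are hypothetical; the criteria actually used by the library are not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionFilterSketch {
    // Hypothetical thresholds chosen only for illustration.
    static final int MIN_AREA = 100;        // in pixels
    static final double MIN_ASPECT = 1.0;   // width / height

    /** Keeps boxes {x, y, width, height} that are large enough and at least as wide as tall. */
    static List<int[]> filter(List<int[]> boxes) {
        List<int[]> kept = new ArrayList<>();
        for (int[] b : boxes) {
            int area = b[2] * b[3];
            // Guard against zero-height boxes before dividing.
            double aspect = b[3] == 0 ? 0.0 : (double) b[2] / b[3];
            if (area >= MIN_AREA && aspect >= MIN_ASPECT) kept.add(b);
        }
        return kept;
    }
}
```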
  - calculateHomography
```
public List<IndependentPair<Point2d,Point2d>> calculateHomography(Polygon p)
```
    Calculates the point pairing for a given distorted polygon into orthogonal space.
    
    Parameters:
    
    p - The polygon with 4 points
    
    Returns:
    
    A list of point pairs
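
    One plausible reading of "point pairing into orthogonal space" is that each vertex of the distorted quadrilateral is matched with a corner of an axis-aligned rectangle. The sketch below assumes the target rectangle is the polygon's own bounding box and that the vertices are ordered top-left, top-right, bottom-right, bottom-left. It uses plain float arrays instead of the library's IndependentPair and Point2d types, and the class name PointPairingSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PointPairingSketch {
    /**
     * Pairs each vertex of a 4-point polygon with a corner of the polygon's
     * axis-aligned bounding box. Each returned pair is {srcX, srcY, dstX, dstY}.
     */
    static List<float[]> pair(float[][] quad) {
        if (quad.length != 4) throw new IllegalArgumentException("expected 4 points");
        float minX = Float.MAX_VALUE, minY = Float.MAX_VALUE;
        float maxX = -Float.MAX_VALUE, maxY = -Float.MAX_VALUE;
        for (float[] p : quad) {
            minX = Math.min(minX, p[0]); maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]); maxY = Math.max(maxY, p[1]);
        }
        // Corners in the same assumed order as the input vertices.
        float[][] corners = {{minX, minY}, {maxX, minY}, {maxX, maxY}, {minX, maxY}};
        List<float[]> pairs = new ArrayList<>();
        for (int i = 0; i < 4; i++)
            pairs.add(new float[] {quad[i][0], quad[i][1], corners[i][0], corners[i][1]});
        return pairs;
    }
}
```

    Four such correspondences are the minimum needed to estimate a homography that maps the distorted region onto the rectangle.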
  - getBoundingBoxPaddingPc
```
public float getBoundingBoxPaddingPc()
```
    Get the expansion value of the bounding boxes that are generated for the text regions.
    
    Returns:
    
    the current bounding box expansion multiplier.
  - setBoundingBoxPaddingPc
```
public void setBoundingBoxPaddingPc(float boundingBoxPaddingPc)
```
    Set the expansion value for the subimage extraction.
    
    Parameters:
    
    boundingBoxPaddingPc - the new multiplier
  - getTextRegions
```
public Map<Rectangle,FImage> getTextRegions()
```
    Returns a map of bounding box to extracted subimage.
    
    Specified by:
    
    getTextRegions in class TextExtractor<FImage>
    
    Returns:
    
    A map of image bounding box to subimage.
