Common interface for every entity across the hierarchy of recognized text. An entity may contain other smaller entities, or may be an atom.
Public Method Summary
abstract Rect | getBoundingBox() | Axis-aligned bounding box containing the text.
abstract List<? extends Text> | getComponents() | Smaller components that comprise this entity, if any.
abstract Point[] | getCornerPoints() | Four corner points in clockwise direction starting with top-left.
abstract String | getLanguage() | Prevailing language in the text, if any.
abstract String | getValue() | Retrieve the recognized text as a string.
Public Methods
public abstract Rect getBoundingBox()
Axis-aligned bounding box containing the text. The bounding box may extend past the image boundary.
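Because the box may extend past the image boundary, a client that uses it to crop or index pixels may want to clip it first. The following is a minimal sketch, not part of this API: `BoxClamp` and `clampToImage` are hypothetical names, and the box is passed as plain integers (left, top, right, bottom) rather than a `Rect` so that the example runs without the Android SDK.

```java
import java.util.Arrays;

final class BoxClamp {
    /**
     * Clips a box to an image of the given size.
     * Returns {left, top, right, bottom}, each limited to the image area.
     */
    static int[] clampToImage(int left, int top, int right, int bottom,
                              int width, int height) {
        int l = Math.max(0, Math.min(left, width));
        int t = Math.max(0, Math.min(top, height));
        int r = Math.max(0, Math.min(right, width));
        int b = Math.max(0, Math.min(bottom, height));
        return new int[] {l, t, r, b};
    }

    public static void main(String[] args) {
        // A box that sticks out on the left and right of a 640x480 image.
        int[] box = clampToImage(-12, 40, 660, 95, 640, 480);
        System.out.println(Arrays.toString(box)); // prints [0, 40, 640, 95]
    }
}
```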
public abstract List<? extends Text> getComponents()
Smaller components that comprise this entity, if any. If this entity is an atom, an empty list is returned. TextBlock is at the top of the Text hierarchy. A TextBlock contains Line objects, which contain Elements. Elements are atoms. We may decide to add character-level objects in later versions.
For example, a client could draw bounding boxes for recognized text in different colors for paragraphs, lines, words, and alphabets by repeatedly traversing down the tree with this method.
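The traversal described above can be sketched as a depth-first walk. This is only an illustration under stated assumptions: the nested `Text` interface below is a hypothetical stand-in that declares just `getValue()` and `getComponents()` so the example runs without the Android SDK, and `Node`, `walk`, and `TextTreeWalk` are invented names. A real client would pass the library's own objects instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class TextTreeWalk {
    /** Hypothetical stand-in with only the two methods this sketch needs. */
    interface Text {
        String getValue();
        List<? extends Text> getComponents();
    }

    /** Sample node used to build a test tree; not part of the API. */
    static final class Node implements Text {
        private final String value;
        private final List<Node> children;

        Node(String value, Node... children) {
            this.value = value;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }

        @Override public String getValue() { return value; }
        @Override public List<? extends Text> getComponents() { return children; }
    }

    /** Records "depth:value" for each entity, visiting parents before children. */
    static void walk(Text entity, int depth, List<String> out) {
        out.add(depth + ":" + entity.getValue());
        // Atoms return an empty list, so the recursion stops on its own.
        for (Text child : entity.getComponents()) {
            walk(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Depth 0 plays the role of a block, depth 1 a line, depth 2 a word.
        Text block = new Node("Hello world",
                new Node("Hello world", new Node("Hello"), new Node("world")));
        List<String> visited = new ArrayList<>();
        walk(block, 0, visited);
        for (String entry : visited) {
            System.out.println(entry);
        }
    }
}
```

The depth passed to `walk` is what a drawing client could map to a color, for example one color per level of the hierarchy.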
public abstract Point[] getCornerPoints()
Four corner points in clockwise direction starting with top-left. Because of possible perspective distortion, the region is not necessarily a rectangle. Parts of the region may lie outside the image.
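To illustrate how a four-corner region relates to an axis-aligned box, the sketch below computes the smallest axis-aligned box that encloses four points. It is an assumption-laden example: corners are passed as plain `{x, y}` integer pairs rather than `Point` objects, `CornerBounds` and `boundsOf` are hypothetical names, and nothing here claims that `getBoundingBox()` is computed this way.

```java
import java.util.Arrays;

final class CornerBounds {
    /**
     * corners: {x, y} pairs, e.g. four points clockwise from top-left.
     * Returns {left, top, right, bottom} of the enclosing axis-aligned box.
     */
    static int[] boundsOf(int[][] corners) {
        int left = Integer.MAX_VALUE;
        int top = Integer.MAX_VALUE;
        int right = Integer.MIN_VALUE;
        int bottom = Integer.MIN_VALUE;
        for (int[] p : corners) {
            left = Math.min(left, p[0]);
            right = Math.max(right, p[0]);
            top = Math.min(top, p[1]);
            bottom = Math.max(bottom, p[1]);
        }
        return new int[] {left, top, right, bottom};
    }

    public static void main(String[] args) {
        // A slightly skewed quadrilateral in image coordinates (y grows downward).
        int[][] corners = {{10, 20}, {110, 30}, {105, 70}, {5, 60}};
        System.out.println(Arrays.toString(boundsOf(corners))); // prints [5, 20, 110, 70]
    }
}
```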
public abstract String getLanguage()
Prevailing language in the text, if any. The language is given as a BCP 47 tag (e.g. "en" or "sr-Latn-BA"), or "und" if the language could not be determined.
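A client that wants a `java.util.Locale` from the returned tag could convert it as sketched below. The helper `toLocale` and the class `LanguageTags` are hypothetical; the only library call used is the standard `Locale.forLanguageTag`. The sketch treats "und" (and a null or empty string, as a defensive assumption) as "no language".

```java
import java.util.Locale;
import java.util.Optional;

final class LanguageTags {
    /** Converts a BCP 47 tag to a Locale, or empty when the language is undetermined. */
    static Optional<Locale> toLocale(String tag) {
        if (tag == null || tag.isEmpty() || "und".equalsIgnoreCase(tag)) {
            return Optional.empty();
        }
        return Optional.of(Locale.forLanguageTag(tag));
    }

    public static void main(String[] args) {
        Optional<Locale> locale = toLocale("sr-Latn-BA");
        // Language, script, and region subtags are available separately.
        locale.ifPresent(l -> System.out.println(
                l.getLanguage() + " " + l.getScript() + " " + l.getCountry())); // prints sr Latn BA
        System.out.println(toLocale("und").isPresent()); // prints false
    }
}
```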
public abstract String getValue()
Retrieve the recognized text as a string. The text is returned in reading order for the language. For Latin script, this is top to bottom within a TextBlock and left to right within a Line.