Text

public interface Text

Common interface for every entity across the hierarchy of recognized text. An entity may contain other smaller entities, or may be an atom.

Public Method Summary

abstract Rect getBoundingBox()
    Axis-aligned bounding box containing the text.
abstract List<? extends Text> getComponents()
    Smaller components that comprise this entity, if any.
abstract Point[] getCornerPoints()
    Four corner points in clockwise direction starting with top-left.
abstract String getLanguage()
    Prevailing language in the text, if any.
abstract String getValue()
    Retrieve the recognized text as a string.

Public Methods

public abstract Rect getBoundingBox()

Axis-aligned bounding box containing the text. The bounding box may extend past the image boundary.
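Because the box may extend past the image boundary, a client that crops or draws with it may want to clamp it first. The following is a minimal sketch under stated assumptions: the box is represented as a plain `{left, top, right, bottom}` array rather than a `Rect`, and `clampToImage` is a hypothetical helper, not part of this API.

```java
import java.util.Arrays;

public final class BoundingBoxClamp {
    /**
     * Clamps a box given as {left, top, right, bottom} to an image of the
     * given size. Returns a new array; the input is not modified.
     * (Hypothetical helper for illustration only.)
     */
    public static int[] clampToImage(int[] box, int imageWidth, int imageHeight) {
        int left = Math.max(0, Math.min(box[0], imageWidth));
        int top = Math.max(0, Math.min(box[1], imageHeight));
        // Never let right/bottom fall below left/top, so the result stays valid.
        int right = Math.max(left, Math.min(box[2], imageWidth));
        int bottom = Math.max(top, Math.min(box[3], imageHeight));
        return new int[] {left, top, right, bottom};
    }

    public static void main(String[] args) {
        // A box that sticks out on the left and right of a 640x480 image.
        int[] clamped = clampToImage(new int[] {-12, 40, 660, 90}, 640, 480);
        System.out.println(Arrays.toString(clamped)); // prints [0, 40, 640, 90]
    }
}
```

In an Android client the same arithmetic would be applied to the fields of the returned `Rect`.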

public abstract List<? extends Text> getComponents()

Smaller components that comprise this entity, if any. If this entity is an atom, an empty list is returned. TextBlock is at the top of the Text hierarchy. A TextBlock contains Line objects, which contain Elements. Elements are atoms. We may decide to add character-level objects in later versions.

For example, a client could draw bounding boxes for recognized text in different colors for paragraphs, lines, words, and letters by repeatedly traversing down the tree with this method.
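The traversal described above can be sketched as a simple recursive walk. To keep the example runnable outside Android, it uses a hypothetical `Node` interface that mirrors only `getValue()` and `getComponents()`, plus a hypothetical `SimpleNode` class to build sample data; neither is part of this API. The depth passed down the recursion is what a client might map to a color.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public final class TextTreeWalk {
    /** Hypothetical stand-in for the two Text methods used in this sketch. */
    public interface Node {
        String getValue();
        List<? extends Node> getComponents();
    }

    /** Hypothetical immutable node, used only to build sample data. */
    public static final class SimpleNode implements Node {
        private final String value;
        private final List<Node> components;

        public SimpleNode(String value, Node... components) {
            this.value = value;
            // An atom is created with no components, giving an empty list.
            this.components = Collections.unmodifiableList(Arrays.asList(components));
        }

        @Override public String getValue() { return value; }
        @Override public List<Node> getComponents() { return components; }
    }

    /** Appends "depth:value" for the node and all descendants, in pre-order. */
    public static void collect(Node node, int depth, List<String> out) {
        out.add(depth + ":" + node.getValue());
        for (Node child : node.getComponents()) {
            collect(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Depth 0 plays the role of a block, depth 1 a line, depth 2 an element.
        Node line1 = new SimpleNode("Hello world", new SimpleNode("Hello"), new SimpleNode("world"));
        Node line2 = new SimpleNode("Bye", new SimpleNode("Bye"));
        Node block = new SimpleNode("Hello world Bye", line1, line2);

        List<String> visited = new ArrayList<>();
        collect(block, 0, visited);
        System.out.println(visited);
    }
}
```

A drawing client would replace the `out.add(...)` call with a draw call that picks a color from `depth` and a shape from the entity's bounding box or corner points.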

public abstract Point[] getCornerPoints()

Four corner points in clockwise direction starting with top-left. Due to the possible perspective distortions, this is not necessarily a rectangle. Parts of the region could be outside of the image.
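To illustrate how a possibly skewed quadrilateral relates to an axis-aligned box, the sketch below computes the smallest axis-aligned box enclosing four corner points and checks whether the points already form an axis-aligned rectangle. It assumes points are plain `{x, y}` arrays rather than `Point` objects, and both helpers are hypothetical. It does not claim that the library derives `getBoundingBox()` this way.

```java
import java.util.Arrays;

public final class CornerPoints {
    /** Returns {left, top, right, bottom} enclosing all points given as {x, y}. */
    public static int[] enclosingBox(int[][] points) {
        int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE;
        int right = Integer.MIN_VALUE, bottom = Integer.MIN_VALUE;
        for (int[] p : points) {
            left = Math.min(left, p[0]);
            top = Math.min(top, p[1]);
            right = Math.max(right, p[0]);
            bottom = Math.max(bottom, p[1]);
        }
        return new int[] {left, top, right, bottom};
    }

    /**
     * True if four points, ordered clockwise from top-left, form an
     * axis-aligned rectangle (no rotation or perspective skew).
     */
    public static boolean isAxisAlignedRectangle(int[][] p) {
        return p.length == 4
                && p[0][1] == p[1][1]   // top edge is horizontal
                && p[1][0] == p[2][0]   // right edge is vertical
                && p[2][1] == p[3][1]   // bottom edge is horizontal
                && p[3][0] == p[0][0];  // left edge is vertical
    }

    public static void main(String[] args) {
        // Sample skewed quadrilateral: top-left, top-right, bottom-right, bottom-left.
        int[][] skewed = {{10, 20}, {110, 28}, {106, 60}, {6, 52}};
        System.out.println(Arrays.toString(enclosingBox(skewed))); // prints [6, 20, 110, 60]
        System.out.println(isAxisAlignedRectangle(skewed));        // prints false
    }
}
```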

public abstract String getLanguage()

Prevailing language in the text, if any. The value is a BCP 47 language tag (e.g. "en" or "sr-Latn-BA"), or "und" if the language could not be determined.
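A client will typically want to treat "und" separately before interpreting the tag. The sketch below is one way to do that with the standard `java.util.Locale.forLanguageTag` method; the `toLocale` helper and its choice of returning `null` for an undetermined language are assumptions made for illustration.

```java
import java.util.Locale;

public final class LanguageTags {
    /**
     * Converts a BCP 47 tag to a Locale. Returns null when the tag is
     * missing or "und", i.e. the language could not be determined.
     * (Hypothetical helper; the null convention is this sketch's choice.)
     */
    public static Locale toLocale(String tag) {
        if (tag == null || tag.isEmpty() || "und".equalsIgnoreCase(tag)) {
            return null;
        }
        return Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        Locale serbian = toLocale("sr-Latn-BA");
        // Language, script and region subtags are available separately.
        System.out.println(serbian.getLanguage() + " " + serbian.getScript() + " " + serbian.getCountry());
        System.out.println(toLocale("und")); // prints null
    }
}
```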

public abstract String getValue()

Retrieve the recognized text as a string. The text is returned in reading order for the language. For Latin script, this is top-to-bottom within a TextBlock and left-to-right within a Line.