public class PDFPage
extends org.apache.pdfbox.util.PDFTextStripper
When org.apache.pdfbox.util.PDFStreamEngine#processStream(org.apache.pdfbox.pdmodel.PDPage, org.apache.pdfbox.pdmodel.PDResources, org.apache.pdfbox.cos.COSStream) processes a stream, the implemented method showCharacter(TextPosition) is called.
See also: PDFTextStripper
Modifier and Type | Class and Description |
---|---|
class | PDFPage.MyInvoke |
Modifier and Type | Field and Description |
---|---|
protected float | effectivePageHeight - The effective page height. |
protected float | max_character_ypos - The maximum (lowest) y position of a character. |
protected float | max_image_ypos - The maximum (lowest) y position of an image. |
Constructor and Description |
---|
PDFPage(float effectivePageHeight, boolean legacy32) - Constructor. |
Modifier and Type | Method and Description |
---|---|
static float | findMaxX(Pos[] coordinates) |
static float | findMaxY(Pos[] coordinates) |
static float | findMinX(Pos[] coordinates) |
static float | findMinY(Pos[] coordinates) |
java.awt.geom.GeneralPath | getCurrentPath() - Returns the path currently being constructed. |
float | getMaxPageLength() - Returns the calculated page length. |
protected void | processOperator(org.apache.pdfbox.util.PDFOperator operator, java.util.List arguments) |
protected void | processTextPosition(org.apache.pdfbox.util.TextPosition text) |
void | registerPathBounds(java.awt.Rectangle bounds) - Registers a rectangle that bounds the path currently being drawn. |
void | setCurrentPath(java.awt.geom.GeneralPath currentPath) - Sets the current path. |
protected void | showCharacter(org.apache.pdfbox.util.TextPosition text) - A method provided as an event interface to allow a subclass to perform some specific functionality when a character needs to be displayed. |
static Pos | transtormCoordinate(Pos pos, org.apache.pdfbox.util.Matrix m) |
static Pos[] | transtormCoordinates(Pos[] coordinates, org.apache.pdfbox.util.Matrix m) |
Methods inherited from class org.apache.pdfbox.util.PDFTextStripper
endArticle, endDocument, endPage, getAddMoreFormatting, getArticleEnd, getArticleStart, getAverageCharTolerance, getCharactersByArticle, getCurrentPageNo, getDropThreshold, getEndBookmark, getEndPage, getIndentThreshold, getLineSeparator, getListItemPatterns, getOutput, getPageEnd, getPageSeparator, getPageStart, getParagraphEnd, getParagraphStart, getSeparateByBeads, getSortByPosition, getSpacingTolerance, getStartBookmark, getStartPage, getSuppressDuplicateOverlappingText, getText, getText, getWordSeparator, handleLineSeparation, inspectFontEncoding, isParagraphSeparation, matchListItemPattern, matchPattern, processPage, processPages, resetEngine, setAddMoreFormatting, setArticleEnd, setArticleStart, setAverageCharTolerance, setDropThreshold, setEndBookmark, setEndPage, setIndentThreshold, setLineSeparator, setListItemPatterns, setPageEnd, setPageSeparator, setPageStart, setParagraphEnd, setParagraphStart, setShouldSeparateByBeads, setSortByPosition, setSpacingTolerance, setStartBookmark, setStartPage, setSuppressDuplicateOverlappingText, setWordSeparator, startArticle, startArticle, startDocument, startPage, writeCharacters, writeLineSeparator, writePage, writePageEnd, writePageSeperator, writePageStart, writeParagraphEnd, writeParagraphSeparator, writeParagraphStart, writeString, writeText, writeText, writeWordSeparator
Methods inherited from class org.apache.pdfbox.util.PDFStreamEngine
getColorSpaces, getCurrentPage, getFonts, getGraphicsStack, getGraphicsState, getGraphicsStates, getResources, getTextLineMatrix, getTextMatrix, getTotalCharCnt, getValidCharCnt, getXObjects, isForceParsing, processEncodedText, processOperator, processStream, processSubStream, registerOperatorProcessor, setColorSpaces, setFonts, setForceParsing, setGraphicsStack, setGraphicsState, setGraphicsStates, setTextLineMatrix, setTextMatrix
protected float max_character_ypos
protected float max_image_ypos
protected float effectivePageHeight
public PDFPage(float effectivePageHeight, boolean legacy32) throws java.io.IOException
effectivePageHeight
- The height of the page to be evaluated. PDF elements outside
this height will not be considered.
java.io.IOException
public java.awt.geom.GeneralPath getCurrentPath()
public void setCurrentPath(java.awt.geom.GeneralPath currentPath)
currentPath
- The new current path.
public void registerPathBounds(java.awt.Rectangle bounds)
bounds
- A rectangle depicting the bounds (coordinates originating from
bottom left).
protected void processOperator(org.apache.pdfbox.util.PDFOperator operator, java.util.List arguments) throws java.io.IOException
Overrides: processOperator in class org.apache.pdfbox.util.PDFStreamEngine
java.io.IOException
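The bounds passed to registerPathBounds are described as originating from the bottom left, while the fields of this class speak of a maximum (lowest) y position. The following is a minimal sketch of how such a rectangle could be converted into a distance from the top of the page. The names PathBoundsSketch and lowestYFromTop are hypothetical, and treating bounds.y as the lower edge of the rectangle is an assumption, not something this page confirms.

```java
// Sketch only: PathBoundsSketch and lowestYFromTop are hypothetical names,
// not part of PDFPage.
class PathBoundsSketch {

    // Assumption: with a bottom-left origin, bounds.y is the rectangle's
    // lower edge, so its distance from the top of the page is pageHeight - y.
    static float lowestYFromTop(java.awt.Rectangle bounds, float pageHeight) {
        return pageHeight - bounds.y;
    }

    public static void main(String[] args) {
        java.awt.Rectangle bounds = new java.awt.Rectangle(50, 100, 200, 30);
        System.out.println(lowestYFromTop(bounds, 842f)); // prints 742.0
    }
}
```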
protected void processTextPosition(org.apache.pdfbox.util.TextPosition text)
Overrides: processTextPosition in class org.apache.pdfbox.util.PDFTextStripper
protected void showCharacter(org.apache.pdfbox.util.TextPosition text)
text
- The character to be displayed; its y position is calculated.
public float getMaxPageLength()
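The interplay between showCharacter, effectivePageHeight, and getMaxPageLength might look roughly like the sketch below. It is a self-contained stand-in, not the PDFPage implementation: the class PageLengthSketch is hypothetical, and it assumes that y positions are measured downwards from the top of the page and that the page length is simply the largest y position seen within the effective height.

```java
// Sketch only: a hypothetical stand-in that does not use PDFBox.
class PageLengthSketch {
    private final float effectivePageHeight; // positions beyond this are ignored
    private float maxCharacterYPos = 0f;     // largest y seen so far (y grows downwards)

    PageLengthSketch(float effectivePageHeight) {
        this.effectivePageHeight = effectivePageHeight;
    }

    // Called once per character with its y position from the top (assumption).
    void showCharacter(float yFromTop) {
        if (yFromTop > effectivePageHeight) {
            return; // outside the evaluated area
        }
        if (yFromTop > maxCharacterYPos) {
            maxCharacterYPos = yFromTop;
        }
    }

    float getMaxPageLength() {
        return maxCharacterYPos;
    }

    public static void main(String[] args) {
        PageLengthSketch page = new PageLengthSketch(800f);
        page.showCharacter(120.5f);
        page.showCharacter(640f);
        page.showCharacter(830f); // ignored, beyond the effective height
        System.out.println(page.getMaxPageLength()); // prints 640.0
    }
}
```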
public static Pos[] transtormCoordinates(Pos[] coordinates, org.apache.pdfbox.util.Matrix m)
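As an illustration of what transforming an array of coordinates by a matrix involves, the sketch below maps each point through a transform and returns a new array. It deliberately uses java.awt.geom.AffineTransform and java.awt.geom.Point2D.Float as stand-ins for org.apache.pdfbox.util.Matrix and Pos, whose members are not shown on this page; TransformSketch and transformAll are hypothetical names.

```java
// Sketch only: JDK types stand in for the PDFBox Matrix and the Pos class.
class TransformSketch {

    // Returns a new array; the input points are left unchanged.
    static java.awt.geom.Point2D.Float[] transformAll(
            java.awt.geom.Point2D.Float[] points,
            java.awt.geom.AffineTransform m) {
        java.awt.geom.Point2D.Float[] result =
                new java.awt.geom.Point2D.Float[points.length];
        for (int i = 0; i < points.length; i++) {
            result[i] = new java.awt.geom.Point2D.Float();
            m.transform(points[i], result[i]); // writes the mapped point into result[i]
        }
        return result;
    }

    public static void main(String[] args) {
        java.awt.geom.Point2D.Float[] corners = {
            new java.awt.geom.Point2D.Float(1f, 2f),
            new java.awt.geom.Point2D.Float(3f, 4f)
        };
        java.awt.geom.AffineTransform shift =
                java.awt.geom.AffineTransform.getTranslateInstance(10, 20);
        java.awt.geom.Point2D.Float[] moved = transformAll(corners, shift);
        System.out.println(moved[0].x + ", " + moved[0].y); // prints 11.0, 22.0
    }
}
```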
public static float findMinY(Pos[] coordinates)
public static float findMaxY(Pos[] coordinates)
public static float findMaxX(Pos[] coordinates)
public static float findMinX(Pos[] coordinates)
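The four find* helpers are undocumented here; judging by their names, they presumably reduce an array of coordinates to a bounding extreme. A minimal sketch of that idea follows. The nested Point type is hypothetical (the fields of the real Pos class are not shown on this page), and returning an infinity for an empty array is a choice made for this example only.

```java
// Sketch only: BoundsSketch and Point are hypothetical, not the PDFPage code.
class BoundsSketch {

    // Hypothetical point type standing in for Pos.
    static final class Point {
        final float x;
        final float y;
        Point(float x, float y) { this.x = x; this.y = y; }
    }

    // Smallest x among the points; POSITIVE_INFINITY if the array is empty.
    static float findMinX(Point[] pts) {
        float min = Float.POSITIVE_INFINITY;
        for (Point p : pts) {
            min = Math.min(min, p.x);
        }
        return min;
    }

    // Largest y among the points; NEGATIVE_INFINITY if the array is empty.
    static float findMaxY(Point[] pts) {
        float max = Float.NEGATIVE_INFINITY;
        for (Point p : pts) {
            max = Math.max(max, p.y);
        }
        return max;
    }

    public static void main(String[] args) {
        Point[] corners = {
            new Point(10f, 5f), new Point(30f, 5f),
            new Point(30f, 40f), new Point(10f, 40f)
        };
        System.out.println(findMinX(corners)); // prints 10.0
        System.out.println(findMaxY(corners)); // prints 40.0
    }
}
```

findMaxX and findMinY would follow the same pattern with the comparison and the field swapped.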