Package org.openpdf.text.pdf.parser
Class PdfContentStreamHandler
java.lang.Object
org.openpdf.text.pdf.parser.PdfContentStreamHandler
- Direct Known Subclasses:
PdfContentTextExtractor, PdfContentTextLocator
-
Nested Class Summary
Nested Classes
private static class: A content operator implementation (BMC).
private static class: A content operator implementation (BDC).
(package private) static class: A content operator implementation (BT).
protected class: A content operator implementation (Do) for handling XObject forms.
private static class: A content operator implementation (EMC).
(package private) static class: A content operator implementation (ET).
(package private) static class: A content operator implementation (cm).
(package private) static class: A content operator implementation (').
(package private) static class: A content operator implementation (").
(package private) static class: A content operator implementation (Q).
(package private) static class: A content operator implementation (gs).
(package private) static class: A content operator implementation (q).
(package private) static class: A content operator implementation (Tc).
(package private) static class: A content operator implementation (Tf).
(package private) static class: A content operator implementation (Tz).
(package private) static class: A content operator implementation (TL).
(package private) static class: A content operator implementation (Tr).
(package private) static class: A content operator implementation (Ts).
(package private) static class: A content operator implementation (Tw).
(package private) static class: A content operator implementation (Tj).
(package private) static class: A content operator implementation (TJ).
(package private) static class: A content operator implementation (T*).
(package private) static class: A content operator implementation (Td).
(package private) static class: A content operator implementation (TD).
(package private) static class: A content operator implementation (Tm).
-
Field Summary
Fields
protected Stack<GraphicsState> gsStack: Stack keeping track of the graphics state.
protected Map<String, ContentOperator> operators: A map with all supported operators (PDF syntax).
protected final TextAssembler renderListener: Detail parser for text within a marked section.
protected List<TextAssemblyBuffer> textFragments
protected final Stack<List<TextAssemblyBuffer>> textFragmentStreams
protected Matrix textLineMatrix: Text line matrix.
protected Matrix textMatrix: Text matrix.
-
Constructor Summary
Constructors
-
Method Summary
(package private) void applyTextAdjust(float tj): Adjusts the text matrix for the specified adjustment value (see the TJ operator in the PDF specification for details).
(package private) abstract void displayPdfString(PdfString string): Displays text.
protected byte[] getContentBytesFromPdfObject: Gets the content bytes from a PdfObject, which may be a reference, a stream or an array.
(package private) static byte[] getContentBytesFromPdfObjectStatic: Gets the content bytes from a PdfObject, which may be a reference, a stream or an array.
protected Matrix getCurrentTextLineMatrix: Returns the current line matrix.
protected Matrix getCurrentTextMatrix: Returns the current text matrix.
private static Matrix getMatrix
abstract String getResultantText
(package private) GraphicsState graphicsState(): Returns the current graphics state.
protected void installDefaultOperators(): Loads all the supported graphics and text state operators in a map.
void invokeOperator(PdfLiteral operator, List<PdfObject> operands, PdfDictionary resources): Invokes an operator.
protected Optional<ContentOperator> lookupOperator(String operatorName): Gets the operator that processes a command with the given name.
(package private) abstract void popContext()
protected void processContent(byte[] contentBytes, PdfDictionary resources): Processes PDF content stream bytes.
(package private) abstract void pushContext(String newContextName)
void registerContentOperator(ContentOperator operator): Registers a content operator that will be called when the specified operator string is encountered during content processing.
abstract void reset()
-
Field Details
-
textFragmentStreams
-
contextNames
-
renderListener
Detail parser for text within a marked section. Used by TextAssembler.
-
operators
A map with all supported operators (PDF syntax). Protected to allow subclasses to override installDefaultOperators() and register additional operators.
-
gsStack
Stack keeping track of the graphics state.
-
textMatrix
Text matrix.
-
textLineMatrix
Text line matrix.
-
textFragments
-
-
Constructor Details
-
PdfContentStreamHandler
-
-
Method Details
-
getMatrix
-
registerContentOperator
Registers a content operator that will be called when the specified operator string is encountered during content processing. Each operator string may be registered only once; it is not legal to register multiple operators with the same operator string.
- Parameters:
- operator - the operator that will receive notification when the operator string is encountered
- Since:
- 2.1.7
-
installDefaultOperators
protected void installDefaultOperators()
Loads all the supported graphics and text state operators in a map. Subclasses can override this method to register additional operators. When overriding, subclasses should call super.installDefaultOperators() first.
-
lookupOperator
Gets the operator that processes a command with the given name.
- Parameters:
- operatorName - name of the operator that we might need to call
- Returns:
- an Optional containing the operator, or an empty Optional if none is registered
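The way registerContentOperator, installDefaultOperators and lookupOperator fit together can be sketched as follows. This is a minimal, self-contained illustration and not OpenPDF code: OperatorSketch, HandlerSketch and LoggingHandlerSketch are hypothetical stand-ins, an operator is reduced to a name plus a no-argument action, installing the defaults from the constructor is an assumption, and rejecting a duplicate name with an exception is an assumption based on the "registered only once" rule.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a content operator: a name plus an action.
interface OperatorSketch {
    String name();
    void run();
}

// Hypothetical stand-in for the handler; it is not the OpenPDF class.
class HandlerSketch {
    protected final Map<String, OperatorSketch> operators = new LinkedHashMap<>();
    final StringBuilder log = new StringBuilder();

    HandlerSketch() {
        installDefaultOperators(); // assumption: defaults are installed at construction
    }

    protected void installDefaultOperators() {
        registerContentOperator(named("BT", () -> log.append("begin-text ")));
        registerContentOperator(named("ET", () -> log.append("end-text ")));
    }

    public void registerContentOperator(OperatorSketch operator) {
        // Assumption: a second registration under the same name is rejected.
        if (operators.putIfAbsent(operator.name(), operator) != null) {
            throw new IllegalArgumentException(
                    "Operator " + operator.name() + " is already registered");
        }
    }

    protected Optional<OperatorSketch> lookupOperator(String operatorName) {
        // An empty Optional signals that nothing is registered under this name.
        return Optional.ofNullable(operators.get(operatorName));
    }

    static OperatorSketch named(String name, Runnable action) {
        return new OperatorSketch() {
            @Override public String name() { return name; }
            @Override public void run() { action.run(); }
        };
    }
}

// A subclass adds one operator after the defaults are in place.
class LoggingHandlerSketch extends HandlerSketch {
    @Override
    protected void installDefaultOperators() {
        super.installDefaultOperators(); // keep the defaults first
        registerContentOperator(named("Do", () -> log.append("xobject ")));
    }
}
```

In this sketch a caller would write handler.lookupOperator("Do").ifPresent(OperatorSketch::run), so an unregistered name needs no null check.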
-
invokeOperator
Invokes an operator.
- Parameters:
- operator - the PDF syntax of the operator
- operands - a list with operands
- resources - PDF resources found in the file containing the stream
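Dispatch through invokeOperator can be pictured as a lookup by operator name followed by a call that receives the collected operands. The sketch below is self-contained and uses hypothetical names (InvokeSketch, OperatorFn). Operands are modelled as plain strings instead of PdfObject, the resources dictionary is left out, and silently skipping an unregistered operator is an assumption, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InvokeSketch {
    // Hypothetical operator callback: receives the operands that preceded it.
    interface OperatorFn {
        void invoke(List<String> operands);
    }

    private final Map<String, OperatorFn> operators = new HashMap<>();
    final List<String> shown = new ArrayList<>();
    float leading = 0f;

    InvokeSketch() {
        // TL sets the text leading from its single numeric operand.
        operators.put("TL", ops -> leading = Float.parseFloat(ops.get(0)));
        // Tj shows its single string operand; here it is only recorded.
        operators.put("Tj", ops -> shown.add(ops.get(0)));
    }

    void invokeOperator(String operator, List<String> operands) {
        OperatorFn fn = operators.get(operator);
        if (fn != null) { // assumption: unknown operators are skipped
            fn.invoke(operands);
        }
    }
}
```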
-
popContext
abstract void popContext()
-
pushContext
-
graphicsState
GraphicsState graphicsState()
Returns the current graphics state.
- Returns:
- the graphics state
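The current graphics state is the top of gsStack. In PDF, the q operator saves a copy of the current state and Q restores the previously saved one, which a stack models directly. The following is a minimal sketch under stated assumptions: GraphicsStateStackSketch and its reduced State type are hypothetical, and only two text parameters are tracked.

```java
import java.util.Stack;

class GraphicsStateStackSketch {
    // Hypothetical, reduced graphics state: only two text parameters.
    static final class State {
        float fontSize = 1f;
        float characterSpacing = 0f;

        State copy() {
            State s = new State();
            s.fontSize = fontSize;
            s.characterSpacing = characterSpacing;
            return s;
        }
    }

    private final Stack<State> gsStack = new Stack<>();

    GraphicsStateStackSketch() {
        gsStack.push(new State()); // initial state
    }

    // The current graphics state is whatever sits on top of the stack.
    State graphicsState() {
        return gsStack.peek();
    }

    // q: push a copy so later changes do not touch the saved state.
    void save() {
        gsStack.push(graphicsState().copy());
    }

    // Q: drop the current state and return to the saved one.
    void restore() {
        gsStack.pop();
    }
}
```

Pushing a copy instead of the same object is the point of the design: changes made between q and Q disappear when the state is popped.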
-
reset
public abstract void reset()
-
getCurrentTextMatrix
Returns the current text matrix.
- Returns:
- the text matrix
- Since:
- 2.1.5
-
getCurrentTextLineMatrix
Returns the current line matrix.
- Returns:
- the line matrix
- Since:
- 2.1.5
-
applyTextAdjust
void applyTextAdjust(float tj)
Adjusts the text matrix for the specified adjustment value (see the TJ operator in the PDF specification for details).
- Parameters:
- tj - the text adjustment
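For orientation, the PDF specification expresses a TJ adjustment in thousandths of a text space unit and subtracts it from the current horizontal position, scaled by font size and horizontal scaling. The sketch below shows that arithmetic on a six-element matrix {a, b, c, d, e, f}. It is an illustration and not the handler's implementation: TextAdjustSketch is a hypothetical name, font size and horizontal scaling are passed in explicitly instead of being read from the graphics state, horizontal writing mode is assumed, and scaling is a fraction (1.0 means 100%).

```java
class TextAdjustSketch {
    /**
     * Returns a copy of the text matrix {a, b, c, d, e, f} shifted by a TJ adjustment.
     */
    static float[] applyTextAdjust(float[] tm, float tj, float fontSize, float horizontalScaling) {
        // Thousandths of a text space unit, subtracted from the current position.
        float tx = -tj / 1000f * fontSize * horizontalScaling;
        float[] out = tm.clone();
        // Pre-multiplying by a translation (tx, 0) changes only e and f.
        out[4] = tx * tm[0] + tm[4];
        out[5] = tx * tm[1] + tm[5];
        return out;
    }
}
```

With an identity matrix, a font size of 12 and 100% scaling, an adjustment of -250 moves the text origin 3 units to the right; a positive adjustment moves it left.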
-
getCurrentFont
- Returns:
- current font in processing state
-
displayPdfString
Displays text.
- Parameters:
- string - the text to display
-
getResultantText
- Returns:
- result text
-
processContent
Processes PDF content stream bytes.
- Parameters:
- contentBytes - the bytes of a content stream
- resources - the resources that come with the content stream
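Content streams use postfix notation: operands come first and the operator that consumes them follows. Processing therefore amounts to collecting operands until an operator token appears, handing both over, and starting again. The sketch below illustrates only that loop. ProcessContentSketch is a hypothetical name, and splitting on whitespace is a deliberate simplification that would not survive string literals, arrays or dictionaries; a real parser tokenizes according to PDF syntax.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ProcessContentSketch {
    /** Returns one entry per operator, formatted as "operator [operands]". */
    static List<String> process(byte[] contentBytes) {
        String text = new String(contentBytes, StandardCharsets.ISO_8859_1);
        List<String> calls = new ArrayList<>();
        List<String> operands = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            if (isOperand(token)) {
                operands.add(token);
            } else {
                // An operator consumes every operand collected so far.
                calls.add(token + " " + operands);
                operands.clear();
            }
        }
        return calls;
    }

    // Simplification: only names (/F1) and numbers count as operands here.
    static boolean isOperand(String token) {
        return token.startsWith("/") || token.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)");
    }
}
```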
-
getContentBytesFromPdfObject
Gets the content bytes from a PdfObject, which may be a reference, a stream or an array. This is a utility method that can be used by subclasses and other classes in this package.
- Parameters:
- object - the object to read bytes from
- Returns:
- the content bytes
- Throws:
- IOException - if there's an error reading the content
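The three cases named above (reference, stream, array) suggest a recursive resolution: follow a reference to its target, return a stream's bytes, and concatenate the bytes of every array element. The sketch below shows that shape with a hypothetical object model (Obj, RefObj, StreamObj, ArrayObj); these are not the OpenPDF types. Writing a space after each array element and throwing IOException for any other object are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class ContentBytesSketch {
    // Hypothetical object model; these are not the OpenPDF types.
    interface Obj {}

    static final class RefObj implements Obj {
        final Obj target;
        RefObj(Obj target) { this.target = target; }
    }

    static final class StreamObj implements Obj {
        final byte[] bytes;
        StreamObj(byte[] bytes) { this.bytes = bytes; }
    }

    static final class ArrayObj implements Obj {
        final List<Obj> items;
        ArrayObj(Obj... items) { this.items = Arrays.asList(items); }
    }

    static byte[] contentBytes(Obj object) throws IOException {
        if (object instanceof RefObj) {
            // Follow the reference and resolve whatever it points to.
            return contentBytes(((RefObj) object).target);
        }
        if (object instanceof StreamObj) {
            return ((StreamObj) object).bytes;
        }
        if (object instanceof ArrayObj) {
            ByteArrayOutputStream all = new ByteArrayOutputStream();
            for (Obj item : ((ArrayObj) object).items) {
                all.write(contentBytes(item));
                all.write(' '); // assumption: whitespace keeps adjacent tokens apart
            }
            return all.toByteArray();
        }
        throw new IOException("Unsupported content object: " + object);
    }
}
```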
-
getContentBytesFromPdfObjectStatic
Gets the content bytes from a PdfObject, which may be a reference, a stream or an array. This is a static utility method that can be used by any class in this package.
- Parameters:
- object - the object to read bytes from
- Returns:
- the content bytes
- Throws:
- IOException - if there's an error reading the content
-