public abstract class BaseParser
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
protected static int | A |
protected static byte | ASCII_CR - ASCII code for carriage return. |
protected static byte | ASCII_LF - ASCII code for line feed. |
private static byte | ASCII_NINE |
private static byte | ASCII_SPACE |
private static byte | ASCII_ZERO |
protected static int | B |
protected static int | D |
static java.lang.String | DEF - This is a string constant that will be used for comparisons. |
protected COSDocument | document - This is the document that will be parsed. |
protected static int | E |
protected static java.lang.String | ENDOBJ_STRING - This is a string constant that will be used for comparisons. |
protected static java.lang.String | ENDSTREAM_STRING - This is a string constant that will be used for comparisons. |
private static java.lang.String | FALSE - This is a string constant that will be used for comparisons. |
private static long | GENERATION_NUMBER_THRESHOLD |
protected static int | J |
private static org.apache.commons.logging.Log | LOG - Log instance. |
protected static int | M |
(package private) static int | MAX_LENGTH_LONG |
protected static int | N |
private static java.lang.String | NULL - This is a string constant that will be used for comparisons. |
protected static int | O |
private static long | OBJECT_NUMBER_THRESHOLD |
protected static int | R |
protected static int | S |
(package private) SequentialSource | seqSource - This is the stream that will be read from. |
protected static java.lang.String | STREAM_STRING - This is a string constant that will be used for comparisons. |
protected static int | T |
private static java.lang.String | TRUE - This is a string constant that will be used for comparisons. |
private java.nio.charset.CharsetDecoder | utf8Decoder |

Constructor and Description |
---|
BaseParser(SequentialSource pdfSource) - Default constructor. |

Modifier and Type | Method and Description |
---|---|
private int | checkForEndOfString(int bracesParameter) - Works around a bug in the document creator's code that caused a crash in PDFBox. The first occurrence had the format /Title ( (5) /Creator and was patched in one place. |
private COSBase | getObjectFromPool(COSObjectKey key) |
protected boolean | isClosing() - This will tell if the next character is a closing brace (close of PDF array). |
protected boolean | isClosing(int c) - This will tell if the given character is a closing brace (close of PDF array). |
private boolean | isCR(int c) |
protected boolean | isDigit() - This will tell if the next byte is a digit or not. |
protected static boolean | isDigit(int c) - This will tell if the given value is a digit or not. |
protected boolean | isEndOfName(int ch) - Determine if a character terminates a PDF name. |
protected boolean | isEOL() - This will tell if the next byte to be read is an end of line byte. |
protected boolean | isEOL(int c) - This will tell if the given byte is an end of line byte. |
private static boolean | isHexDigit(char ch) |
private boolean | isLF(int c) |
protected boolean | isSpace() - This will tell if the next byte is a space or not. |
protected boolean | isSpace(int c) - This will tell if the given value is a space or not. |
private boolean | isValidUTF8(byte[] input) - Returns true if a byte sequence is valid UTF-8. |
protected boolean | isWhitespace() - This will tell if the next byte is whitespace or not. |
protected boolean | isWhitespace(int c) - This will tell if a character is whitespace or not. |
protected COSBoolean | parseBoolean() - This will parse a boolean object from the stream. |
protected COSArray | parseCOSArray() - This will parse a PDF array object. |
protected COSDictionary | parseCOSDictionary() - This will parse a PDF dictionary. |
private void | parseCOSDictionaryNameValuePair(COSDictionary obj) |
private COSBase | parseCOSDictionaryValue() - This will parse a PDF dictionary value. |
private COSString | parseCOSHexString() - This will parse a PDF hex string with fail-fast semantics, meaning that parsing stops as soon as a disallowed character is found. |
protected COSName | parseCOSName() - This will parse a PDF name from the stream. |
protected COSString | parseCOSString() - This will parse a PDF string. |
protected COSBase | parseDirObject() - This will parse a direct object from the stream. |
protected void | readExpectedChar(char ec) - Read one char and throw an exception if it is not the expected value. |
protected void | readExpectedString(char[] expectedString, boolean skipSpaces) - Reads given pattern from seqSource. |
protected void | readExpectedString(java.lang.String expectedString) - Read one String and throw an exception if it is not the expected value. |
protected int | readGenerationNumber() - This will read an integer from the stream and throw an IllegalArgumentException if the integer value has more than the maximum object revision (i.e. bigger than GENERATION_NUMBER_THRESHOLD). |
protected int | readInt() - This will read an integer from the stream. |
protected java.lang.String | readLine() - This will read bytes until the first end of line marker occurs. |
protected long | readLong() - This will read a long from the stream. |
protected long | readObjectNumber() - This will read a long from the stream and throw an IOException if the long value is negative or has more than 10 digits (i.e. bigger than OBJECT_NUMBER_THRESHOLD). |
protected java.lang.String | readString() - This will read the next string from the stream. |
protected java.lang.String | readString(int length) - This will read the next string from the stream up to a certain length. |
protected java.lang.StringBuilder | readStringNumber() - This method is used by readInt() and readLong() to read a token. |
private boolean | readUntilEndOfCOSDictionary() - Keep reading until the end of the dictionary object or the file has been hit, or until a '/' has been found. |
protected void | skipSpaces() - This will skip all spaces and comments that are present. |
protected void | skipWhiteSpaces() |
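
The skipSpaces() entry above says that it skips "all spaces and comments". As a rough illustration of what that can mean for PDF content, here is a minimal stand-alone sketch. It is not the PDFBox implementation: the class name `SkipSpacesSketch`, the helper `firstTokenByte`, and the use of `java.io.PushbackInputStream` in place of `SequentialSource` are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; not the PDFBox implementation.
class SkipSpacesSketch {

    // PDF whitespace bytes: NUL, TAB, FF, LF, CR and SPACE.
    static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 12 || c == 10 || c == 13 || c == 32;
    }

    // Consumes whitespace and '%' comments, leaving the stream at the next token.
    static void skipSpaces(PushbackInputStream in) throws IOException {
        int c = in.read();
        while (isWhitespace(c) || c == '%') {
            if (c == '%') {
                // A comment runs up to the next end-of-line byte (or end of input).
                c = in.read();
                while (c != -1 && c != 10 && c != 13) {
                    c = in.read();
                }
            } else {
                c = in.read();
            }
        }
        if (c != -1) {
            in.unread(c); // put back the first byte of the next token
        }
    }

    // Example helper (assumed name): returns the first byte after skipping, or -1.
    static int firstTokenByte(String text) {
        try {
            PushbackInputStream in = new PushbackInputStream(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
            skipSpaces(in);
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this sketch, `SkipSpacesSketch.firstTokenByte("  % note\n /Name")` skips the leading blanks and the comment and returns the `/` that starts the name token.
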
private static final long OBJECT_NUMBER_THRESHOLD
private static final long GENERATION_NUMBER_THRESHOLD
static final int MAX_LENGTH_LONG
private final java.nio.charset.CharsetDecoder utf8Decoder
private static final org.apache.commons.logging.Log LOG
protected static final int E
protected static final int N
protected static final int D
protected static final int S
protected static final int T
protected static final int R
protected static final int A
protected static final int M
protected static final int O
protected static final int B
protected static final int J
public static final java.lang.String DEF
protected static final java.lang.String ENDOBJ_STRING
protected static final java.lang.String ENDSTREAM_STRING
protected static final java.lang.String STREAM_STRING
private static final java.lang.String TRUE
private static final java.lang.String FALSE
private static final java.lang.String NULL
protected static final byte ASCII_LF
protected static final byte ASCII_CR
private static final byte ASCII_ZERO
private static final byte ASCII_NINE
private static final byte ASCII_SPACE
final SequentialSource seqSource
protected COSDocument document
BaseParser(SequentialSource pdfSource)
private static boolean isHexDigit(char ch)
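
To make the hex-related members more concrete, the following sketch shows one way a hex-digit test and a fail-fast hex-string decoder could look. It is only an illustration under stated assumptions: the class `HexStringSketch` and its `decode` method are hypothetical, the input is the text between `<` and `>`, and "stopping" on a disallowed character is modelled here as throwing an `IllegalArgumentException`, which may differ from what parseCOSHexString() actually does.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-alone sketch; names and behavior are assumptions,
// not the PDFBox implementation.
class HexStringSketch {

    static boolean isHexDigit(char ch) {
        return (ch >= '0' && ch <= '9')
                || (ch >= 'a' && ch <= 'f')
                || (ch >= 'A' && ch <= 'F');
    }

    // PDF whitespace: NUL, TAB, LF, FF, CR and SPACE.
    static boolean isWhitespace(char ch) {
        return ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    // Decodes the characters found between '<' and '>'.
    // Whitespace is ignored; any other non-hex character aborts immediately.
    static byte[] decode(String body) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char ch = body.charAt(i);
            if (isHexDigit(ch)) {
                digits.append(ch);
            } else if (!isWhitespace(ch)) {
                throw new IllegalArgumentException(
                        "Invalid character in hex string at index " + i + ": " + ch);
            }
        }
        // In PDF, a missing final digit is assumed to be 0.
        if (digits.length() % 2 != 0) {
            digits.append('0');
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < digits.length(); i += 2) {
            out.write(Integer.parseInt(digits.substring(i, i + 2), 16));
        }
        return out.toByteArray();
    }
}
```

For example, `HexStringSketch.decode("48 65 6C6c 6F")` yields the ASCII bytes of `Hello`, while `HexStringSketch.decode("4G")` throws at the `G`.
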
private COSBase parseCOSDictionaryValue() throws java.io.IOException
java.io.IOException - If there is an error parsing the dictionary object.
private COSBase getObjectFromPool(COSObjectKey key) throws java.io.IOException
java.io.IOException
protected COSDictionary parseCOSDictionary() throws java.io.IOException
java.io.IOException - If there is an error reading the stream.
private boolean readUntilEndOfCOSDictionary() throws java.io.IOException
java.io.IOException - if there is a reading error.
private void parseCOSDictionaryNameValuePair(COSDictionary obj) throws java.io.IOException
java.io.IOException
protected void skipWhiteSpaces() throws java.io.IOException
java.io.IOException
private int checkForEndOfString(int bracesParameter) throws java.io.IOException
bracesParameter - the number of braces currently open.
java.io.IOException
protected COSString parseCOSString() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
private COSString parseCOSHexString() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected COSArray parseCOSArray() throws java.io.IOException
java.io.IOException - If there is an error parsing the stream.
protected boolean isEndOfName(int ch)
ch - The character
protected COSName parseCOSName() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
private boolean isValidUTF8(byte[] input)
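
Since the class holds a `java.nio.charset.CharsetDecoder` field named utf8Decoder, a UTF-8 validity check along the following lines seems plausible. This is a minimal sketch, not a copy of the PDFBox code: the class name `Utf8CheckSketch` is hypothetical, and it relies on the documented default of a freshly created decoder, which reports malformed input by throwing.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; not the PDFBox implementation.
class Utf8CheckSketch {

    // A new decoder reports malformed and unmappable input by default.
    private final CharsetDecoder utf8Decoder = StandardCharsets.UTF_8.newDecoder();

    // Returns true if the whole byte sequence decodes as UTF-8.
    boolean isValidUTF8(byte[] input) {
        try {
            // decode(ByteBuffer) resets the decoder and decodes the full buffer.
            utf8Decoder.decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A check like this lets a parser decide whether raw bytes can safely be interpreted as UTF-8 text or need a fallback encoding.
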
protected COSBoolean parseBoolean() throws java.io.IOException
java.io.IOException - If an IO error occurs during parsing.
protected COSBase parseDirObject() throws java.io.IOException
java.io.IOException - If there is an error during parsing.
protected java.lang.String readString() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected void readExpectedString(java.lang.String expectedString) throws java.io.IOException
expectedString - the String value that is expected.
java.io.IOException - if the string read is not the expected value or if an I/O error occurs.
protected final void readExpectedString(char[] expectedString, boolean skipSpaces) throws java.io.IOException
Reads given pattern from seqSource, skipping whitespace at start and end if wanted.
expectedString - pattern to be skipped
skipSpaces - if set to true, spaces before and after the string will be skipped
java.io.IOException - if pattern could not be read
protected void readExpectedChar(char ec) throws java.io.IOException
ec - the char value that is expected.
java.io.IOException - if the read char is not the expected value or if an I/O error occurs.
protected java.lang.String readString(int length) throws java.io.IOException
length - The length to stop reading at.
java.io.IOException - If there is an error reading from the stream.
protected boolean isClosing() throws java.io.IOException
java.io.IOException - If an IO error occurs.
protected boolean isClosing(int c)
c - The character to check against the closing brace
protected java.lang.String readLine() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected boolean isEOL() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected boolean isEOL(int c)
c - The character to check against end of line
private boolean isLF(int c)
private boolean isCR(int c)
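
The end-of-line helpers and readLine() fit together roughly as sketched below. The code is an assumption-laden illustration rather than the PDFBox source: `ReadLineSketch` and `firstTwoLines` are hypothetical names, a `PushbackInputStream` stands in for `SequentialSource`, a CR LF pair is treated as a single line marker, and reaching the end of input simply ends the line instead of raising an error.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; not the PDFBox implementation.
class ReadLineSketch {

    static final byte ASCII_LF = 10;
    static final byte ASCII_CR = 13;

    static boolean isLF(int c) { return c == ASCII_LF; }

    static boolean isCR(int c) { return c == ASCII_CR; }

    static boolean isEOL(int c) { return isLF(c) || isCR(c); }

    // Reads bytes up to the first end-of-line marker and consumes that marker.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && !isEOL(c)) {
            line.append((char) c);
        }
        if (isCR(c)) {
            // Swallow the LF of a CR LF pair; push back anything else.
            int next = in.read();
            if (next != -1 && !isLF(next)) {
                in.unread(next);
            }
        }
        return line.toString();
    }

    // Example helper (assumed name): returns the first two lines of the text.
    static String[] firstTwoLines(String text) {
        try {
            PushbackInputStream in = new PushbackInputStream(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1)));
            return new String[] { readLine(in), readLine(in) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Handling CR, LF, and CR LF uniformly matters for PDF files, which may mix all three line-ending styles.
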
protected boolean isWhitespace() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected boolean isWhitespace(int c)
c - The character to check against whitespace
protected boolean isSpace() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected boolean isSpace(int c)
c - The character to check against space
protected boolean isDigit() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected static boolean isDigit(int c)
c - The character to be checked
protected void skipSpaces() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected long readObjectNumber() throws java.io.IOException
This will read a long from the stream and throw an IOException if the long value is negative or has more than 10 digits (i.e. bigger than OBJECT_NUMBER_THRESHOLD).
java.io.IOException - if an I/O error occurs
protected int readGenerationNumber() throws java.io.IOException
This will read an integer from the stream and throw an IllegalArgumentException if the integer value has more than the maximum object revision (i.e. bigger than GENERATION_NUMBER_THRESHOLD).
java.io.IOException - if an I/O error occurs
protected int readInt() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected long readLong() throws java.io.IOException
java.io.IOException - If there is an error reading from the stream.
protected final java.lang.StringBuilder readStringNumber() throws java.io.IOException
java.io.IOException - thrown by the seqSource methods.
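
The range checks described for readObjectNumber() and readGenerationNumber() can be pictured with the sketch below. It is illustrative only: the class `NumberThresholdSketch` is hypothetical, the threshold values are assumptions (this page does not list them; 10^10 matches "more than 10 digits" and 65535 is the largest generation number the PDF format allows), rejecting negative generation numbers is likewise an assumption, and the sketch reads from a ready-made token rather than from seqSource.

```java
import java.io.IOException;

// Hypothetical stand-alone sketch; constant values are assumptions,
// not taken from the PDFBox source.
class NumberThresholdSketch {

    // Smallest value with 11 digits, so anything below it has at most 10.
    static final long OBJECT_NUMBER_THRESHOLD = 10_000_000_000L;

    // Largest generation number permitted by the PDF format.
    static final long GENERATION_NUMBER_THRESHOLD = 65_535L;

    static boolean isValidObjectNumber(long value) {
        return value >= 0 && value < OBJECT_NUMBER_THRESHOLD;
    }

    static boolean isValidGenerationNumber(long value) {
        // Rejecting negative values is an assumption of this sketch.
        return value >= 0 && value <= GENERATION_NUMBER_THRESHOLD;
    }

    // Parses a numeric token and applies the object-number range check.
    static long readObjectNumber(String token) throws IOException {
        long value;
        try {
            value = Long.parseLong(token);
        } catch (NumberFormatException e) {
            throw new IOException("Expected a long but found '" + token + "'", e);
        }
        if (!isValidObjectNumber(value)) {
            throw new IOException(
                    "Object number '" + value + "' is negative or has more than 10 digits");
        }
        return value;
    }
}
```

Checking the range right after parsing keeps corrupt cross-reference data from producing absurd object keys later in the parse.
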