Package | Description |
---|---|
org.apache.pdfbox.pdmodel.documentinterchange.markedcontent | The marked content package provides a mechanism for modeling marked-content sequences. |
org.apache.pdfbox.text | |
org.apache.pdfbox.tools | |

Modifier and Type | Method and Description |
---|---|
void | PDMarkedContent.addText(TextPosition text): Adds a text position to the contents. |

Modifier and Type | Field and Description |
---|---|
private TextPosition | PDFTextStripper.PositionWrapper.position |
private TextPosition | PDFTextStripper.LineItem.textPosition |

Modifier and Type | Field and Description |
---|---|
private java.util.Map<java.lang.String,java.util.List<TextPosition>> | PDFMarkedContentExtractor.characterListMapping |
protected java.util.ArrayList<java.util.List<TextPosition>> | PDFTextStripper.charactersByArticle: The charactersByArticle is used to extract text by article divisions. |
private java.util.Map<java.lang.String,java.util.ArrayList<java.util.List<TextPosition>>> | PDFTextStripperByArea.regionCharacterList |
(package private) java.util.List<TextPosition> | PDFTextStripper.WordWithTextPositions.textPositions |

Modifier and Type | Method and Description |
---|---|
TextPosition | PDFTextStripper.LineItem.getTextPosition() |
TextPosition | PDFTextStripper.PositionWrapper.getTextPosition(): Returns the underlying TextPosition object. |

Modifier and Type | Method and Description |
---|---|
protected java.util.List<java.util.List<TextPosition>> | PDFTextStripper.getCharactersByArticle(): Character strings are grouped by articles. |
java.util.List<TextPosition> | PDFTextStripper.WordWithTextPositions.getTextPositions() |

Modifier and Type | Method and Description |
---|---|
int | TextPositionComparator.compare(TextPosition pos1, TextPosition pos2) |
boolean | TextPosition.contains(TextPosition tp2): Determine if this TextPosition logically contains another. |
private void | TextPosition.insertDiacritic(int i, TextPosition diacritic): Inserts the diacritic TextPosition into the str of this TextPosition and updates the widths array to include the extra character width. |
void | TextPosition.mergeDiacritic(TextPosition diacritic): Merge a single character TextPosition into the current object. |
protected void | PDFTextStripper.processTextPosition(TextPosition text): This will process a TextPosition object and add the text to the list of characters on a page. |
protected void | LegacyPDFStreamEngine.processTextPosition(TextPosition text): A method provided as an event interface to allow a subclass to perform some specific functionality when text needs to be processed. |
protected void | PDFTextStripperByArea.processTextPosition(TextPosition text): This will process a TextPosition object and add the text to the list of characters on a page. |
protected void | PDFMarkedContentExtractor.processTextPosition(TextPosition text): This will process a TextPosition object and add the text to the list of characters on a page. |
protected void | PDFTextStripper.writeCharacters(TextPosition text): Write the string in TextPosition to the output stream. |

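
The `processTextPosition(TextPosition text)` entries above describe an event hook that subclasses override to collect text as it is decoded. The sketch below illustrates only that override pattern using the standard library. `Glyph`, `Engine`, `CollectingStripper`, and `extract` are hypothetical stand-ins invented for this example, not PDFBox classes; with PDFBox on the classpath the subclass would presumably extend `PDFTextStripper` and take a real `TextPosition`.

```java
import java.util.ArrayList;
import java.util.List;

class ProcessTextPositionSketch {

    /** Hypothetical stand-in for TextPosition: a piece of text plus page coordinates. */
    static final class Glyph {
        final String unicode;
        final float x;
        final float y;

        Glyph(String unicode, float x, float y) {
            this.unicode = unicode;
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical stand-in for the engine that decodes a page and fires the hook. */
    static class Engine {
        /** Event hook: called once per decoded glyph; the default does nothing. */
        protected void processTextPosition(Glyph text) {
        }

        /** Feeds every glyph of a "page" through the hook, in order. */
        final void processPage(List<Glyph> page) {
            for (Glyph glyph : page) {
                processTextPosition(glyph);
            }
        }
    }

    /** Subclass that adds each glyph to the list of characters on the page. */
    static class CollectingStripper extends Engine {
        final List<Glyph> characters = new ArrayList<>();

        @Override
        protected void processTextPosition(Glyph text) {
            characters.add(text);
        }

        /** Concatenates the collected glyphs in arrival order. */
        String collectedText() {
            StringBuilder sb = new StringBuilder();
            for (Glyph glyph : characters) {
                sb.append(glyph.unicode);
            }
            return sb.toString();
        }
    }

    /** Runs a page through a collecting subclass and returns the gathered text. */
    static String extract(List<Glyph> page) {
        CollectingStripper stripper = new CollectingStripper();
        stripper.processPage(page);
        return stripper.collectedText();
    }

    public static void main(String[] args) {
        List<Glyph> page = new ArrayList<>();
        page.add(new Glyph("P", 72f, 700f));
        page.add(new Glyph("D", 80f, 700f));
        page.add(new Glyph("F", 88f, 700f));
        System.out.println(extract(page)); // prints "PDF"
    }
}
```

Keeping the hook `protected` and the driving loop `final` mirrors the table above, where every listed `processTextPosition` override is `protected`: the subclass decides what to do with each position, while the engine stays in charge of when the hook fires.
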
Modifier and Type | Method and Description |
---|---|
private PDFTextStripper.WordWithTextPositions | PDFTextStripper.createWord(java.lang.String word, java.util.List<TextPosition> wordPositions): Used within PDFTextStripper.normalize(List) to create a single PDFTextStripper.WordWithTextPositions entry. |
private java.lang.StringBuilder | PDFTextStripper.normalizeAdd(java.util.List<PDFTextStripper.WordWithTextPositions> normalized, java.lang.StringBuilder lineBuilder, java.util.List<TextPosition> wordPositions, PDFTextStripper.LineItem item): Used within PDFTextStripper.normalize(List) to handle a TextPosition. |
protected void | PDFTextStripper.writeString(java.lang.String text, java.util.List<TextPosition> textPositions): Write a Java string to the output stream. |

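
`writeString(String text, List<TextPosition> textPositions)` receives both the string and the positions it was built from, so an override can annotate the output with layout information. The following is a minimal sketch of that idea using the standard library only. `Pos`, `Stripper`, `PositionTaggingStripper`, and `tag` are hypothetical stand-ins, and the base class writing the text while ignoring the positions is an assumption of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

class WriteStringSketch {

    /** Hypothetical stand-in for TextPosition: only an x/y pair is modeled. */
    static final class Pos {
        final float x;
        final float y;

        Pos(float x, float y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical stand-in for the stripper: appends text to an in-memory output. */
    static class Stripper {
        protected final StringBuilder output = new StringBuilder();

        /** Assumed default for this sketch: write the string, ignore the positions. */
        protected void writeString(String text, List<Pos> textPositions) {
            output.append(text);
        }
    }

    /** Subclass that prefixes each string with the coordinates of its first position. */
    static class PositionTaggingStripper extends Stripper {
        @Override
        protected void writeString(String text, List<Pos> textPositions) {
            if (!textPositions.isEmpty()) {
                Pos first = textPositions.get(0);
                output.append('[').append((int) first.x)
                      .append(',').append((int) first.y).append("] ");
            }
            super.writeString(text, textPositions);
        }
    }

    /** Writes one string through the tagging subclass and returns the output. */
    static String tag(String text, List<Pos> positions) {
        PositionTaggingStripper stripper = new PositionTaggingStripper();
        stripper.writeString(text, positions);
        return stripper.output.toString();
    }

    public static void main(String[] args) {
        List<Pos> positions = new ArrayList<>();
        positions.add(new Pos(72f, 700f));
        positions.add(new Pos(80f, 700f));
        System.out.println(tag("Hi", positions)); // prints "[72,700] Hi"
    }
}
```
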
Constructor and Description |
---|
LineItem(TextPosition textPosition) |
PositionWrapper(TextPosition position): Constructs a PositionWrapper around the specified TextPosition object. |

Constructor and Description |
---|
WordWithTextPositions(java.lang.String word, java.util.List<TextPosition> positions) |

Modifier and Type | Method and Description |
---|---|
(package private) static int | ExtractText.getAngle(TextPosition text) |
protected void | AngleCollector.processTextPosition(TextPosition text) |
protected void | FilteredTextStripper.processTextPosition(TextPosition text) |
protected java.lang.String | PDFText2HTML.FontState.push(java.lang.StringBuilder buffer, char character, TextPosition textPosition) |

Modifier and Type | Method and Description |
---|---|
java.lang.String | PDFText2HTML.FontState.push(java.lang.String text, java.util.List<TextPosition> textPositions): Pushes new TextPositions into the font state. |
protected void | PDFText2HTML.writeString(java.lang.String text, java.util.List<TextPosition> textPositions): Write a string to the output stream, maintain font state, and escape some HTML characters. |