Package org.jsoup.parser
Class CharacterReader
java.lang.Object
org.jsoup.parser.CharacterReader
CharacterReader consumes tokens off a string. Used internally by jsoup; the API is subject to change.

Field Summary
Fields
Modifier and Type                     Field
private int                           bufLength
private int                           bufMark
private int                           bufPos
private int                           bufSplitPoint
private char[]                        charBuf
(package private) static final char   EOF
(package private) static final int    maxBufferLen
private static final int              maxStringCacheLen
private static final int              minReadAheadLen
(package private) static final int    readAheadLimit
private Reader                        reader
private int                           readerPos
private boolean                       readFully
private String[]                      stringCache
private static final int              stringCacheSize

Constructor Summary
Constructors
CharacterReader(Reader input)
CharacterReader(Reader input, int sz)
CharacterReader(String input)

Method Summary
Modifier and Type                   Method
void                                advance()
    Moves the current position by one.
private void                        bufferUp()
private static String               cacheString(char[] charBuf, String[] stringCache, int start, int count)
    Caches short strings, as a flyweight pattern, to reduce GC load.
void                                close()
(package private) char              consume()
(package private) String            consumeData()
(package private) String            consumeDigitSequence()
(package private) String            consumeHexSequence()
(package private) String            consumeLetterSequence()
(package private) String            consumeLetterThenDigitSequence()
(package private) String            consumeRawData()
(package private) String            consumeTagName()
(package private) String            consumeTo(char c)
    Reads characters up to the specified char.
(package private) String            consumeTo(String seq)
(package private) String            consumeToAny(char... chars)
    Read characters until the first of any delimiters is found.
(package private) String            consumeToAnySorted(char... chars)
(package private) String            consumeToEnd()
(package private) boolean           containsIgnoreCase(String seq)
char                                current()
    Get the char at the current position.
boolean                             isEmpty()
    Tests if all the content has been read.
private boolean                     isEmptyNoBufferUp()
(package private) void              mark()
(package private) boolean           matchConsume(String seq)
(package private) boolean           matchConsumeIgnoreCase(String seq)
(package private) boolean           matches(char c)
(package private) boolean           matches(String seq)
(package private) boolean           matchesAny(char... seq)
(package private) boolean           matchesAnySorted(char[] seq)
(package private) boolean           matchesDigit()
(package private) boolean           matchesIgnoreCase(String seq)
(package private) boolean           matchesLetter()
(package private) int               nextIndexOf(char c)
    Returns the number of characters between the current position and the next instance of the input char.
(package private) int               nextIndexOf(CharSequence seq)
    Returns the number of characters between the current position and the next instance of the input sequence.
int                                 pos()
    Gets the current cursor position in the content.
(package private) static boolean    rangeEquals(char[] charBuf, int start, int count, String cached)
    Check if the value of the provided range equals the string.
(package private) boolean           rangeEquals(int start, int count, String cached)
(package private) void              rewindToMark()
String                              toString()
(package private) void              unconsume()
(package private) void              unmark()

Field Details

EOF
static final char EOF

maxStringCacheLen
private static final int maxStringCacheLen

maxBufferLen
static final int maxBufferLen

readAheadLimit
static final int readAheadLimit

minReadAheadLen
private static final int minReadAheadLen

charBuf
private char[] charBuf

reader
private Reader reader

bufLength
private int bufLength

bufSplitPoint
private int bufSplitPoint

bufPos
private int bufPos

readerPos
private int readerPos

bufMark
private int bufMark

stringCacheSize
private static final int stringCacheSize

stringCache
private String[] stringCache

readFully
private boolean readFully

Constructor Details
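
CharacterReader
public CharacterReader(Reader input, int sz)

CharacterReader
public CharacterReader(Reader input)

CharacterReader
public CharacterReader(String input)
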
Method Details

close
public void close()

bufferUp
private void bufferUp()

pos
public int pos()
Gets the current cursor position in the content.
Returns:
current position

isEmpty
public boolean isEmpty()
Tests if all the content has been read.
Returns:
true if nothing left to read.

isEmptyNoBufferUp
private boolean isEmptyNoBufferUp()

current
public char current()
Get the char at the current position.
Returns:
char

consume
char consume()

unconsume
void unconsume()

advance
public void advance()
Moves the current position by one.
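
The public cursor methods above (pos(), isEmpty(), current(), advance()) can be illustrated with a minimal sketch over an in-memory string. This is not jsoup's implementation: CursorSketch is a hypothetical class, it ignores buffering entirely, and returning the sentinel (char) -1 once the input is exhausted is an assumption made only for this example.

```java
/** Hypothetical, simplified cursor over a string; not the jsoup class. */
final class CursorSketch {
    // Assumed end-of-input sentinel, for this sketch only.
    static final char EOF = (char) -1;

    private final char[] buf;
    private int pos = 0;

    CursorSketch(String input) {
        this.buf = input.toCharArray();
    }

    /** Current cursor position in the content. */
    int pos() { return pos; }

    /** True if nothing is left to read. */
    boolean isEmpty() { return pos >= buf.length; }

    /** Char at the current position (EOF when exhausted); does not move the cursor. */
    char current() { return isEmpty() ? EOF : buf[pos]; }

    /** Moves the current position by one. */
    void advance() { pos++; }

    public static void main(String[] args) {
        CursorSketch r = new CursorSketch("ab");
        StringBuilder seen = new StringBuilder();
        while (!r.isEmpty()) {
            seen.append(r.current());
            r.advance();
        }
        System.out.println(seen + " read, pos=" + r.pos()); // prints "ab read, pos=2"
    }
}
```

Note that current() only peeks; in this sketch, moving forward always requires an explicit advance().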

mark
void mark()

unmark
void unmark()

rewindToMark
void rewindToMark()

nextIndexOf
int nextIndexOf(char c)
Returns the number of characters between the current position and the next instance of the input char.
Parameters:
c - scan target
Returns:
offset between current position and next instance of target. -1 if not found.

nextIndexOf
int nextIndexOf(CharSequence seq)
Returns the number of characters between the current position and the next instance of the input sequence.
Parameters:
seq - scan target
Returns:
offset between current position and next instance of target. -1 if not found.
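
Both nextIndexOf variants return an offset relative to the current position, not an absolute index. The sketch below illustrates that contract with hypothetical static helpers that take the buffer and position as explicit arguments; the real methods work on the reader's internal state, and their scanning strategy may differ.

```java
/** Hypothetical helpers; they illustrate the offset semantics only. */
final class NextIndexOfSketch {
    /** Offset from pos to the next occurrence of c, or -1 if not found. */
    static int nextIndexOf(char[] buf, int pos, char c) {
        for (int i = pos; i < buf.length; i++) {
            if (buf[i] == c) return i - pos;
        }
        return -1;
    }

    /** Offset from pos to the next occurrence of seq, or -1 if not found. */
    static int nextIndexOf(char[] buf, int pos, CharSequence seq) {
        int n = seq.length();
        for (int i = pos; i + n <= buf.length; i++) {
            int j = 0;
            while (j < n && buf[i + j] == seq.charAt(j)) j++;
            if (j == n) return i - pos;
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] buf = "<p>One</p>".toCharArray();
        int pos = 3; // cursor sits on 'O'
        System.out.println(nextIndexOf(buf, pos, '<'));    // prints 3
        System.out.println(nextIndexOf(buf, pos, "</p>")); // prints 3
        System.out.println(nextIndexOf(buf, pos, 'x'));    // prints -1
    }
}
```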

consumeTo
String consumeTo(char c)
Reads characters up to the specified char.
Parameters:
c - the delimiter
Returns:
the chars read
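
A minimal sketch of the consumeTo(char) contract follows, using a hypothetical ConsumeToSketch class over a plain char array. The delimiter itself is left unconsumed. What happens when the delimiter never occurs is not stated above; this sketch assumes the rest of the input is returned.

```java
/** Hypothetical reader; illustrates reading up to a delimiter. */
final class ConsumeToSketch {
    private final char[] buf;
    private int pos = 0;

    ConsumeToSketch(String input) {
        this.buf = input.toCharArray();
    }

    /** Reads characters up to, but not including, c and leaves the cursor on c. */
    String consumeTo(char c) {
        int start = pos;
        // Assumption: if c is never found, everything remaining is consumed.
        while (pos < buf.length && buf[pos] != c) pos++;
        return new String(buf, start, pos - start);
    }

    int pos() { return pos; }

    public static void main(String[] args) {
        ConsumeToSketch r = new ConsumeToSketch("key=value");
        System.out.println(r.consumeTo('=')); // prints "key"
        System.out.println(r.pos());          // prints 3
        System.out.println(r.consumeTo('#')); // prints "=value"
    }
}
```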

consumeTo
String consumeTo(String seq)

consumeToAny
String consumeToAny(char... chars)
Read characters until the first of any delimiters is found.
Parameters:
chars - delimiters to scan for
Returns:
characters read up to the matched delimiter.
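
The multi-delimiter variant can be sketched the same way. The helper below is hypothetical and stateless: it takes the buffer and start position as arguments and returns the characters before the first delimiter, so a caller would advance its own cursor by the length of the result. It also assumes that input with no delimiter at all is returned in full.

```java
/** Hypothetical helper; illustrates scanning for the first of several delimiters. */
final class ConsumeToAnySketch {
    /** Returns the characters from pos up to the first char contained in chars. */
    static String consumeToAny(char[] buf, int pos, char... chars) {
        int i = pos;
        scan:
        while (i < buf.length) {
            for (char c : chars) {
                if (buf[i] == c) break scan; // stop on the first delimiter hit
            }
            i++;
        }
        return new String(buf, pos, i - pos);
    }

    public static void main(String[] args) {
        char[] buf = "div class>text".toCharArray();
        System.out.println(consumeToAny(buf, 0, ' ', '>')); // prints "div"
        System.out.println(consumeToAny(buf, 4, ' ', '>')); // prints "class"
    }
}
```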

consumeToAnySorted
String consumeToAnySorted(char... chars)

consumeData
String consumeData()

consumeRawData
String consumeRawData()

consumeTagName
String consumeTagName()

consumeToEnd
String consumeToEnd()

consumeLetterSequence
String consumeLetterSequence()

consumeLetterThenDigitSequence
String consumeLetterThenDigitSequence()

consumeHexSequence
String consumeHexSequence()

consumeDigitSequence
String consumeDigitSequence()

matches
boolean matches(char c)

matches
boolean matches(String seq)

matchesIgnoreCase
boolean matchesIgnoreCase(String seq)

matchesAny
boolean matchesAny(char... seq)

matchesAnySorted
boolean matchesAnySorted(char[] seq)

matchesLetter
boolean matchesLetter()

matchesDigit
boolean matchesDigit()

matchConsume
boolean matchConsume(String seq)

matchConsumeIgnoreCase
boolean matchConsumeIgnoreCase(String seq)

containsIgnoreCase
boolean containsIgnoreCase(String seq)

toString
public String toString()

cacheString
private static String cacheString(char[] charBuf, String[] stringCache, int start, int count)
Caches short strings, as a flyweight pattern, to reduce GC load. The cache is scoped to the current document, to prevent leaks. It is deliberately simplistic: on a hash collision it just falls back to creating a new string, rather than keeping a full HashMap with an entry list. That saves both having to create objects as hash keys and running through the entry list, at the expense of some more duplicates.
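
The caching scheme described above can be sketched as a fixed-size array indexed by a hash of the characters. This is an illustration under stated assumptions, not the library's code: the length limit (12), the cache size (512), the hash function, the helper sameChars, and the choice to overwrite a slot on collision are all hypothetical.

```java
/** Hypothetical sketch of a hash-indexed string cache without entry lists. */
final class StringCacheSketch {
    static final int MAX_CACHED_LEN = 12; // assumed limit for "short" strings
    static final int CACHE_SIZE = 512;    // assumed; must be a power of two for the mask below

    static String cacheString(char[] charBuf, String[] stringCache, int start, int count) {
        // Long strings are unlikely to repeat, so they bypass the cache.
        if (count > MAX_CACHED_LEN) return new String(charBuf, start, count);
        if (count < 1) return "";

        // Hash the characters directly; no key object is allocated.
        int hash = 0;
        for (int i = 0; i < count; i++) hash = 31 * hash + charBuf[start + i];
        int index = hash & (stringCache.length - 1);

        String cached = stringCache[index];
        if (cached != null && sameChars(charBuf, start, count, cached)) {
            return cached; // hit: reuse the existing instance
        }
        // Miss or collision: create a new string and (in this sketch) take over the slot.
        String fresh = new String(charBuf, start, count);
        stringCache[index] = fresh;
        return fresh;
    }

    private static boolean sameChars(char[] buf, int start, int count, String s) {
        if (count != s.length()) return false;
        for (int i = 0; i < count; i++) {
            if (buf[start + i] != s.charAt(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String[] cache = new String[CACHE_SIZE]; // one cache per parsed document
        char[] buf = "div div".toCharArray();
        String first = cacheString(buf, cache, 0, 3);
        String second = cacheString(buf, cache, 4, 3);
        System.out.println(first == second); // prints true
    }
}
```

Because each slot holds at most one string, two different strings that hash to the same slot simply evict each other; the result is still correct, only less memory is saved.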

rangeEquals
static boolean rangeEquals(char[] charBuf, int start, int count, String cached)
Check if the value of the provided range equals the string.
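
A range comparison of this kind lets a caller test a slice of the buffer against an existing string without first allocating a substring. The following is a minimal sketch of such a check in a hypothetical RangeEqualsSketch class; the actual method body may be written differently.

```java
/** Hypothetical sketch: compare a char range with a string, allocation-free. */
final class RangeEqualsSketch {
    /** True if charBuf[start .. start+count) holds exactly the characters of cached. */
    static boolean rangeEquals(char[] charBuf, int start, int count, String cached) {
        if (count != cached.length()) return false;
        for (int i = 0; i < count; i++) {
            if (charBuf[start + i] != cached.charAt(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        char[] buf = "<div>".toCharArray();
        System.out.println(rangeEquals(buf, 1, 3, "div")); // prints true
        System.out.println(rangeEquals(buf, 1, 3, "dip")); // prints false
        System.out.println(rangeEquals(buf, 1, 2, "div")); // prints false
    }
}
```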

rangeEquals
boolean rangeEquals(int start, int count, String cached)