Package org.jsoup.helper
Class DataUtil
java.lang.Object
org.jsoup.helper.DataUtil
Internal static utilities for handling data.
-
Nested Class Summary
Nested Classes -
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
(package private) static void crossStreams(InputStream in, OutputStream out): Writes the input stream to the output stream.
private static DataUtil.BomCharset detectCharsetFromBom(ByteBuffer byteData)
(package private) static ByteBuffer emptyByteBuffer
(package private) static String getCharsetFromContentType(String contentType): Parse out a charset from a content type header.
static Document load: Loads a file to a Document.
static Document load(InputStream in, String charsetName, String baseUri): Parses a Document from an input stream.
static Document load(InputStream in, String charsetName, String baseUri, Parser parser): Parses a Document from an input stream, using the provided Parser.
(package private) static String mimeBoundary: Creates a random string, suitable for use as a MIME boundary.
(package private) static Document parseInputStream(InputStream input, String charsetName, String baseUri, Parser parser)
static ByteBuffer readToByteBuffer(InputStream inStream, int maxSize): Read the input stream into a byte buffer.
private static String validateCharset
-
Field Details
-
charsetPattern
-
defaultCharset
-
firstReadBufferSize
private static final int firstReadBufferSize
-
bufferSize
static final int bufferSize
-
mimeBoundaryChars
private static final char[] mimeBoundaryChars
-
boundaryLength
static final int boundaryLength
-
-
Constructor Details
-
DataUtil
private DataUtil()
-
-
Method Details
-
load
Loads a file to a Document.
Parameters:
in - file to load
charsetName - character set of input
baseUri - base URI of document, to resolve relative links against
Returns:
Document
Throws:
IOException - on IO error
-
load
Parses a Document from an input stream.
Parameters:
in - input stream to parse. You will need to close it.
charsetName - character set of input
baseUri - base URI of document, to resolve relative links against
Returns:
Document
Throws:
IOException - on IO error
-
load
public static Document load(InputStream in, String charsetName, String baseUri, Parser parser) throws IOException
Parses a Document from an input stream, using the provided Parser.
Parameters:
in - input stream to parse. You will need to close it.
charsetName - character set of input
baseUri - base URI of document, to resolve relative links against
parser - alternate parser to use.
Returns:
Document
Throws:
IOException - on IO error
-
crossStreams
Writes the input stream to the output stream. Doesn't close them.
Parameters:
in - input stream to read from
out - output stream to write to
Throws:
IOException - on IO error
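The copy described above can be sketched with a plain read/write loop. This is a minimal illustration using only the JDK, not jsoup's actual implementation; the class name StreamCopySketch, the method name copy, and the 8 KiB buffer size are assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopySketch {
    // Hypothetical buffer size, chosen for illustration only.
    private static final int BUFFER_SIZE = 8 * 1024;

    // Copies every byte from in to out. Neither stream is closed here,
    // so the caller keeps ownership of both.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
    }
}
```

Because the method leaves both streams open, callers would typically wrap them in try-with-resources at the call site.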
-
parseInputStream
static Document parseInputStream(InputStream input, String charsetName, String baseUri, Parser parser) throws IOException
Throws:
IOException
-
readToByteBuffer
Read the input stream into a byte buffer. To deal with slow input streams, you may interrupt the thread this method is executing on. The data read until being interrupted will be available.
Parameters:
inStream - the input stream to read from
maxSize - the maximum size in bytes to read from the stream. Set to 0 to be unlimited.
Returns:
the filled byte buffer
Throws:
IOException - if an exception occurs whilst reading from the input stream.
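The maxSize contract (0 means unlimited, any positive value caps the number of bytes read) can be sketched as follows. This is an illustration built on the JDK only and is not jsoup's implementation: the class name BoundedReadSketch and the 4096-byte chunk size are assumptions, and the interrupt handling described above is not modelled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

final class BoundedReadSketch {
    // Hypothetical chunk size, chosen for illustration only.
    private static final int CHUNK = 4096;

    // Reads at most maxSize bytes from in; maxSize == 0 means no limit.
    static ByteBuffer readToByteBuffer(InputStream in, int maxSize) throws IOException {
        if (maxSize < 0) {
            throw new IllegalArgumentException("maxSize must be 0 (unlimited) or greater");
        }
        boolean capped = maxSize > 0;
        int remaining = maxSize;
        byte[] chunk = new byte[CHUNK];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (!capped || remaining > 0) {
            // Never ask for more than the cap still allows.
            int want = capped ? Math.min(remaining, chunk.length) : chunk.length;
            int read = in.read(chunk, 0, want);
            if (read == -1) {
                break; // end of stream
            }
            out.write(chunk, 0, read);
            if (capped) {
                remaining -= read;
            }
        }
        return ByteBuffer.wrap(out.toByteArray());
    }
}
```

The returned buffer is positioned at 0 with its limit at the number of bytes read, so remaining() reports how much data was captured.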
-
emptyByteBuffer
-
getCharsetFromContentType
Parse out a charset from a content type header. If the charset is not supported, returns null so that the default charset is used instead.
Parameters:
contentType - e.g. "text/html; charset=EUC-JP"
Returns:
"EUC-JP", or null if not found. Charset is trimmed and uppercased.
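The behaviour described above (extract the charset parameter, trim and uppercase it, return null when it is absent or unsupported) might look roughly like this. The regular expression, the class name ContentTypeCharsetSketch, and the method name are assumptions for illustration; the actual charsetPattern field is not shown on this page.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ContentTypeCharsetSketch {
    // Assumed pattern: case-insensitive "charset=", optional quote, then the name.
    private static final Pattern CHARSET =
            Pattern.compile("(?i)\\bcharset=\\s*(?:\"|')?([^\\s,;\"']*)");

    // Returns the uppercased charset name, or null if none is present
    // or the running JVM does not support it.
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        if (!m.find()) {
            return null;
        }
        String name = m.group(1).trim().toUpperCase(Locale.ENGLISH);
        if (name.isEmpty()) {
            return null;
        }
        try {
            return Charset.isSupported(name) ? name : null;
        } catch (IllegalCharsetNameException e) {
            return null; // syntactically invalid name
        }
    }
}
```

Whether a name such as EUC-JP is accepted depends on the charsets installed in the running JVM, which is why the sketch checks Charset.isSupported rather than a fixed list.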
-
validateCharset
-
mimeBoundary
Creates a random string, suitable for use as a MIME boundary.
-
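One way to produce such a string is to draw characters at random from a fixed alphabet. The sketch below is illustrative only: the alphabet, the length of 32, the use of SecureRandom, and the class name MimeBoundarySketch are all assumptions, since the values of mimeBoundaryChars and boundaryLength are not shown on this page.

```java
import java.security.SecureRandom;

final class MimeBoundarySketch {
    // Assumed alphabet and length; the real field values are not documented here.
    private static final char[] BOUNDARY_CHARS =
            "-_1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".toCharArray();
    private static final int BOUNDARY_LENGTH = 32;
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a fixed-length string of randomly chosen boundary characters.
    static String mimeBoundary() {
        StringBuilder sb = new StringBuilder(BOUNDARY_LENGTH);
        for (int i = 0; i < BOUNDARY_LENGTH; i++) {
            sb.append(BOUNDARY_CHARS[RANDOM.nextInt(BOUNDARY_CHARS.length)]);
        }
        return sb.toString();
    }
}
```

A long random boundary makes it very unlikely that the same sequence appears inside the body parts it separates.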
detectCharsetFromBom
-