Package org.gjt.xpp.impl.pullparser
Class PullParser
- java.lang.Object
-
- org.gjt.xpp.impl.pullparser.PullParser
-
- All Implemented Interfaces:
XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
public class PullParser extends java.lang.Object implements XmlPullParser, XmlPullParserBufferControl, XmlPullParserEventPosition
XML Pull Parser (XPP) lets you pull XML events from an input stream. Advantages:
- very simple pull interface - ideal for deserializing XML objects (like SOAP)
- simple and efficient thin wrapper around the Tokenizer class - compared with using Tokenizer directly, it adds about 10% processing time for big documents and at most 50% for small documents
- lightweight memory model - minimized memory allocation: element content and attributes are only read on explicit method calls, both StartTag and EndTag can be reused during parsing
- small - total compiled size around 20K
- by default supports namespaces parsing (can be switched off)
- support for mixed content can be explicitly disabled
- this is a beta version - it may still have bugs :-)
- does not parse DTD (recognizes only predefined entities)
- Author:
- Aleksander Slominski
-
-
Field Summary
Fields (modifier and type, field, description):
protected Attribute[] attrPos - temporary array of current attributes
protected int attrPosEnd - index for last attribute in attrPos array
protected int attrPosSize - size of attrPos array
protected static boolean CHECK_ATTRIB_UNIQ - Should attribute uniqueness be checked as specified in the XML and NS specifications?
protected java.lang.String elContent - Content of current element if in CONTENT state
protected ElementContent[] elStack - temporary array to keep ElementContent stack
protected int elStackDepth - how many elements are on elStack
protected int elStackSize - size of elStack array
protected boolean emptyElement - Have we read empty element?
protected int eventEnd - end position of current event in tokenizer buffer
protected int eventStart - start position of current event in tokenizer buffer
protected java.util.Hashtable prefix2Ns - mapping of namespace prefixes to URIs
protected boolean reportNsAttribs - should parser report namespace xmlns* attributes?
protected boolean seenRootElement - Have we seen root element?
protected byte state - what is current event type as returned from next()?
protected boolean supportNs - should parser support namespaces?
protected byte token - what is current token returned from tokenizer
protected Tokenizer tokenizer - XML tokenizer that does the actual tokenizing of the input stream.
protected static boolean USE_QNAMEBUF
-
Fields inherited from interface org.gjt.xpp.XmlPullParser
CONTENT, END_DOCUMENT, END_TAG, START_TAG
-
-
Constructor Summary
Constructors (constructor, description):
PullParser() - Create an instance of the pull parser.
-
Method Summary
Methods (modifier and type, method, description):
protected void ensureAttribs(int size) - Make sure that there is enough space in the temporary attributes array.
protected void ensureCapacity(int size) - Make sure that we have enough space to keep an element stack of the passed size.
int getBufferShrinkOffset()
int getColumnNumber()
int getContentLength() - Return how big the content is.
int getDepth() - Returns the current depth of the element.
char[] getEventBuffer() - NOTE: This may be an internal buffer and is valid only until the next call to next(). Do NOT attempt to modify it!
int getEventEnd()
int getEventStart()
byte getEventType() - Returns the type of the current element (START_TAG, END_TAG, CONTENT, etc.)
int getHardLimit()
int getLineNumber()
java.lang.String getLocalName() - Returns the local name of the current element (current event must be START_TAG or END_TAG)
int getNamespacesLength(int depth)
java.lang.String getNamespaceUri() - Returns the namespace URI of the current element, or null if not applicable (current event must be START_TAG or END_TAG)
java.lang.String getPosDesc() - Return string describing current position of parser in input stream.
java.lang.String getPrefix() - Returns the prefix of the current element or null if the element has no prefix.
java.lang.String getQNameLocal(java.lang.String qName) - Return local part of qname.
java.lang.String getQNameUri(java.lang.String qName) - Return uri part of qname.
java.lang.String getRawName() - Returns the raw name (prefix + ':' + localName) of the current element (current event must be START_TAG or END_TAG)
int getSoftLimit()
boolean isAllowedMixedContent() - Is mixed element content allowed?
boolean isBufferShrinkable()
boolean isNamespaceAttributesReporting() - Is parser going to report namespace attributes (xmlns*)?
boolean isNamespaceAware() - Is parser namespace aware?
boolean isWhitespaceContent() - Return true if the CONTENT just read contained only whitespace.
byte next() - This is the key method: it reads more from the input stream and returns the next event type (such as START_TAG, END_TAG, CONTENT).
java.lang.String readContent() - Return String that contains the CONTENT just read.
void readEndTag(XmlEndTag etag) - Read the value of the END_TAG just read into the EndTag passed as argument.
void readNamespacesPrefixes(int depth, java.lang.String[] prefixes, int off, int len) - Return namespace prefixes for element at depth
void readNamespacesUris(int depth, java.lang.String[] uris, int off, int len) - Return namespace URIs for element at depth
byte readNode(XmlNode node) - Read subtree into node: call readNodeWithoutChildren and then parse subtree adding children (values obtained with readContent or readNodeWithoutChildren).
void readNodeWithoutChildren(XmlNode node) - Read node: it calls readStartTag and then, if the parser is namespace aware, the currently declared namespaces are added and defaultNamespace is set.
void readStartTag(XmlStartTag stag) - Read the value of the START_TAG just read into the StartTag passed as argument.
void reset() - Reset parser state so it can be used to parse new input.
protected void resetState()
void setAllowedMixedContent(boolean enable) - Allow for mixed element content.
void setBufferShrinkable(boolean shrinkable)
void setHardLimit(int value)
void setInput(char[] buf) - Reset parser and set new input.
void setInput(char[] buf, int off, int len) - Set the input for parser.
void setInput(java.io.Reader reader) - Reset parser and set new input.
void setNamespaceAttributesReporting(boolean enable) - Make the parser report xmlns* attributes.
void setNamespaceAware(boolean awareness) - Set support of namespaces.
void setSoftLimit(int value)
byte skipNode() - If the parser has just read a start tag, this skips the whole subtree contained in that element.
-
-
-
Field Detail
-
USE_QNAMEBUF
protected static final boolean USE_QNAMEBUF
- See Also:
- Constant Field Values
-
CHECK_ATTRIB_UNIQ
protected static final boolean CHECK_ATTRIB_UNIQ
Should attribute uniqueness be checked as specified in the XML and NS specifications?
- See Also:
- Constant Field Values
-
emptyElement
protected boolean emptyElement
Have we read empty element?
-
seenRootElement
protected boolean seenRootElement
Have we seen root element?
-
elContent
protected java.lang.String elContent
Content of current element if in CONTENT state
-
tokenizer
protected Tokenizer tokenizer
XML tokenizer that does the actual tokenizing of the input stream.
-
eventStart
protected int eventStart
start position of current event in tokenizer buffer
-
eventEnd
protected int eventEnd
end position of current event in tokenizer buffer
-
state
protected byte state
what is current event type as returned from next()?
-
token
protected byte token
what is current token returned from tokenizer
-
supportNs
protected boolean supportNs
should parser support namespaces?
-
reportNsAttribs
protected boolean reportNsAttribs
should parser report namespace xmlns* attributes ?
-
prefix2Ns
protected java.util.Hashtable prefix2Ns
mapping of namespace prefixes to URIs
-
attrPosEnd
protected int attrPosEnd
index for last attribute in attrPos array
-
attrPosSize
protected int attrPosSize
size of attrPos array
-
attrPos
protected Attribute[] attrPos
temporary array of current attributes
-
elStackDepth
protected int elStackDepth
how many elements are on elStack
-
elStackSize
protected int elStackSize
size of elStack array
-
elStack
protected ElementContent[] elStack
temporary array to keep ElementContent stack
-
-
Method Detail
-
setInput
public void setInput(java.io.Reader reader)
Reset parser and set new input.
- Specified by:
setInput
in interface XmlPullParser
-
setInput
public void setInput(char[] buf)
Reset parser and set new input.
- Specified by:
setInput
in interface XmlPullParser
-
setInput
public void setInput(char[] buf, int off, int len) throws XmlPullParserException
Description copied from interface: XmlPullParser
Set the input for parser.
- Specified by:
setInput
in interface XmlPullParser
- Throws:
XmlPullParserException
-
reset
public void reset()
Reset parser state so it can be used to parse new input.
- Specified by:
reset
in interface XmlPullParser
-
isAllowedMixedContent
public boolean isAllowedMixedContent()
Description copied from interface: XmlPullParser
Is mixed element content allowed?
- Specified by:
isAllowedMixedContent
in interface XmlPullParser
-
setAllowedMixedContent
public void setAllowedMixedContent(boolean enable)
Allow for mixed element content. Enabled by default. When disabled, an element must contain either text or other elements, not both.
- Specified by:
setAllowedMixedContent
in interface XmlPullParser
-
isNamespaceAware
public boolean isNamespaceAware()
Description copied from interface: XmlPullParser
Is parser namespace aware?
- Specified by:
isNamespaceAware
in interface XmlPullParser
-
setNamespaceAware
public void setNamespaceAware(boolean awareness) throws XmlPullParserException
Set support of namespaces. Disabled by default.
- Specified by:
setNamespaceAware
in interface XmlPullParser
- Throws:
XmlPullParserException
-
isNamespaceAttributesReporting
public boolean isNamespaceAttributesReporting()
Description copied from interface: XmlPullParser
Is parser going to report namespace attributes (xmlns*)?
- Specified by:
isNamespaceAttributesReporting
in interface XmlPullParser
-
setNamespaceAttributesReporting
public void setNamespaceAttributesReporting(boolean enable)
Make the parser report xmlns* attributes. Disabled by default. Only meaningful when namespaces are enabled (when namespaces are disabled, all attributes are always reported).
- Specified by:
setNamespaceAttributesReporting
in interface XmlPullParser
-
getNamespaceUri
public java.lang.String getNamespaceUri()
Description copied from interface: XmlPullParser
Returns the namespace URI of the current element, or null if not applicable (current event must be START_TAG or END_TAG).
- Specified by:
getNamespaceUri
in interface XmlPullParser
-
getLocalName
public java.lang.String getLocalName()
Description copied from interface: XmlPullParser
Returns the local name of the current element (current event must be START_TAG or END_TAG).
- Specified by:
getLocalName
in interface XmlPullParser
-
getPrefix
public java.lang.String getPrefix()
Description copied from interface: XmlPullParser
Returns the prefix of the current element, or null if the element has no prefix (current event must be START_TAG or END_TAG).
- Specified by:
getPrefix
in interface XmlPullParser
-
getRawName
public java.lang.String getRawName()
Description copied from interface: XmlPullParser
Returns the raw name (prefix + ':' + localName) of the current element (current event must be START_TAG or END_TAG).
- Specified by:
getRawName
in interface XmlPullParser
-
getQNameLocal
public java.lang.String getQNameLocal(java.lang.String qName)
Description copied from interface: XmlPullParser
Return local part of qname. For example, for 'xsi:type' it returns 'type'.
- Specified by:
getQNameLocal
in interface XmlPullParser
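The 'xsi:type' example for getQNameLocal can be illustrated with a minimal sketch. This is not the PullParser source: the class `QNameLocalSketch` and the method `localPart` are hypothetical, and returning an unprefixed name unchanged is an assumption, since the description only covers the prefixed case.

```java
/** Hypothetical helper for illustration only; not part of org.gjt.xpp. */
class QNameLocalSketch {

    /**
     * Returns the part of a qualified name after the first ':'.
     * Assumption: a name without a prefix is returned as-is.
     */
    static String localPart(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? qName : qName.substring(colon + 1);
    }
}
```

With this sketch, `QNameLocalSketch.localPart("xsi:type")` yields `type`, matching the example in the description.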
-
getQNameUri
public java.lang.String getQNameUri(java.lang.String qName) throws XmlPullParserException
Description copied from interface: XmlPullParser
Return uri part of qname. The result depends on the current state of the parser, which determines the namespace URI the prefix is mapped to. For example, for 'xsi:type', if the xsi namespace prefix was declared as 'urn:foo', it will return 'urn:foo'.
- Specified by:
getQNameUri
in interface XmlPullParser
- Throws:
XmlPullParserException
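The prefix lookup described for getQNameUri can be sketched as a table lookup, in the spirit of the prefix2Ns field. Everything below is an illustrative assumption rather than the actual implementation: the class `QNameUriSketch`, the `declare` method, storing the default namespace under the empty prefix, and throwing `IllegalStateException` (the real method declares `XmlPullParserException`).

```java
import java.util.Hashtable;

/** Hypothetical sketch of resolving a qname prefix to a URI; not the PullParser source. */
class QNameUriSketch {

    // Stand-in for the parser's prefix-to-URI mapping.
    private final Hashtable<String, String> prefix2Ns = new Hashtable<>();

    /** Records a declaration such as xmlns:xsi='urn:foo'. */
    void declare(String prefix, String uri) {
        prefix2Ns.put(prefix, uri);
    }

    /**
     * Returns the URI currently mapped to the prefix of qName.
     * Assumption: an unprefixed name resolves against the "" prefix and
     * yields null when no default namespace was declared.
     */
    String uriPart(String qName) {
        int colon = qName.indexOf(':');
        String prefix = colon < 0 ? "" : qName.substring(0, colon);
        String uri = prefix2Ns.get(prefix);
        if (uri == null && colon >= 0) {
            throw new IllegalStateException("undeclared prefix: " + prefix);
        }
        return uri;
    }
}
```

After `declare("xsi", "urn:foo")`, calling `uriPart("xsi:type")` gives `urn:foo`, which is the case the description walks through.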
-
getDepth
public int getDepth()
Description copied from interface: XmlPullParser
Returns the current depth of the element.
- Specified by:
getDepth
in interface XmlPullParser
-
getNamespacesLength
public int getNamespacesLength(int depth)
- Specified by:
getNamespacesLength
in interface XmlPullParser
-
readNamespacesPrefixes
public void readNamespacesPrefixes(int depth, java.lang.String[] prefixes, int off, int len) throws XmlPullParserException
Return namespace prefixes for element at depth
- Specified by:
readNamespacesPrefixes
in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNamespacesUris
public void readNamespacesUris(int depth, java.lang.String[] uris, int off, int len) throws XmlPullParserException
Return namespace URIs for element at depth
- Specified by:
readNamespacesUris
in interface XmlPullParser
- Throws:
XmlPullParserException
-
getPosDesc
public java.lang.String getPosDesc()
Return string describing current position of parser in input stream.
- Specified by:
getPosDesc
in interface XmlPullParser
-
getLineNumber
public int getLineNumber()
- Specified by:
getLineNumber
in interface XmlPullParser
-
getColumnNumber
public int getColumnNumber()
- Specified by:
getColumnNumber
in interface XmlPullParser
-
next
public byte next() throws XmlPullParserException, java.io.IOException
This is the key method: it reads more from the input stream and returns the next event type (such as START_TAG, END_TAG, CONTENT), or END_DOCUMENT if there is no more input.
This is a simple automaton (in pseudo-code):
    byte next() {
      while(state != END_DOCUMENT) {
        token = tokenizer.next();  // get next XML token
        switch(token) {

        case Tokenizer.END_DOCUMENT:
          return state = END_DOCUMENT;

        case Tokenizer.CONTENT:
          // check if content allowed - only inside element
          return state = CONTENT;

        case Tokenizer.ETAG_NAME:
          // pop element from stack - check that start and end tag match
          // if namespaces supported restore namespace prefix mappings
          return state = END_TAG;

        case Tokenizer.STAG_NAME:
          // create new element and push it on stack
          // process attributes (including namespaces)
          // set emptyElement = true; if empty element
          // check attribute uniqueness (including namespace prefixes)
          return state = START_TAG;
        }
      }
    }
Actual parsing is more complex, especially for the start tag, because attributes are reported separately by the tokenizer and namespace prefixes and URIs have to be declared.
- Specified by:
next
in interface XmlPullParser
- Throws:
XmlPullParserException
java.io.IOException
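The automaton described for next() can be made concrete with a small self-contained sketch. It does not use the real Tokenizer: the pre-tokenized `String[][]` input, the token kinds "STAG", "ETAG" and "CONTENT", the numeric constant values, and the class `NextSketch` are all hypothetical stand-ins that mirror only the control flow. Attribute and namespace processing are left out, and errors are reported with `IllegalStateException` instead of `XmlPullParserException`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy event automaton for illustration; NOT the real PullParser. */
class NextSketch {
    // Illustrative values; the real constants live in XmlPullParser.
    static final byte END_DOCUMENT = 1, START_TAG = 2, END_TAG = 3, CONTENT = 4;

    private final String[][] tokens;           // each token is {kind, text}
    private int pos = 0;                       // index of the next token to read
    private final Deque<String> elStack = new ArrayDeque<>();
    byte state = 0;

    NextSketch(String[][] tokens) {
        this.tokens = tokens;
    }

    byte next() {
        while (state != END_DOCUMENT) {
            if (pos >= tokens.length) {        // tokenizer reached end of input
                if (!elStack.isEmpty()) {
                    throw new IllegalStateException("unclosed element " + elStack.peek());
                }
                return state = END_DOCUMENT;
            }
            String kind = tokens[pos][0];
            String text = tokens[pos][1];
            pos++;
            switch (kind) {
                case "CONTENT":
                    // content is only allowed inside an element
                    if (elStack.isEmpty()) {
                        throw new IllegalStateException("content outside of root element");
                    }
                    return state = CONTENT;
                case "ETAG":
                    // pop element and check that start and end tag match
                    if (elStack.isEmpty() || !elStack.pop().equals(text)) {
                        throw new IllegalStateException("unexpected end tag " + text);
                    }
                    return state = END_TAG;
                case "STAG":
                    // push new element on the stack
                    elStack.push(text);
                    return state = START_TAG;
                default:
                    break;                     // any other token kind is skipped
            }
        }
        return END_DOCUMENT;
    }

    int getDepth() {
        return elStack.size();
    }
}
```

Feeding it the tokens for `<a>hi</a>` produces START_TAG, CONTENT, END_TAG and then END_DOCUMENT, one event per call, which is the pull style the class description refers to.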
-
getEventType
public byte getEventType()
Description copied from interface: XmlPullParser
Returns the type of the current element (START_TAG, END_TAG, CONTENT, etc.)
- Specified by:
getEventType
in interface XmlPullParser
-
isWhitespaceContent
public boolean isWhitespaceContent() throws XmlPullParserException
Return true if the CONTENT just read contained only whitespace.
- Specified by:
isWhitespaceContent
in interface XmlPullParser
- Throws:
XmlPullParserException
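What isWhitespaceContent checks can be illustrated with a short sketch over a character buffer. The class `WhitespaceSketch` and its parameters are hypothetical, and treating the end index as exclusive is an assumption. The four characters tested are the ones XML 1.0 defines as white space (space, tab, carriage return, line feed).

```java
/** Hypothetical whitespace check for illustration; not the PullParser source. */
class WhitespaceSketch {

    /**
     * Returns true if buf[start..end) holds only XML white space.
     * An empty range counts as whitespace-only in this sketch.
     */
    static boolean isWhitespaceContent(char[] buf, int start, int end) {
        for (int i = start; i < end; i++) {
            char c = buf[i];
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }
}
```

A check like this is typically used to ignore indentation between elements while still keeping real text content.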
-
getContentLength
public int getContentLength() throws XmlPullParserException
Description copied from interface: XmlPullParser
Return how big the content is.
NOTE: the parser must be on a CONTENT event.
- Specified by:
getContentLength
in interface XmlPullParser
- Throws:
XmlPullParserException
-
readContent
public java.lang.String readContent() throws XmlPullParserException
Return String that contains the CONTENT just read.
- Specified by:
readContent
in interface XmlPullParser
- Throws:
XmlPullParserException
-
readEndTag
public void readEndTag(XmlEndTag etag) throws XmlPullParserException
Read the value of the END_TAG just read into the EndTag passed as argument.
- Specified by:
readEndTag
in interface XmlPullParser
- Throws:
XmlPullParserException
-
readStartTag
public void readStartTag(XmlStartTag stag) throws XmlPullParserException
Read the value of the START_TAG just read into the StartTag passed as argument.
- Specified by:
readStartTag
in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNodeWithoutChildren
public void readNodeWithoutChildren(XmlNode node) throws XmlPullParserException
Description copied from interface: XmlPullParser
Read node: it calls readStartTag and then, if the parser is namespace aware, the currently declared namespaces are added and defaultNamespace is set.
NOTE: the parser must be on a START_TAG event, and all events will be written into node!
- Specified by:
readNodeWithoutChildren
in interface XmlPullParser
- Throws:
XmlPullParserException
-
readNode
public byte readNode(XmlNode node) throws XmlPullParserException, java.io.IOException
Description copied from interface: XmlPullParser
Read subtree into node: call readNodeWithoutChildren and then parse the subtree, adding children (values obtained with readContent or readNodeWithoutChildren).
NOTE: the parser must be on a START_TAG event, and all events will be written into node!
- Specified by:
readNode
in interface XmlPullParser
- Throws:
XmlPullParserException
java.io.IOException
-
skipNode
public byte skipNode() throws XmlPullParserException, java.io.IOException
If the parser has just read a start tag, this method skips the whole subtree contained in that element. Returns when it encounters the end tag matching the start tag.
- Specified by:
skipNode
in interface XmlPullParser
- Throws:
XmlPullParserException
java.io.IOException
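The behaviour described for skipNode, consuming events until the end tag that matches the current start tag, is commonly done by counting nesting depth. The sketch below shows that idea over a pre-recorded array of event types. The class `SkipNodeSketch`, the constant values, and the array-based input are hypothetical, and the real method may be implemented differently.

```java
/** Hypothetical depth-counting skip for illustration; not the PullParser source. */
class SkipNodeSketch {
    // Illustrative values; the real constants live in XmlPullParser.
    static final byte START_TAG = 2, END_TAG = 3, CONTENT = 4;

    /**
     * events[pos] must be the START_TAG just read.
     * Returns the index of the END_TAG that matches it.
     */
    static int skipNode(byte[] events, int pos) {
        if (events[pos] != START_TAG) {
            throw new IllegalStateException("must be on START_TAG");
        }
        int depth = 1;                 // we are inside one open element
        int i = pos;
        while (depth > 0) {
            i++;
            if (i >= events.length) {
                throw new IllegalStateException("unexpected end of input");
            }
            if (events[i] == START_TAG) {
                depth++;               // nested element opens
            } else if (events[i] == END_TAG) {
                depth--;               // some element closes
            }
        }
        return i;
    }
}
```

Nested elements with the same name are handled correctly because only the depth counter decides when the matching end tag has been reached.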
-
getHardLimit
public int getHardLimit()
- Specified by:
getHardLimit
in interface XmlPullParserBufferControl
-
setHardLimit
public void setHardLimit(int value) throws XmlPullParserException
- Specified by:
setHardLimit
in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
getSoftLimit
public int getSoftLimit()
- Specified by:
getSoftLimit
in interface XmlPullParserBufferControl
-
setSoftLimit
public void setSoftLimit(int value) throws XmlPullParserException
- Specified by:
setSoftLimit
in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
getBufferShrinkOffset
public int getBufferShrinkOffset()
- Specified by:
getBufferShrinkOffset
in interface XmlPullParserBufferControl
-
setBufferShrinkable
public void setBufferShrinkable(boolean shrinkable) throws XmlPullParserException
- Specified by:
setBufferShrinkable
in interface XmlPullParserBufferControl
- Throws:
XmlPullParserException
-
isBufferShrinkable
public boolean isBufferShrinkable()
- Specified by:
isBufferShrinkable
in interface XmlPullParserBufferControl
-
getEventStart
public int getEventStart()
- Specified by:
getEventStart
in interface XmlPullParserEventPosition
-
getEventEnd
public int getEventEnd()
- Specified by:
getEventEnd
in interface XmlPullParserEventPosition
-
getEventBuffer
public char[] getEventBuffer()
Description copied from interface: XmlPullParserEventPosition
NOTE: This may be an internal buffer and is valid only until the next call to next(). Do NOT attempt to modify it!
- Specified by:
getEventBuffer
in interface XmlPullParserEventPosition
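Because the buffer returned by getEventBuffer may be reused by the parser, a caller that needs the event text later has to copy it out first. The sketch below shows one way to do that with plain arrays. The class `EventBufferSketch` and the method `snapshot` are hypothetical, and it assumes the end position is exclusive, which this page does not state.

```java
/** Hypothetical copy-out helper for illustration; not part of org.gjt.xpp. */
class EventBufferSketch {

    /**
     * Copies buf[start..end) into a new String, so the text survives
     * any later reuse or overwrite of the shared buffer.
     * Assumption: end is an exclusive index.
     */
    static String snapshot(char[] buf, int start, int end) {
        return new String(buf, start, end - start);
    }
}
```

In real use the three arguments would come from getEventBuffer(), getEventStart() and getEventEnd(), read together before the next call to next().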
-
ensureCapacity
protected void ensureCapacity(int size)
Make sure that we have enough space to keep an element stack of the passed size.
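A capacity check of this kind is usually a grow-and-copy over the backing array. The sketch below illustrates the idea with a `String[]` stack. The class `EnsureCapacitySketch`, the initial size of 2, and the doubling policy are assumptions for illustration. The real method works on the ElementContent[] elStack field and its growth policy is not documented here.

```java
/** Hypothetical growable element stack for illustration; not the PullParser source. */
class EnsureCapacitySketch {
    String[] elStack = new String[2];   // backing array
    int elStackDepth = 0;               // number of elements in use

    /** Grows the backing array if it cannot hold 'size' elements. */
    void ensureCapacity(int size) {
        if (size <= elStack.length) {
            return;                     // already large enough
        }
        // Assumption: at least double, to keep repeated pushes cheap.
        int newSize = Math.max(size, elStack.length * 2);
        String[] grown = new String[newSize];
        System.arraycopy(elStack, 0, grown, 0, elStackDepth);
        elStack = grown;
    }

    void push(String name) {
        ensureCapacity(elStackDepth + 1);
        elStack[elStackDepth++] = name;
    }
}
```

Growing ahead of need keeps allocations rare, which fits the minimized-allocation goal stated in the class description.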
-
ensureAttribs
protected void ensureAttribs(int size)
Make sure that there is enough space in the temporary attributes array.
-
resetState
protected void resetState()
-
-