Core API: Strings
construct.setglobalstringencoding(encoding)
Sets the encoding globally for all String, PascalString, CString, and GreedyString instances.
- Parameters
encoding – an encoding name such as “utf8”, or None to work with raw bytes
construct.String(length, encoding=None, padchar=b'\x00', paddir='right', trimdir='right')
A configurable, fixed-length or variable-length string field.
When parsing, the byte string is stripped of the pad character on the side given by paddir, then decoded with the given encoding. When building, the string is encoded, then padded with padchar on the side given by paddir or, if it is too long, trimmed as bytes on the side given by trimdir. Length is a constant integer or a function of the context.
The padding character and direction must be specified for padding to work. The trim direction must be specified for trimming to work.
- Parameters
length – length in bytes (not unicode characters), as int or context function
encoding – encoding (e.g. “utf8”) or None for bytes
padchar – b-string character to pad out strings (by default b"\x00")
paddir – direction to pad out strings (one of: right left center)
trimdir – direction to trim strings (one of: right left)
Example:
>>> String(10).build(b"hello")
b'hello\x00\x00\x00\x00\x00'
>>> String(10).parse(_)
b'hello'
>>> String(10).sizeof()
10
>>> String(10, encoding="utf8").build("Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00\x00'
>>> String(10, encoding="utf8").parse(_)
'Афон'
>>> String(10, padchar=b"XYZ", paddir="center").build(b"abc")
b'XXXabcXXXX'
>>> String(10, padchar=b"XYZ", paddir="center").parse(b"XYZabcXYZY")
b'abc'
>>> String(10, trimdir="right").build(b"12345678901234567890")
b'1234567890'
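The encode, pad, and trim order described for String can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper names build_fixed and parse_fixed are hypothetical, a single-byte padchar is assumed, and only the right and left directions are covered.

```python
def build_fixed(value, length, encoding=None, padchar=b"\x00",
                paddir="right", trimdir="right"):
    """Encode, then pad or trim so the result is exactly `length` bytes."""
    data = value.encode(encoding) if encoding else value
    if len(data) < length:
        padding = padchar * (length - len(data))
        if paddir == "right":
            data = data + padding
        elif paddir == "left":
            data = padding + data
        else:
            raise ValueError("this sketch only handles 'right' and 'left'")
    elif len(data) > length:
        # Trimming happens on the encoded bytes, not on characters.
        if trimdir == "right":
            data = data[:length]
        elif trimdir == "left":
            data = data[len(data) - length:]
        else:
            raise ValueError("trimdir must be 'right' or 'left'")
    return data


def parse_fixed(data, encoding=None, padchar=b"\x00", paddir="right"):
    """Strip the pad byte from the padded side, then decode."""
    if paddir == "right":
        data = data.rstrip(padchar)
    elif paddir == "left":
        data = data.lstrip(padchar)
    else:
        raise ValueError("this sketch only handles 'right' and 'left'")
    return data.decode(encoding) if encoding else data


packed = build_fixed("Афон", 10, encoding="utf8")
unpacked = parse_fixed(packed, encoding="utf8")
```

Because trimming works on encoded bytes, cutting a multi-byte encoding at the length limit can split a character, which is one reason the length is documented in bytes rather than characters.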
construct.PascalString(lengthfield, encoding=None)
A length-prefixed string.
PascalString is named after the string types of Pascal, which are length-prefixed. Lisp strings also follow this convention.
When parsing, only the string is returned; the length field does not appear as a separate entry in the result. When building, the actual length is prepended to the encoded string. The length field can be variable-length (such as VarInt). The stored length is in bytes, not characters.
- Parameters
lengthfield – a field used to parse and build the length
encoding – encoding (e.g. “utf8”) or None for bytes
Example:
>>> PascalString(VarInt, encoding="utf8").build("Афон")
b'\x08\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> PascalString(VarInt, encoding="utf8").parse(_)
'Афон'
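To make the byte-count prefix concrete, here is a minimal plain-Python sketch of a length-prefixed string. It is not how the library implements PascalString: the helpers build_prefixed and parse_prefixed are hypothetical, and a fixed one-byte length prefix is assumed in place of a configurable length field.

```python
def build_prefixed(text, encoding="utf8"):
    """Prepend the encoded length (in bytes) as a single byte."""
    data = text.encode(encoding)
    if len(data) > 255:
        raise ValueError("payload too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def parse_prefixed(data, encoding="utf8"):
    """Read the length byte, then decode exactly that many bytes."""
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("input ends before the announced length")
    return payload.decode(encoding)


packed = build_prefixed("Афон")
unpacked = parse_prefixed(packed)
```

The string "Афон" has four characters but encodes to eight bytes in UTF-8, so the prefix is 8, matching the first byte of the build output in the example above.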
construct.CString(terminators=b'\x00', encoding=None)
A string ending in a terminator byte.
CString is similar to the strings of C. By default, the terminator is the NULL byte (b'\x00'). The terminators argument can be a longer b-string, in which case any of its bytes ends parsing. The first terminator byte is used when building.
- Parameters
terminators – sequence of valid terminators, first is used when building, all are used when parsing
encoding – encoding (e.g. “utf8”) or None for bytes
Example:
>>> CString(encoding="utf8").build("Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00'
>>> CString(encoding="utf8").parse(_)
'Афон'
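The terminator rule (any listed byte stops parsing, the first one is written when building) can be illustrated with a short plain-Python sketch. The helpers build_terminated and parse_terminated are hypothetical names used only for this illustration and do not reflect the library's internals.

```python
def build_terminated(value, terminators=b"\x00", encoding=None):
    """Encode the value and append the first terminator byte."""
    data = value.encode(encoding) if encoding else value
    return data + terminators[:1]


def parse_terminated(data, terminators=b"\x00", encoding=None):
    """Collect bytes up to (not including) the first byte found in terminators."""
    for index, byte in enumerate(data):
        if byte in terminators:  # iterating bytes yields ints in Python 3
            payload = data[:index]
            break
    else:
        raise ValueError("no terminator found")
    return payload.decode(encoding) if encoding else payload


packed = build_terminated("Афон", encoding="utf8")
unpacked = parse_terminated(packed, encoding="utf8")
first_field = parse_terminated(b"key=value;rest", terminators=b";=")
```

With terminators=b";=", parsing stops at the "=" because it is the first byte in the input that appears in the terminator set, while building would append ";".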
construct.GreedyString(encoding=None)
A string that reads the rest of the stream until EOF, and writes a given string as is. If no encoding is given, this is essentially GreedyBytes.
- Parameters
encoding – encoding (e.g. “utf8”) or None for bytes
See also
Analogous to GreedyBytes, and identical to it when no encoding is used.
Example:
>>> GreedyString(encoding="utf8").build("Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> GreedyString(encoding="utf8").parse(_)
'Афон'
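As a rough illustration of "reads the rest of the stream until EOF", the following plain-Python sketch consumes whatever remains in a stream after earlier fields have been read. The helper parse_greedy and the two-byte header are assumptions made for this example, not part of the library.

```python
import io


def parse_greedy(stream, encoding=None):
    """Read everything left in the stream and optionally decode it."""
    data = stream.read()
    return data.decode(encoding) if encoding else data


# A hypothetical two-byte header followed by a UTF-8 string with no length or terminator.
stream = io.BytesIO(b"\x01\x02" + "Афон".encode("utf8"))
header = stream.read(2)
text = parse_greedy(stream, encoding="utf8")
leftover = stream.read()
```

Since nothing marks where the string ends, a field like this only makes sense as the last one in a stream; after it is parsed, nothing is left to read.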