abstract static class Utf8.Processor
extends java.lang.Object
Constructor and Description |
---|
Processor() |
Modifier and Type | Method and Description |
---|---|
(package private) abstract java.lang.String | decodeUtf8(byte[] bytes, int index, int size): Decodes the given byte array slice into a String. |
(package private) java.lang.String | decodeUtf8(java.nio.ByteBuffer buffer, int index, int size): Decodes the given portion of the ByteBuffer into a String. |
(package private) java.lang.String | decodeUtf8Default(java.nio.ByteBuffer buffer, int index, int size): Decodes ByteBuffer instances using the ByteBuffer API rather than potentially faster approaches. |
(package private) abstract java.lang.String | decodeUtf8Direct(java.nio.ByteBuffer buffer, int index, int size): Decodes direct ByteBuffer instances into String. |
(package private) abstract int | encodeUtf8(java.lang.CharSequence in, byte[] out, int offset, int length): Encodes an input character sequence (in) to UTF-8 in the target array (out). |
(package private) void | encodeUtf8(java.lang.CharSequence in, java.nio.ByteBuffer out): Encodes an input character sequence (in) to UTF-8 in the target buffer (out). |
(package private) void | encodeUtf8Default(java.lang.CharSequence in, java.nio.ByteBuffer out): Encodes the input character sequence to a ByteBuffer instance using the ByteBuffer API, rather than potentially faster approaches. |
(package private) abstract void | encodeUtf8Direct(java.lang.CharSequence in, java.nio.ByteBuffer out): Encodes the input character sequence to a direct ByteBuffer instance. |
(package private) boolean | isValidUtf8(byte[] bytes, int index, int limit): Returns true if the given byte array slice is a well-formed UTF-8 byte sequence. |
(package private) boolean | isValidUtf8(java.nio.ByteBuffer buffer, int index, int limit): Returns true if the given portion of the ByteBuffer is a well-formed UTF-8 byte sequence. |
private static int | partialIsValidUtf8(java.nio.ByteBuffer buffer, int index, int limit): Performs validation for ByteBuffer instances using the ByteBuffer API rather than potentially faster approaches. |
(package private) abstract int | partialIsValidUtf8(int state, byte[] bytes, int index, int limit): Tells whether the given byte array slice is a well-formed, malformed, or incomplete UTF-8 byte sequence. |
(package private) int | partialIsValidUtf8(int state, java.nio.ByteBuffer buffer, int index, int limit): Indicates whether or not the given buffer contains a valid UTF-8 string. |
(package private) int | partialIsValidUtf8Default(int state, java.nio.ByteBuffer buffer, int index, int limit): Performs validation for ByteBuffer instances using the ByteBuffer API rather than potentially faster approaches. |
(package private) abstract int | partialIsValidUtf8Direct(int state, java.nio.ByteBuffer buffer, int index, int limit): Performs validation for direct ByteBuffer instances. |

final boolean isValidUtf8(byte[] bytes, int index, int limit)
Returns true if the given byte array slice is a well-formed UTF-8 byte sequence. The range of bytes to be checked extends from index (inclusive) to limit (exclusive).
This is a convenience method, equivalent to partialIsValidUtf8(Utf8.COMPLETE, bytes, index, limit) == Utf8.COMPLETE.

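The half-open range convention above (index inclusive, limit exclusive) can be illustrated with a rough JDK-only sketch. Processor itself is package private, so the class `Utf8SliceCheck` and the method `isValidUtf8Slice` below are hypothetical stand-ins built on `java.nio.charset.CharsetDecoder`, not the Processor implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8SliceCheck {
    // Hypothetical stand-in for isValidUtf8(bytes, index, limit), using only the JDK.
    static boolean isValidUtf8Slice(byte[] bytes, int index, int limit) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                // wrap() takes a length, so limit - index covers [index, limit)
                .decode(ByteBuffer.wrap(bytes, index, limit - index));
            return true;
        } catch (CharacterCodingException e) {
            return false; // malformed, or truncated in the middle of a character
        }
    }

    public static void main(String[] args) {
        byte[] data = {'a', (byte) 0xC3, (byte) 0xA9, 'b'}; // "a", U+00E9, "b"
        System.out.println(isValidUtf8Slice(data, 0, 4));  // whole array
        System.out.println(isValidUtf8Slice(data, 0, 2));  // slice ends inside U+00E9
    }
}
```

A slice that ends in the middle of a multi-byte character counts as invalid here, which matches the "== Utf8.COMPLETE" comparison in the description.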
abstract int partialIsValidUtf8(int state, byte[] bytes, int index, int limit)
Tells whether the given byte array slice is a well-formed, malformed, or incomplete UTF-8 byte sequence. The range of bytes to be checked extends from index (inclusive) to limit (exclusive).
Parameters:
state - either Utf8.COMPLETE (if this is the initial decoding operation) or the value returned from a call to a partial decoding method for the previous bytes
Returns:
Utf8.MALFORMED if the partial byte sequence is definitely not well-formed, Utf8.COMPLETE if it is well-formed (no additional input needed), or if the byte sequence is "incomplete", i.e. apparently terminated in the middle of a character, an opaque integer "state" value containing enough information to decode the character when passed to a subsequent invocation of a partial decoding method.

final boolean isValidUtf8(java.nio.ByteBuffer buffer, int index, int limit)
Returns true if the given portion of the ByteBuffer is a well-formed UTF-8 byte sequence. The range of bytes to be checked extends from index (inclusive) to limit (exclusive).
This is a convenience method, equivalent to partialIsValidUtf8(Utf8.COMPLETE, buffer, index, limit) == Utf8.COMPLETE.

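The state-passing contract of the partialIsValidUtf8 methods (COMPLETE, MALFORMED, or an opaque "incomplete" state carried into the next call) is easiest to see with input split across chunks. The sketch below is an analogy, not the Processor algorithm: it assumes the JDK `CharsetDecoder`, and the hypothetical `carry` array plays the role of the opaque state by holding the bytes of a character cut off at a chunk boundary.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ChunkedUtf8Check {
    // Returns true if the concatenation of all chunks is well-formed UTF-8.
    static boolean isValidAcrossChunks(byte[]... chunks) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        byte[] carry = new byte[0]; // unfinished character from the previous chunk
        for (byte[] chunk : chunks) {
            ByteBuffer in = ByteBuffer.allocate(carry.length + chunk.length);
            in.put(carry).put(chunk).flip();
            // UTF-8 never decodes to more chars than it has bytes, so this cannot overflow.
            CharBuffer out = CharBuffer.allocate(in.remaining());
            CoderResult result = decoder.decode(in, out, false); // false: more input may follow
            if (result.isError()) {
                return false; // analogous to MALFORMED
            }
            carry = new byte[in.remaining()]; // analogous to the "incomplete" state
            in.get(carry);
        }
        // End of input: leftover bytes mean the data stopped in the middle of a character.
        CharBuffer out = CharBuffer.allocate(carry.length);
        return !decoder.decode(ByteBuffer.wrap(carry), out, true).isError();
    }
}
```

For example, the two-byte encoding of U+00E9 split as {0xC3} followed by {0xA9} is accepted, while {0xC3} on its own is rejected at end of input.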
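final int partialIsValidUtf8(int state, java.nio.ByteBuffer buffer, int index, int limit)
Indicates whether or not the given buffer contains a valid UTF-8 string.
Parameters:
buffer - the buffer to check
Returns:
Utf8.COMPLETE if the given buffer contains a valid UTF-8 string; otherwise Utf8.MALFORMED or an "incomplete" state value, as described for the byte array overload

abstract int partialIsValidUtf8Direct(int state, java.nio.ByteBuffer buffer, int index, int limit)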
Performs validation for direct ByteBuffer instances.

final int partialIsValidUtf8Default(int state, java.nio.ByteBuffer buffer, int index, int limit)
Performs validation for ByteBuffer instances using the ByteBuffer API rather than potentially faster approaches. This first completes validation for the current character (provided by state) and then finishes validation for the sequence.

private static int partialIsValidUtf8(java.nio.ByteBuffer buffer, int index, int limit)
Performs validation for ByteBuffer instances using the ByteBuffer API rather than potentially faster approaches.

abstract java.lang.String decodeUtf8(byte[] bytes, int index, int size) throws InvalidProtocolBufferException
Decodes the given byte array slice into a String.
Throws:
InvalidProtocolBufferException - if the byte array slice is not valid UTF-8.

final java.lang.String decodeUtf8(java.nio.ByteBuffer buffer, int index, int size) throws InvalidProtocolBufferException
Decodes the given portion of the ByteBuffer into a String.
Throws:
InvalidProtocolBufferException - if the portion of the buffer is not valid UTF-8.

abstract java.lang.String decodeUtf8Direct(java.nio.ByteBuffer buffer, int index, int size) throws InvalidProtocolBufferException
Decodes direct ByteBuffer instances into String.
Throws:
InvalidProtocolBufferException

final java.lang.String decodeUtf8Default(java.nio.ByteBuffer buffer, int index, int size) throws InvalidProtocolBufferException
Decodes ByteBuffer instances using the ByteBuffer API rather than potentially faster approaches.
Throws:
InvalidProtocolBufferException

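The decode methods above reject invalid input instead of repairing it, and they take a size rather than an end limit. As a rough illustration of that behaviour using only the JDK, the hypothetical `decodeStrict` below throws on malformed bytes. `IllegalArgumentException` is used purely as a stand-in for InvalidProtocolBufferException, and none of this is the Processor implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8Decode {
    // Decodes bytes[index, index + size) and fails on malformed input.
    static String decodeStrict(byte[] bytes, int index, int size) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes, index, size))
                .toString();
        } catch (CharacterCodingException e) {
            // Stand-in for InvalidProtocolBufferException (assumption for this sketch).
            throw new IllegalArgumentException("slice is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {'h', 'i', (byte) 0xFF}; // 0xFF never appears in well-formed UTF-8
        System.out.println(decodeStrict(raw, 0, 2));
        // The lenient JDK constructor substitutes U+FFFD instead of failing.
        System.out.println(new String(raw, 0, 3, StandardCharsets.UTF_8).length());
    }
}
```

By contrast, `new String(bytes, index, size, UTF_8)` silently replaces malformed sequences with U+FFFD, which is why a strict path matters when the bytes must round-trip.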
abstract int encodeUtf8(java.lang.CharSequence in, byte[] out, int offset, int length)
Encodes an input character sequence (in) to UTF-8 in the target array (out).
For a string, this method is similar to

    byte[] a = string.getBytes(UTF_8);
    System.arraycopy(a, 0, out, offset, a.length);
    return offset + a.length;

but is more efficient in both time and space. One key difference is that this method requires paired surrogates, and therefore does not support chunking. While String.getBytes(UTF_8) replaces unpaired surrogates with the default replacement character, this method throws Utf8.UnpairedSurrogateException.
To ensure sufficient space in the output buffer, either call Utf8.encodedLength(java.lang.CharSequence) to compute the exact amount needed, or leave room for Utf8.MAX_BYTES_PER_CHAR * in.length(), which is the largest possible number of bytes that any input can be encoded to.
Parameters:
in - the input character sequence to be encoded
out - the target array
offset - the starting offset in out at which to start writing
length - the number of bytes available in out, starting from offset
Returns:
the new offset, equivalent to offset + Utf8.encodedLength(in)
Throws:
Utf8.UnpairedSurrogateException - if in contains ill-formed UTF-16 (unpaired surrogates)
java.lang.ArrayIndexOutOfBoundsException - if in encoded in UTF-8 is longer than out.length - offset

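To make the contract above concrete (write at offset, return the new offset, reject unpaired surrogates rather than replace them), here is a minimal JDK-only sketch. The class `StrictUtf8ArrayEncode` and method `encodeStrict` are hypothetical, `IllegalArgumentException` stands in for Utf8.UnpairedSurrogateException, and the real Processor is presumably hand-optimised rather than built on `CharsetEncoder`.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8ArrayEncode {
    // Encodes `in` into out[offset, offset + length) and returns the new offset.
    static int encodeStrict(CharSequence in, byte[] out, int offset, int length) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        // The buffer's position starts at offset, so position() is an index into `out`.
        ByteBuffer target = ByteBuffer.wrap(out, offset, length);
        CoderResult result = encoder.encode(CharBuffer.wrap(in), target, true);
        if (result.isOverflow()) {
            throw new ArrayIndexOutOfBoundsException("encoded form does not fit in the target range");
        }
        if (result.isError()) {
            // Stand-in for Utf8.UnpairedSurrogateException (assumption for this sketch).
            throw new IllegalArgumentException("input contains an unpaired surrogate");
        }
        return target.position(); // offset + number of bytes written
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        System.out.println(encodeStrict("h\u00E9", buf, 2, 6));
        // The lenient JDK path substitutes a replacement byte for the lone surrogate.
        System.out.println("\uD800".getBytes(StandardCharsets.UTF_8).length);
    }
}
```

On the JDK, `String.getBytes(UTF_8)` turns a lone surrogate into the single byte '?', so the strict variant is the only one of the two that reports the problem to the caller.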
final void encodeUtf8(java.lang.CharSequence in, java.nio.ByteBuffer out)
Encodes an input character sequence (in) to UTF-8 in the target buffer (out). Upon returning from this method, the out position will point to the position after the last encoded byte. This method requires paired surrogates, and therefore does not support chunking.
To ensure sufficient space in the output buffer, either call Utf8.encodedLength(java.lang.CharSequence) to compute the exact amount needed, or leave room for Utf8.MAX_BYTES_PER_CHAR * in.length(), which is the largest possible number of bytes that any input can be encoded to.
Parameters:
in - the source character sequence to be encoded
out - the target buffer
Throws:
Utf8.UnpairedSurrogateException - if in contains ill-formed UTF-16 (unpaired surrogates)
java.lang.ArrayIndexOutOfBoundsException - if in encoded in UTF-8 is longer than out.remaining()

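The two sizing strategies mentioned above (exact length versus a worst-case bound per char) and the position update on the target buffer can be sketched as follows. Everything here is illustrative: `utf8Length` is a hypothetical counterpart to Utf8.encodedLength, the local constant assumes that at most 3 UTF-8 bytes are needed per UTF-16 code unit, and `encodeInto` relies on the JDK encoder rather than the Processor.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8BufferSizing {
    // Assumed worst case: a BMP char needs up to 3 bytes; a surrogate pair (2 chars) needs 4.
    static final int MAX_BYTES_PER_CHAR = 3;

    // Exact UTF-8 length of a well-formed UTF-16 sequence.
    static int utf8Length(CharSequence in) {
        int bytes = 0;
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c < 0x80) {
                bytes += 1;
            } else if (c < 0x800) {
                bytes += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < in.length()
                    && Character.isLowSurrogate(in.charAt(i + 1))) {
                bytes += 4;
                i++; // consume the low surrogate as well
            } else if (Character.isSurrogate(c)) {
                throw new IllegalArgumentException("unpaired surrogate at index " + i);
            } else {
                bytes += 3;
            }
        }
        return bytes;
    }

    // Encodes into `out`, advancing its position past the last encoded byte.
    static void encodeInto(CharSequence in, ByteBuffer out) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        CoderResult result = encoder.encode(CharBuffer.wrap(in), out, true);
        if (result.isOverflow()) {
            throw new ArrayIndexOutOfBoundsException("input does not fit in out.remaining()");
        }
        if (result.isError()) {
            throw new IllegalArgumentException("input contains an unpaired surrogate");
        }
    }
}
```

Allocating `MAX_BYTES_PER_CHAR * in.length()` bytes avoids a counting pass at the cost of unused space, while `utf8Length` gives a tight fit but walks the input twice.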
abstract void encodeUtf8Direct(java.lang.CharSequence in, java.nio.ByteBuffer out)
Encodes the input character sequence to a direct ByteBuffer instance.

final void encodeUtf8Default(java.lang.CharSequence in, java.nio.ByteBuffer out)
Encodes the input character sequence to a ByteBuffer instance using the ByteBuffer API, rather than potentially faster approaches.
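
This page does not spell out how the ByteBuffer overloads choose between the array-based, Direct, and Default variants. One plausible reading, stated here purely as an assumption, is a dispatch on the kind of buffer: an accessible backing array first, then direct memory, then the generic ByteBuffer API as a fallback. The enum `Path` and method `choosePath` are hypothetical names for this sketch.

```java
import java.nio.ByteBuffer;

final class BufferKindDispatch {
    enum Path { ARRAY, DIRECT, DEFAULT }

    // Assumed selection order; not confirmed by the documentation above.
    static Path choosePath(ByteBuffer buffer) {
        if (buffer.hasArray()) {
            return Path.ARRAY;   // heap buffer with an accessible byte[]: use the byte[] overload
        }
        if (buffer.isDirect()) {
            return Path.DIRECT;  // off-heap buffer: use the *Direct variant
        }
        return Path.DEFAULT;     // e.g. a read-only heap buffer: use the *Default variant
    }

    public static void main(String[] args) {
        System.out.println(choosePath(ByteBuffer.allocate(4)));
        System.out.println(choosePath(ByteBuffer.allocateDirect(4)));
        System.out.println(choosePath(ByteBuffer.allocate(4).asReadOnlyBuffer()));
    }
}
```

A read-only heap buffer is the typical case that reaches the fallback, because `hasArray()` returns false for read-only buffers even when they are array-backed.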