static final class Utf8.UnsafeProcessor extends Utf8.Processor
A Utf8.Processor that uses sun.misc.Unsafe where possible to improve performance.

Constructor and Description |
---|
UnsafeProcessor() |

Modifier and Type | Method and Description |
---|---|
(package private) java.lang.String | decodeUtf8(byte[] bytes, int index, int size) - Decodes the given byte array slice into a String. |
(package private) java.lang.String | decodeUtf8Direct(java.nio.ByteBuffer buffer, int index, int size) - Decodes direct ByteBuffer instances into String. |
(package private) int | encodeUtf8(java.lang.CharSequence in, byte[] out, int offset, int length) - Encodes an input character sequence (in) to UTF-8 in the target array (out). |
(package private) void | encodeUtf8Direct(java.lang.CharSequence in, java.nio.ByteBuffer out) - Encodes the input character sequence to a direct ByteBuffer instance. |
(package private) static boolean | isAvailable() - Indicates whether or not all required unsafe operations are supported on this platform. |
private static int | partialIsValidUtf8(byte[] bytes, long offset, int remaining) |
(package private) int | partialIsValidUtf8(int state, byte[] bytes, int index, int limit) - Tells whether the given byte array slice is a well-formed, malformed, or incomplete UTF-8 byte sequence. |
private static int | partialIsValidUtf8(long address, int remaining) |
(package private) int | partialIsValidUtf8Direct(int state, java.nio.ByteBuffer buffer, int index, int limit) - Performs validation for direct ByteBuffer instances. |
private static int | unsafeEstimateConsecutiveAscii(byte[] bytes, long offset, int maxChars) - Counts (approximately) the number of consecutive ASCII characters starting from the given position, using the most efficient method available to the platform. |
private static int | unsafeEstimateConsecutiveAscii(long address, int maxChars) - Same as Utf8.estimateConsecutiveAscii(ByteBuffer, int, int) except that it uses the most efficient method available to the platform. |
private static int | unsafeIncompleteStateFor(byte[] bytes, int byte1, long offset, int remaining) |
private static int | unsafeIncompleteStateFor(long address, int byte1, int remaining) |
Methods inherited from class Utf8.Processor: decodeUtf8, decodeUtf8Default, encodeUtf8, encodeUtf8Default, isValidUtf8, isValidUtf8, partialIsValidUtf8, partialIsValidUtf8Default
static boolean isAvailable()
int partialIsValidUtf8(int state, byte[] bytes, int index, int limit)
Description copied from class: Utf8.Processor
Tells whether the given byte array slice is a well-formed, malformed, or incomplete UTF-8 byte sequence. The range of bytes checked extends from index, inclusive, to limit, exclusive.
Overrides: partialIsValidUtf8 in class Utf8.Processor
Parameters:
state - either Utf8.COMPLETE (if this is the initial decoding operation) or the value returned from a call to a partial decoding method for the previous bytes
Returns: Utf8.MALFORMED if the partial byte sequence is definitely not well-formed; Utf8.COMPLETE if it is well-formed (no additional input needed); or, if the byte sequence is "incomplete", i.e. apparently terminated in the middle of a character, an opaque integer "state" value containing enough information to decode the character when passed to a subsequent invocation of a partial decoding method.
int partialIsValidUtf8Direct(int state, java.nio.ByteBuffer buffer, int index, int limit)
Description copied from class: Utf8.Processor
Performs validation for direct ByteBuffer instances.
Overrides: partialIsValidUtf8Direct in class Utf8.Processor
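The three outcomes described for partialIsValidUtf8 (well-formed, malformed, or stopped in the middle of a character) can be illustrated with a small chunked validator. This is a sketch under stated assumptions, not the protobuf implementation: it uses the JDK's CharsetDecoder as a stand-in, the names ChunkedUtf8Check and Result are hypothetical, and it carries the bytes of an unfinished character in a buffer instead of packing them into an opaque int state.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class ChunkedUtf8Check {
  enum Result { COMPLETE, INCOMPLETE, MALFORMED }

  /** Validates a byte stream that arrives as a sequence of chunks. */
  static Result validate(byte[]... chunks) {
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    // Bytes of a character that was cut off at the end of the previous chunk.
    ByteBuffer carry = ByteBuffer.allocate(0);
    CharBuffer sink = CharBuffer.allocate(64); // decoded chars are discarded
    for (byte[] chunk : chunks) {
      ByteBuffer in = ByteBuffer.allocate(carry.remaining() + chunk.length);
      in.put(carry).put(chunk).flip();
      while (true) {
        sink.clear();
        // endOfInput = false: a truncated trailing character is not an error yet.
        CoderResult result = decoder.decode(in, sink, false);
        if (result.isError()) {
          return Result.MALFORMED;
        }
        if (result.isUnderflow()) {
          break; // everything decodable in this chunk has been consumed
        }
        // Overflow: the sink was full, so loop and keep decoding.
      }
      carry = in.slice(); // whatever is left belongs to an unfinished character
    }
    return carry.hasRemaining() ? Result.INCOMPLETE : Result.COMPLETE;
  }
}
```

With this sketch, validate(new byte[] {(byte) 0xC3}) reports INCOMPLETE, while passing a second chunk containing 0xA9 completes the two-byte character and reports COMPLETE. The carried bytes play the role that the returned state value plays in the documented method.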
java.lang.String decodeUtf8(byte[] bytes, int index, int size) throws InvalidProtocolBufferException
Description copied from class: Utf8.Processor
Decodes the given byte array slice into a String.
Overrides: decodeUtf8 in class Utf8.Processor
Throws: InvalidProtocolBufferException - if the byte array slice is not valid UTF-8.
java.lang.String decodeUtf8Direct(java.nio.ByteBuffer buffer, int index, int size) throws InvalidProtocolBufferException
Description copied from class: Utf8.Processor
Decodes direct ByteBuffer instances into String.
Overrides: decodeUtf8Direct in class Utf8.Processor
Throws: InvalidProtocolBufferException
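Both decode methods above reject invalid input instead of substituting replacement characters. As a rough illustration of that strict behavior using only the JDK, the sketch below decodes a slice of a byte array and a slice of a direct ByteBuffer, and fails on malformed bytes. The class StrictUtf8Decode and its method names are hypothetical, and CharacterCodingException stands in here for InvalidProtocolBufferException.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class StrictUtf8Decode {

  /** Decodes bytes[index, index + size) or throws if the slice is not valid UTF-8. */
  static String decodeSlice(byte[] bytes, int index, int size)
      throws CharacterCodingException {
    return strictDecode(ByteBuffer.wrap(bytes, index, size));
  }

  /** Same contract for a (possibly direct) buffer, using absolute indices. */
  static String decodeDirect(ByteBuffer buffer, int index, int size)
      throws CharacterCodingException {
    ByteBuffer view = buffer.duplicate(); // leave the caller's position and limit alone
    view.clear();
    view.limit(index + size);
    view.position(index);
    return strictDecode(view);
  }

  private static String strictDecode(ByteBuffer in) throws CharacterCodingException {
    return StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)      // throw instead of inserting U+FFFD
        .onUnmappableCharacter(CodingErrorAction.REPORT)
        .decode(in)
        .toString();
  }
}
```

By contrast, new String(bytes, index, size, StandardCharsets.UTF_8) never throws on malformed input; it silently substitutes replacement characters, which is why a strict decoder is configured explicitly here.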
int encodeUtf8(java.lang.CharSequence in, byte[] out, int offset, int length)
Description copied from class: Utf8.Processor
Encodes an input character sequence (in) to UTF-8 in the target array (out). For a string, this method is similar to
    byte[] a = string.getBytes(UTF_8);
    System.arraycopy(a, 0, bytes, offset, a.length);
    return offset + a.length;
but is more efficient in both time and space. One key difference is that this method requires paired surrogates, and therefore does not support chunking. While String.getBytes(UTF_8) replaces unpaired surrogates with the default replacement character, this method throws Utf8.UnpairedSurrogateException.
To ensure sufficient space in the output buffer, either call Utf8.encodedLength(java.lang.CharSequence) to compute the exact amount needed, or leave room for Utf8.MAX_BYTES_PER_CHAR * sequence.length(), which is the largest possible number of bytes that any input can be encoded to.
Overrides: encodeUtf8 in class Utf8.Processor
Parameters:
in - the input character sequence to be encoded
out - the target array
offset - the starting offset in bytes to start writing at
length - the length of the bytes, starting from offset
Returns: the new offset, offset + Utf8.encodedLength(sequence)
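To make the contrast with String.getBytes(UTF_8) concrete, the sketch below places the quoted copy-based baseline next to a strict variant that writes directly into the target range and rejects unpaired surrogates. It is an illustration built on the JDK's CharsetEncoder, not the protobuf code: EncodeBaseline and both method names are hypothetical, CharacterCodingException stands in for Utf8.UnpairedSurrogateException, and throwing ArrayIndexOutOfBoundsException on a full buffer is a choice made for this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EncodeBaseline {

  /** The copy-based baseline: lenient, and allocates a temporary array. */
  static int encodeLenient(String s, byte[] out, int offset) {
    byte[] a = s.getBytes(StandardCharsets.UTF_8); // unpaired surrogates become '?'
    System.arraycopy(a, 0, out, offset, a.length);
    return offset + a.length;
  }

  /** Strict variant: writes into out[offset, offset + length) with no temporary copy. */
  static int encodeStrict(CharSequence in, byte[] out, int offset, int length)
      throws CharacterCodingException {
    ByteBuffer target = ByteBuffer.wrap(out, offset, length);
    CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
        .onMalformedInput(CodingErrorAction.REPORT)       // unpaired surrogate -> error
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    CoderResult result = encoder.encode(CharBuffer.wrap(in), target, true);
    if (result.isError()) {
      result.throwException();
    }
    if (result.isOverflow()) {
      throw new ArrayIndexOutOfBoundsException("output range too small");
    }
    encoder.flush(target);
    return target.position(); // the new offset into 'out'
  }
}
```

For sizing, a single UTF-16 char never needs more than three UTF-8 bytes (a supplementary code point takes four bytes but occupies two chars), so a buffer of 3 * in.length() bytes is always large enough for this sketch. That Utf8.MAX_BYTES_PER_CHAR has the value 3 is an assumption here, not something the text above states.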
void encodeUtf8Direct(java.lang.CharSequence in, java.nio.ByteBuffer out)
Description copied from class: Utf8.Processor
Encodes the input character sequence to a direct ByteBuffer instance.
Overrides: encodeUtf8Direct in class Utf8.Processor
private static int unsafeEstimateConsecutiveAscii(byte[] bytes, long offset, int maxChars)
Counts (approximately) the number of consecutive ASCII characters starting from the given position, using the most efficient method available to the platform.
Parameters:
bytes - the array containing the character sequence
offset - the offset position of the index (same as index + arrayBaseOffset)
maxChars - the maximum number of characters to count
private static int unsafeEstimateConsecutiveAscii(long address, int maxChars)
Same as Utf8.estimateConsecutiveAscii(ByteBuffer, int, int) except that it uses the most efficient method available to the platform.
private static int partialIsValidUtf8(byte[] bytes, long offset, int remaining)
private static int partialIsValidUtf8(long address, int remaining)
private static int unsafeIncompleteStateFor(byte[] bytes, int byte1, long offset, int remaining)
private static int unsafeIncompleteStateFor(long address, int byte1, int remaining)
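One plausible reading of why unsafeEstimateConsecutiveAscii counts only "approximately" is that it inspects several bytes per step. The sketch below shows that word-at-a-time idea under stated assumptions: it reads eight bytes at once through ByteBuffer.getLong rather than sun.misc.Unsafe, the class AsciiRunEstimate is hypothetical, and the short-input cutoff of 16 is an arbitrary value chosen for this example. The result is a lower bound on the true run length, since scanning stops at the start of the first word that contains a non-ASCII byte.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class AsciiRunEstimate {
  // One high bit per byte lane: any set bit means a byte >= 0x80, i.e. non-ASCII.
  private static final long HIGH_BITS = 0x8080808080808080L;
  // Assumed cutoff below which word-at-a-time scanning is not worth the setup.
  private static final int MIN_LENGTH = 16;

  /**
   * Returns a lower bound on the number of consecutive ASCII bytes in
   * bytes[index, index + maxChars). The caller must ensure the range is in bounds.
   */
  static int estimateConsecutiveAscii(byte[] bytes, int index, int maxChars) {
    if (maxChars < MIN_LENGTH) {
      return 0; // let the caller fall back to a byte-by-byte loop
    }
    ByteBuffer view = ByteBuffer.wrap(bytes);
    int i = index;
    int end = index + maxChars;
    // Test eight bytes per iteration; byte order does not matter for the mask.
    while (end - i >= 8 && (view.getLong(i) & HIGH_BITS) == 0) {
      i += 8;
    }
    return i - index;
  }
}
```

A caller would typically treat the returned count as a safe prefix to copy as ASCII and then continue with a byte-by-byte loop from that point, which handles both the remaining ASCII bytes in the last word and any multi-byte sequences.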