Possible consequences of bug:
1) Denial of service by causing a crash
Possible when all of the following apply:
- Untrusted data is passed to pb_decode()
- The top-level message contains a static string field as the first field.
Causes a single write of '0' byte to 1 byte before the message struct.
2) Remote code execution
Possible when all of the following apply:
- 64-bit platform
- The message or a submessage contains a static string field.
- Decoding directly from a custom pb_istream_t
- bytes_left on the stream is set to larger than 4 GB
Causes a write of up to 4 GB of data past the string field.
--
Detailed analysis follows
In the following consideration, I define "platform bitness" as equal to
number of bits in size_t datatype. Therefore most 8-bit platforms are
regarded as 16-bit for the purposes of this discussion.
1. The overflow in pb_dec_string
The overflow happens in this computation:
uint32_t size;
size_t alloc_size;
alloc_size = size + 1;
There are two ways in which the overflow can occur: In the uint32_t
addition, or in the cast to size_t. This depends on the platform
bitness.
On 32- and 64-bit platforms, the size has to be UINT32_MAX for the
overflow to occur. In that case alloc_size will be 0.
On 16-bit platforms, overflow will happen whenever size is more than
UINT16_MAX, and resulting alloc_size is attacker controlled.
For static fields, the alloc_size value is just checked against the
field data size. For pointer fields, the alloc_size value is passed to
malloc(). End result in both cases is the same, the storage is 0 or
just a few bytes in length.
On 16-bit platforms, another overflow occurs in the call to pb_read(),
when passing the original size. An attacker will want the passed value
to be larger than the alloc_size, therefore the only reasonable choice
is to have size = UINT16_MAX and alloc_size = 0. Any larger multiple
will truncate to the same values.
At this point we have read atleast the tag and the string length of the
message, i.e. atleast 3 bytes. The maximum initial value for stream
bytes_left is SIZE_MAX, thus at this point at most SIZE_MAX-3 bytes are
remaining.
On 32-bit and 16-bit platforms this means that the size passed to
pb_read() is always larger than the number of remaining bytes. This
causes pb_read() to fail immediately, before reading any bytes.
On 64-bit platforms, it is possible for the bytes_left value to be set
to a value larger than UINT32_MAX, which is the wraparound point in
size calculation. In this case pb_read() will succeed and write up to 4
GB of attacker controlled data over the RAM that comes after the string
field.
On all platforms, there is an unconditional write of a terminating null
byte. Because the size of size_t typically reflects the size of the
processor address space, a write at UINT16_MAX or UINT32_MAX bytes
after the string field actually wraps back to before the string field.
Consequently, on 32-bit and 16-bit platforms, the bug causes a single
write of '0' byte at one byte before the string field.
If the string field is in the middle of a message, this will just
corrupt other data in the message struct. Because the message contents
is attacker controlled anyway, this is a non-issue. However, if the
string field is the first field in the top-level message, it can
corrupt other data on the stack/heap before it. Typically a single '0'
write at a location not controlled by attacker is enough only for a
denial-of-service attack.
When using pointer fields and malloc(), the attacker controlled
alloc_size will cause a 0-size allocation to happen. By the same logic
as before, on 32-bit and 16-bit platforms this causes a '0' byte write
only. On 64-bit platforms, however, it will again allow up to 4 GB of
malicious data to be written over memory, if the stream length allows
the read.
2. The overflow in pb_dec_bytes
This overflow happens in the PB_BYTES_ARRAY_T_ALLOCSIZE macro:
The computation is done in size_t data type this time. This means that
an overflow is possible only when n is larger than SIZE_MAX -
offsetof(..). The offsetof value in this case is equal to
sizeof(pb_size_t) bytes.
Because the incoming size value is limited to 32 bits, no overflow can
happen here on 64-bit platforms.
The size will be passed to pb_read(). Like before, on 32-bit and 16-bit
platforms the read will always fail before writing anything.
This leaves only the write of bdest->size as exploitable. On statically
allocated fields, the size field will always be allocated, regardless
of alloc_size. In this case, no buffer overflow is possible here, but
user code could possibly use the attacker controlled size value and
read past a buffer.
If the field is allocated through malloc(), this will allow a write of
sizeof(pb_size_t) attacker controlled bytes to past a 0-byte long
buffer. In typical malloc implementations, this will either fit in
unused alignment padding area, or cause a heap corruption and a crash.
Under very exceptional situation it could allow attacker to influence
the behaviour of malloc(), possibly jumping into an attacker-controlled
location and thus leading to remote code execution.
This allows slight optimizations if only memory buffer support
(as opposed to stream callbacks) is wanted. On ARM difference
is -12% execution time, -4% code size when enabled.
This avoids doing 64-bit arithmetic for 32-bit varint decodings.
It does increase the code size somewhat.
Results for ARM Cortex-M3: -10% execution time, +1% code size, -2% ram usage.
In the pb_istream_from_buffer and pb_ostream_from_buffer, memcpy was
used to transfer values to the buffer. For the common case of
count = 1-10 bytes, a simple loop is faster.
This allows ignoring fields that are unnecessary or too large for an
embedded system using nanopb, while allowing them to remain in the .proto
for other platforms.
Update issue 51
Status: FixedInGit
It is not necessary to maintain full compatibility of generated files
for all of eternity, but this test will warn us if there is
a need to regenerate the files.
The __COUNTER__ macro (used for generating unique names) is at least supported
by gcc, clang and Visual Studio. With this change test_compiles.c is
compilable, since no more typedefs are redefined.
Compilers/Preprocessors not supporting __COUNTER__ error's are still possible
which are hopfully handled by the usage of __LINE__ in most sittuations.
Added unit test for the problem.
Rationale: it's easy to implement the callback wrong. Doing so introduces
io errors when unknown fields are present in the input. If code is not
tested with unknown fields, these bugs can remain hidden for long time.
Added a special case for the memory buffer stream, where it gives a small
speed benefit.
Added testcase for skipping fields with test_decode2 implementation.
Update issue 37
Status: FixedInGit