From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, zhichao.gao@intel.com
Cc: Michael D Kinney <michael.d.kinney@intel.com>,
Liming Gao <liming.gao@intel.com>,
Marvin Hauser <mhaeuser@outlook.de>
Subject: Re: [edk2-devel] [PATCH 0/3] MdePkg/BaseLib: Base64Decode: Make it follow its specification
Date: Mon, 1 Jul 2019 13:02:12 +0200 [thread overview]
Message-ID: <c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com> (raw)
In-Reply-To: <20190628035746.24160-1-zhichao.gao@intel.com>
On 06/28/19 05:57, Gao, Zhichao wrote:
> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>
> Adjust the coding style.
> Set DestinationSize before return.
> Add addition decription for the RETURN_SUCCESS case.
>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Hauser <mhaeuser@outlook.de>
> Cc: Laszlo Ersek <lersek@redhat.com>
>
> Zhichao Gao (3):
> MdePkg/BaseLib: Adjust the coding style in Base64Decode
> MdePkg/BaseLib: Base64Decode: Make DestinationSize complied to spec
> MdePkg/BaseLib: Base64Decode: Add decription for RETURN_SUCCESS
>
> MdePkg/Library/BaseLib/String.c | 28 ++++++++++++++++++----------
> 1 file changed, 18 insertions(+), 10 deletions(-)
>
Issues that have not been addressed by this patch set, but should be:
(1) the leading comment says, "Produce Null-terminated binary data in
the output buffer". That's bogus, the binary output is never
NUL-terminated (nor should it be).
(2) One of the RETURN_INVALID_PARAMETER cases is documented as:
"If SourceLength or DestinationSize is bigger than (MAX_ADDRESS
-(UINTN)Destination )."
There are two problems with this.
(2a) SourceLength has nothing to do with Destination. The comment should
be updated -- making sure that (Source + SourceLength) do not overflow
MAX_ADDRESS is worthwhile, but the comment is misleading.
(2b) The code that actually performs the check is off by one. What we
need is that the byte *one past* each buffer still be expressible as a
valid address, so mathematically we need
(UINTN)Buffer + BufferLength <= MAX_ADDRESS
If you reorder this for a C expression that cannot overflow, you get
BufferLength <= MAX_ADDRESS - (UINTN)Buffer
If you negate this, to express the failure condition, you get
BufferLength > MAX_ADDRESS - (UINTN)Buffer
Therefore, the comment for RETURN_INVALID_PARAMETER that says
DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination)
is correct; *however*, the code is wrong:
*DestinationSize >= (MAX_ADDRESS - (UINTN)Destination)
This failure condition is too lax (IOW, the success condition is too
strict), and shuld be restricted (IOW the success condition should be
relaxed).
(3) Maintaining a signed integer (INTN) BufferSize, and then doing
arithmetic on it with UINTN values, is really bad practice. In
particular, the following expression makes me nervous:
BufferSize += ActualSourceLength / 4 * 3;
A separate UINTN variable called "EqualSigns" should be introduced,
BufferSize should be made an UINTN, and the logic should be reworked
using those.
(4) "DecodingTable" should be called "mDecodingTable".
(5) The decoding loop checks
(SourceIndex < SourceLength) && (DestinationIndex < *DestinationSize)
and we have an inner loop
do {
Chr = DecodingTable[(UINT8) Source[SourceIndex++]];
} while (Chr == BAD_V);
Now consider the following case. The caller passes in a valid
(*DestinationSize) that is larger than what is requried for the
decoding. In addition, assume that the input data (Source), which is
otherwise completely valid, is terminated with a space character (or
even with a NUL character, although NUL-termination is not required by
the function's specification).
In the above situation, the innermost loop, which scans for BAD_V, will
fall off the end of Source. The outermost loop condition will evaluate
to TRUE (we have some Source characters left -- namely, one space, or
NUL), and we have room in the Destination buffer too (the caller
specified / allocated a larger DestinationSize thatn what is required).
So we reach the innermost loop, and the space character (BAD_V) will
lead it right off the end of Source.
The outermost loop condition should be changed.
First, the SourceIndex subcondition should be dropped altogether.
Second, *DestinationSize should be set to the actual decoded data size
*before* starting the actual decoding. Then, if we still have output
bytes to produce, in the outermost loop, the Source scanning in the
innermost BAD_V loop is guaranteed to remain in-bounds.
... Honestly, at this point, I sort of wish we just rewrote this
function from zero. The current *approach* of the function is wrong. The
function currently forms a mental image of how the input data "should"
look, and tries to parse that -- it tries to shoehorn the input into the
"expected" format. If the input does not look like the expectation, we
run into gaps here and there.
Instead, the function should follow a state machine approach, where the
outermost loop scans input characters one by one, and makes *absolutely
no assumption* about the character that has just been found. Every UINT8
character in the input should be checked against the full possible UINT8
domain (valid BASE64 range, the equal sign, tolerated whitespace, and
the rest), and acted upon accordingly.
For example, valid BASE64 characters should be accumulated into a 24-bit
value, and flushed when the latter becomes full, and also at the end of
the scanning loop.
Counting vs. decoding can be implemented by making just the flushing
operation conditional (do not write to memory).
Thanks
Laszlo
next prev parent reply other threads:[~2019-07-01 11:02 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-28 3:57 [PATCH 0/3] MdePkg/BaseLib: Base64Decode: Make it follow its specification Gao, Zhichao
2019-06-28 3:57 ` [PATCH 1/3] MdePkg/BaseLib: Adjust the coding style in Base64Decode Gao, Zhichao
2019-06-28 14:26 ` [edk2-devel] " Laszlo Ersek
2019-07-01 9:24 ` Laszlo Ersek
2019-07-01 9:55 ` Laszlo Ersek
2019-06-28 3:57 ` [PATCH 2/3] MdePkg/BaseLib: Base64Decode: Make DestinationSize complied to spec Gao, Zhichao
2019-07-01 11:03 ` [edk2-devel] " Laszlo Ersek
2019-06-28 3:57 ` [PATCH 3/3] MdePkg/BaseLib: Base64Decode: Add decription for RETURN_SUCCESS Gao, Zhichao
2019-07-01 9:54 ` [edk2-devel] " Laszlo Ersek
2019-06-28 14:28 ` [edk2-devel] [PATCH 0/3] MdePkg/BaseLib: Base64Decode: Make it follow its specification Laszlo Ersek
2019-07-01 11:02 ` Laszlo Ersek [this message]
2019-07-01 11:11 ` Laszlo Ersek
2019-07-01 18:01 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox