public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site
@ 2019-07-02 10:28 Laszlo Ersek
  2019-07-02 10:28 ` [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl Laszlo Ersek
                   ` (4 more replies)
  0 siblings, 5 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-02 10:28 UTC (permalink / raw)
  To: edk2-devel-groups-io
  Cc: Ard Biesheuvel, Jordan Justen, Liming Gao, Marvin Häuser,
	Michael D Kinney, Philippe Mathieu-Daudé, Zhichao Gao

Repo:   https://github.com/lersek/edk2.git
Branch: base64_decode_bz1891

Base64Decode() has a number of issues; see

- <https://bugzilla.tianocore.org/show_bug.cgi?id=1891>

- and the mailing list discussion linked from
  <https://bugzilla.tianocore.org/show_bug.cgi?id=1891#c6>.

In my opinion, rewriting Base64Decode() from scratch, using a different
(state machine-based) approach is safer / more robust than attempting to
identify and patch up individual problems in the current implementation.
The emphasis of the proposed implementation is to reject invalid input;
decoding valid input is kind of secondary. (This is the safe approach
for all parsers that process untrusted input, in my opinion.)

My understanding is that unit tests for Base64Decode() already exist in
some repository. While I tested the new implementation through OvmfPkg's
EnrollDefaultKeys application -- which makes the sole calls to
Base64Decode() in the open source edk2 tree -- I didn't run a unit test
suite. Help with that (pointers to the test suite, or actual unit
testing) would be highly appreciated.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Marvin Häuser <mhaeuser@outlook.de>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Zhichao Gao <zhichao.gao@intel.com>

Thanks
Laszlo

Laszlo Ersek (3):
  MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  MdePkg/BaseLib: rewrite Base64Decode()
  OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling

 MdePkg/Include/Library/BaseLib.h              |  99 ++++-
 MdePkg/Library/BaseLib/String.c               | 448 +++++++++++++-------
 OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c |  10 +-
 3 files changed, 374 insertions(+), 183 deletions(-)

-- 
2.19.1.3.g30247aa5d201


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-02 10:28 [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
@ 2019-07-02 10:28 ` Laszlo Ersek
  2019-07-16  8:38   ` Philippe Mathieu-Daudé
  2019-07-16 10:49   ` Laszlo Ersek
  2019-07-02 10:28 ` [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode() Laszlo Ersek
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-02 10:28 UTC (permalink / raw)
  To: edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney,
	Philippe Mathieu-Daudé, Zhichao Gao

Rewrite Base64Decode() from scratch, due to reasons listed in the second
reference below.

As first step, redo the interface contract, and replace the current
implementation with a stub that asserts FALSE, then fails.

Cc: Liming Gao <liming.gao@intel.com>
Cc: Marvin Häuser <mhaeuser@outlook.de>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Zhichao Gao <zhichao.gao@intel.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 MdePkg/Include/Library/BaseLib.h |  99 +++++--
 MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
 2 files changed, 168 insertions(+), 216 deletions(-)

diff --git a/MdePkg/Include/Library/BaseLib.h b/MdePkg/Include/Library/BaseLib.h
index ebd7dd274cf4..5ef03e24edb1 100644
--- a/MdePkg/Include/Library/BaseLib.h
+++ b/MdePkg/Include/Library/BaseLib.h
@@ -2785,31 +2785,94 @@ Base64Encode (
   );
 
 /**
-  Convert Base64 ascii string to binary data based on RFC4648.
+  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
+  RFC4648.
 
-  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
-  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
+  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
 
-  @param Source          Input ASCII characters
-  @param SourceLength    Number of ASCII characters
-  @param Destination     Pointer to output buffer
-  @param DestinationSize Caller is responsible for passing in buffer of at least DestinationSize.
-                         Set 0 to get the size needed. Set to bytes stored on return.
+  Whitespace is ignored at all positions:
+  - 0x09 ('\t') horizontal tab
+  - 0x0A ('\n') new line
+  - 0x0B ('\v') vertical tab
+  - 0x0C ('\f') form feed
+  - 0x0D ('\r') carriage return
+  - 0x20 (' ')  space
 
-  @retval RETURN_SUCCESS             When binary buffer is filled in.
-  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
-  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
-  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
-  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
+  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
+  and enforced at the end of the Base64 ASCII encoded data, and only there.
 
- **/
+  Other characters outside of the encoding alphabet cause the function to
+  reject the Base64 ASCII encoded data.
+
+  @param[in] Source               Array of CHAR8 elements containing the Base64
+                                  ASCII encoding. May be NULL if SourceSize is
+                                  zero.
+
+  @param[in] SourceSize           Number of CHAR8 elements in Source.
+
+  @param[out] Destination         Array of UINT8 elements receiving the decoded
+                                  8-bit binary representation. Allocated by the
+                                  caller. May be NULL if DestinationSize is
+                                  zero on input. If NULL, decoding is
+                                  performed, but the 8-bit binary
+                                  representation is not stored. If non-NULL and
+                                  the function returns an error, the contents
+                                  of Destination are indeterminate.
+
+  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
+                                  the caller allocated for Destination. On
+                                  output, if the function returns
+                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
+                                  the number of UINT8 elements that are
+                                  required for decoding the Base64 ASCII
+                                  representation. If the function returns a
+                                  value different from both RETURN_SUCCESS and
+                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
+                                  is indeterminate on output.
+
+  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
+                                    been decoded to on-output DestinationSize
+                                    UINT8 elements at Destination. Note that
+                                    RETURN_SUCCESS covers the case when
+                                    DestinationSize is zero on input, and
+                                    Source decodes to zero bytes (due to
+                                    containing at most ignored whitespace).
+
+  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
+                                    large enough for decoding SourceSize CHAR8
+                                    elements at Source. The required number of
+                                    UINT8 elements has been stored to
+                                    DestinationSize.
+
+  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
+
+  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
+
+  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
+                                    not zero on input.
+
+  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
+                                    SourceSize) would wrap around MAX_ADDRESS.
+
+  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
+                                    DestinationSize) would wrap around
+                                    MAX_ADDRESS, as specified on input.
+
+  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
+                                    and CHAR8[SourceSize] at Source overlaps
+                                    UINT8[DestinationSize] at Destination, as
+                                    specified on input.
+
+  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
+                                    Source.
+**/
 RETURN_STATUS
 EFIAPI
 Base64Decode (
-  IN  CONST CHAR8  *Source,
-  IN        UINTN   SourceLength,
-  OUT       UINT8  *Destination  OPTIONAL,
-  IN OUT    UINTN  *DestinationSize
+  IN     CONST CHAR8 *Source          OPTIONAL,
+  IN     UINTN       SourceSize,
+  OUT    UINT8       *Destination     OPTIONAL,
+  IN OUT UINTN       *DestinationSize
   );
 
 /**
diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
index 32e189791cb8..f8397035c32a 100644
--- a/MdePkg/Library/BaseLib/String.c
+++ b/MdePkg/Library/BaseLib/String.c
@@ -1757,45 +1757,10 @@ AsciiStrToUnicodeStr (
 
 #endif
 
-//
-// The basis for Base64 encoding is RFC 4686 https://tools.ietf.org/html/rfc4648
-//
-// RFC 4686 has a number of MAY and SHOULD cases.  This implementation chooses
-// the more restrictive versions for security concerns (see RFC 4686 section 3.3).
-//
-// A invalid character, if encountered during the decode operation, causes the data
-// to be rejected. In addition, the '=' padding character is only allowed at the end
-// of the Base64 encoded string.
-//
-#define BAD_V  99
-
 STATIC CHAR8 EncodingTable[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                 "abcdefghijklmnopqrstuvwxyz"
                                 "0123456789+/";
 
-STATIC UINT8 DecodingTable[] = {
-  //
-  // Valid characters ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
-  // Also, set '=' as a zero for decoding
-  // 0  ,            1,           2,           3,            4,           5,            6,           7,           8,            9,           a,            b,            c,           d,            e,            f
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //   0
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  10
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,     62,  BAD_V,  BAD_V,  BAD_V,     63,   //  20
-     52,     53,     54,     55,     56,     57,     58,     59,     60,     61,  BAD_V,  BAD_V,  BAD_V,      0,  BAD_V,  BAD_V,   //  30
-  BAD_V,      0,      1,      2,      3,      4,      5,      6,      7,      8,      9,     10,     11,     12,     13,     14,   //  40
-     15,     16,     17,     18,     19,     20,     21,     22,     23,     24,     25,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  50
-  BAD_V,     26,     27,     28,     29,     30,     31,     32,     33,     34,     35,     36,     37,     38,     39,     40,   //  60
-     41,     42,     43,     44,     45,     46,     47,     48,     49,     50,     51,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  70
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  80
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  90
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  a0
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  b0
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  c0
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  d0
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  d0
-  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V    //  f0
-};
-
 /**
   Convert binary data to a Base64 encoded ascii string based on RFC4648.
 
@@ -1918,174 +1883,98 @@ Base64Encode (
 }
 
 /**
-  Convert Base64 ascii string to binary data based on RFC4648.
-
-  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
-  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
-
-  @param Source            Input ASCII characters
-  @param SourceLength      Number of ASCII characters
-  @param Destination       Pointer to output buffer
-  @param DestinationSize   Caller is responsible for passing in buffer of at least DestinationSize.
-                           Set 0 to get the size needed. Set to bytes stored on return.
-
-  @retval RETURN_SUCCESS             When binary buffer is filled in.
-  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
-  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
-  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
-  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
- **/
+  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
+  RFC4648.
+
+  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
+
+  Whitespace is ignored at all positions:
+  - 0x09 ('\t') horizontal tab
+  - 0x0A ('\n') new line
+  - 0x0B ('\v') vertical tab
+  - 0x0C ('\f') form feed
+  - 0x0D ('\r') carriage return
+  - 0x20 (' ')  space
+
+  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
+  and enforced at the end of the Base64 ASCII encoded data, and only there.
+
+  Other characters outside of the encoding alphabet cause the function to
+  reject the Base64 ASCII encoded data.
+
+  @param[in] Source               Array of CHAR8 elements containing the Base64
+                                  ASCII encoding. May be NULL if SourceSize is
+                                  zero.
+
+  @param[in] SourceSize           Number of CHAR8 elements in Source.
+
+  @param[out] Destination         Array of UINT8 elements receiving the decoded
+                                  8-bit binary representation. Allocated by the
+                                  caller. May be NULL if DestinationSize is
+                                  zero on input. If NULL, decoding is
+                                  performed, but the 8-bit binary
+                                  representation is not stored. If non-NULL and
+                                  the function returns an error, the contents
+                                  of Destination are indeterminate.
+
+  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
+                                  the caller allocated for Destination. On
+                                  output, if the function returns
+                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
+                                  the number of UINT8 elements that are
+                                  required for decoding the Base64 ASCII
+                                  representation. If the function returns a
+                                  value different from both RETURN_SUCCESS and
+                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
+                                  is indeterminate on output.
+
+  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
+                                    been decoded to on-output DestinationSize
+                                    UINT8 elements at Destination. Note that
+                                    RETURN_SUCCESS covers the case when
+                                    DestinationSize is zero on input, and
+                                    Source decodes to zero bytes (due to
+                                    containing at most ignored whitespace).
+
+  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
+                                    large enough for decoding SourceSize CHAR8
+                                    elements at Source. The required number of
+                                    UINT8 elements has been stored to
+                                    DestinationSize.
+
+  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
+
+  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
+
+  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
+                                    not zero on input.
+
+  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
+                                    SourceSize) would wrap around MAX_ADDRESS.
+
+  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
+                                    DestinationSize) would wrap around
+                                    MAX_ADDRESS, as specified on input.
+
+  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
+                                    and CHAR8[SourceSize] at Source overlaps
+                                    UINT8[DestinationSize] at Destination, as
+                                    specified on input.
+
+  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
+                                    Source.
+**/
 RETURN_STATUS
 EFIAPI
 Base64Decode (
-  IN  CONST CHAR8  *Source,
-  IN        UINTN   SourceLength,
-  OUT       UINT8  *Destination   OPTIONAL,
-  IN OUT    UINTN  *DestinationSize
+  IN     CONST CHAR8 *Source          OPTIONAL,
+  IN     UINTN       SourceSize,
+  OUT    UINT8       *Destination     OPTIONAL,
+  IN OUT UINTN       *DestinationSize
   )
 {
-
-  UINT32   Value;
-  CHAR8    Chr;
-  INTN     BufferSize;
-  UINTN    SourceIndex;
-  UINTN    DestinationIndex;
-  UINTN    Index;
-  UINTN    ActualSourceLength;
-
-  //
-  // Check pointers are not NULL
-  //
-  if ((Source == NULL) || (DestinationSize == NULL)) {
-    return RETURN_INVALID_PARAMETER;
-  }
-
-  //
-  // Check if SourceLength or  DestinationSize is valid
-  //
-  if ((SourceLength >= (MAX_ADDRESS - (UINTN)Source)) || (*DestinationSize >= (MAX_ADDRESS - (UINTN)Destination))){
-    return RETURN_INVALID_PARAMETER;
-  }
-
-  ActualSourceLength = 0;
-  BufferSize = 0;
-
-  //
-  // Determine the actual number of valid characters in the string.
-  // All invalid characters except selected white space characters,
-  // will cause the Base64 string to be rejected. White space to allow
-  // properly formatted XML will be ignored.
-  //
-  // See section 3.3 of RFC 4648.
-  //
-  for (SourceIndex = 0; SourceIndex < SourceLength; SourceIndex++) {
-
-    //
-    // '=' is part of the quantum
-    //
-    if (Source[SourceIndex] == '=') {
-      ActualSourceLength++;
-      BufferSize--;
-
-      //
-      // Only two '=' characters can be valid.
-      //
-      if (BufferSize < -2) {
-        return RETURN_INVALID_PARAMETER;
-      }
-    }
-    else {
-      Chr = Source[SourceIndex];
-      if (BAD_V != DecodingTable[(UINT8) Chr]) {
-
-        //
-        // The '=' characters are only valid at the end, so any
-        // valid character after an '=', will be flagged as an error.
-        //
-        if (BufferSize < 0) {
-          return RETURN_INVALID_PARAMETER;
-        }
-          ActualSourceLength++;
-      }
-        else {
-
-        //
-        // The reset of the decoder will ignore all invalid characters allowed here.
-        // Ignoring selected white space is useful.  In this case, the decoder will
-        // ignore ' ', '\t', '\n', and '\r'.
-        //
-        if ((Chr != ' ') &&(Chr != '\t') &&(Chr != '\n') &&(Chr != '\r')) {
-          return RETURN_INVALID_PARAMETER;
-        }
-      }
-    }
-  }
-
-  //
-  // The Base64 character string must be a multiple of 4 character quantums.
-  //
-  if (ActualSourceLength % 4 != 0) {
-    return RETURN_INVALID_PARAMETER;
-  }
-
-  BufferSize += ActualSourceLength / 4 * 3;
-    if (BufferSize < 0) {
-      return RETURN_INVALID_PARAMETER;
-  }
-
-  //
-  // BufferSize is >= 0
-  //
-  if ((Destination == NULL) || (*DestinationSize < (UINTN) BufferSize)) {
-    *DestinationSize = BufferSize;
-    return RETURN_BUFFER_TOO_SMALL;
-  }
-
-  //
-  // If no decodable characters, return a size of zero. RFC 4686 test vector 1.
-  //
-  if (ActualSourceLength == 0) {
-    *DestinationSize = 0;
-    return RETURN_SUCCESS;
-  }
-
-  //
-  // Input data is verified to be a multiple of 4 valid charcters.  Process four
-  // characters at a time. Uncounted (ie. invalid)  characters will be ignored.
-  //
-  for (SourceIndex = 0, DestinationIndex = 0; (SourceIndex < SourceLength) && (DestinationIndex < *DestinationSize); ) {
-    Value = 0;
-
-    //
-    // Get 24 bits of data from 4 input characters, each character representing 6 bits
-    //
-    for (Index = 0; Index < 4; Index++) {
-      do {
-      Chr = DecodingTable[(UINT8) Source[SourceIndex++]];
-      } while (Chr == BAD_V);
-      Value <<= 6;
-      Value |= (UINT32)Chr;
-    }
-
-    //
-    // Store 3 bytes of binary data (24 bits)
-    //
-    *Destination++ = (UINT8) (Value >> 16);
-    DestinationIndex++;
-
-    //
-    // Due to the '=' special cases for the two bytes at the end,
-    // we have to check the length and not store the padding data
-    //
-    if (DestinationIndex++ < *DestinationSize) {
-      *Destination++ = (UINT8) (Value >>  8);
-    }
-    if (DestinationIndex++ < *DestinationSize) {
-      *Destination++ = (UINT8) Value;
-    }
-  }
-
-  return RETURN_SUCCESS;
+  ASSERT (FALSE);
+  return RETURN_INVALID_PARAMETER;
 }
 
 /**
-- 
2.19.1.3.g30247aa5d201



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-02 10:28 [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
  2019-07-02 10:28 ` [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl Laszlo Ersek
@ 2019-07-02 10:28 ` Laszlo Ersek
  2019-07-12  2:31   ` [edk2-devel] " Gao, Zhichao
                     ` (2 more replies)
  2019-07-02 10:28 ` [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling Laszlo Ersek
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-02 10:28 UTC (permalink / raw)
  To: edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney,
	Philippe Mathieu-Daudé, Zhichao Gao

Rewrite Base64Decode() from scratch, due to reasons listed in the second
reference below.

Implement Base64Decode() according to the specification added in the
previous patch. The decoder scans the input buffer once, it has no inner
loop(s), and it spills each output byte as soon as the output byte is
complete.

Cc: Liming Gao <liming.gao@intel.com>
Cc: Marvin Häuser <mhaeuser@outlook.de>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Zhichao Gao <zhichao.gao@intel.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
 1 file changed, 247 insertions(+), 2 deletions(-)

diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
index f8397035c32a..6198ccbc9672 100644
--- a/MdePkg/Library/BaseLib/String.c
+++ b/MdePkg/Library/BaseLib/String.c
@@ -1973,8 +1973,253 @@ Base64Decode (
   IN OUT UINTN       *DestinationSize
   )
 {
-  ASSERT (FALSE);
-  return RETURN_INVALID_PARAMETER;
+  BOOLEAN PaddingMode;
+  UINTN   SixBitGroupsConsumed;
+  UINT32  Accumulator;
+  UINTN   OriginalDestinationSize;
+  UINTN   SourceIndex;
+
+  if (DestinationSize == NULL) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  //
+  // Check Source array validity.
+  //
+  if (Source == NULL) {
+    if (SourceSize > 0) {
+      //
+      // At least one CHAR8 element at NULL Source.
+      //
+      return RETURN_INVALID_PARAMETER;
+    }
+  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
+    //
+    // Non-NULL Source, but it wraps around.
+    //
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  //
+  // Check Destination array validity.
+  //
+  if (Destination == NULL) {
+    if (*DestinationSize > 0) {
+      //
+      // At least one UINT8 element at NULL Destination.
+      //
+      return RETURN_INVALID_PARAMETER;
+    }
+  } else if (*DestinationSize > MAX_ADDRESS - (UINTN)Destination) {
+    //
+    // Non-NULL Destination, but it wraps around.
+    //
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  //
+  // Check for overlap.
+  //
+  if (Source != NULL && Destination != NULL) {
+    //
+    // Both arrays have been provided, and we know from earlier that each array
+    // is valid in itself.
+    //
+    if ((UINTN)Source + SourceSize <= (UINTN)Destination) {
+      //
+      // Source array precedes Destination array, OK.
+      //
+    } else if ((UINTN)Destination + *DestinationSize <= (UINTN)Source) {
+      //
+      // Destination array precedes Source array, OK.
+      //
+    } else {
+      //
+      // Overlap.
+      //
+      return RETURN_INVALID_PARAMETER;
+    }
+  }
+
+  //
+  // Decoding loop setup.
+  //
+  PaddingMode             = FALSE;
+  SixBitGroupsConsumed    = 0;
+  Accumulator             = 0;
+  OriginalDestinationSize = *DestinationSize;
+  *DestinationSize        = 0;
+
+  //
+  // Decoding loop.
+  //
+  for (SourceIndex = 0; SourceIndex < SourceSize; SourceIndex++) {
+    CHAR8  SourceChar;
+    UINT32 Base64Value;
+    UINT8  DestinationOctet;
+
+    SourceChar = Source[SourceIndex];
+
+    //
+    // Whitespace is ignored at all positions (regardless of padding mode).
+    //
+    if (SourceChar == '\t' || SourceChar == '\n' || SourceChar == '\v' ||
+        SourceChar == '\f' || SourceChar == '\r' || SourceChar == ' ') {
+      continue;
+    }
+
+    //
+    // If we're in padding mode, accept another padding character, as long as
+    // that padding character completes the quantum. This completes case (2)
+    // from RFC4648, Chapter 4. "Base 64 Encoding":
+    //
+    // (2) The final quantum of encoding input is exactly 8 bits; here, the
+    //     final unit of encoded output will be two characters followed by two
+    //     "=" padding characters.
+    //
+    if (PaddingMode) {
+      if (SourceChar == '=' && SixBitGroupsConsumed == 3) {
+        SixBitGroupsConsumed = 0;
+        continue;
+      }
+      return RETURN_INVALID_PARAMETER;
+    }
+
+    //
+    // When not in padding mode, decode Base64Value based on RFC4648, "Table 1:
+    // The Base 64 Alphabet".
+    //
+    if ('A' <= SourceChar && SourceChar <= 'Z') {
+      Base64Value = SourceChar - 'A';
+    } else if ('a' <= SourceChar && SourceChar <= 'z') {
+      Base64Value = 26 + (SourceChar - 'a');
+    } else if ('0' <= SourceChar && SourceChar <= '9') {
+      Base64Value = 52 + (SourceChar - '0');
+    } else if (SourceChar == '+') {
+      Base64Value = 62;
+    } else if (SourceChar == '/') {
+      Base64Value = 63;
+    } else if (SourceChar == '=') {
+      //
+      // Enter padding mode.
+      //
+      PaddingMode = TRUE;
+
+      if (SixBitGroupsConsumed == 2) {
+        //
+        // If we have consumed two 6-bit groups from the current quantum before
+        // encountering the first padding character, then this is case (2) from
+        // RFC4648, Chapter 4. "Base 64 Encoding". Bump SixBitGroupsConsumed,
+        // and we'll enforce another padding character.
+        //
+        SixBitGroupsConsumed = 3;
+      } else if (SixBitGroupsConsumed == 3) {
+        //
+        // If we have consumed three 6-bit groups from the current quantum
+        // before encountering the first padding character, then this is case
+        // (3) from RFC4648, Chapter 4. "Base 64 Encoding". The quantum is now
+        // complete.
+        //
+        SixBitGroupsConsumed = 0;
+      } else {
+        //
+        // Padding characters are not allowed at the first two positions of a
+        // quantum.
+        //
+        return RETURN_INVALID_PARAMETER;
+      }
+
+      //
+      // Wherever in a quantum we enter padding mode, we enforce the padding
+      // bits pending in the accumulator -- from the last 6-bit group just
+      // preceding the padding character -- to be zero. Refer to RFC4648,
+      // Chapter 3.5. "Canonical Encoding".
+      //
+      if (Accumulator != 0) {
+        return RETURN_INVALID_PARAMETER;
+      }
+
+      //
+      // Advance to the next source character.
+      //
+      continue;
+    } else {
+      //
+      // Other characters outside of the encoding alphabet are rejected.
+      //
+      return RETURN_INVALID_PARAMETER;
+    }
+
+    //
+    // Feed the bits of the current 6-bit group of the quantum to the
+    // accumulator.
+    //
+    Accumulator = (Accumulator << 6) | Base64Value;
+    SixBitGroupsConsumed++;
+    switch (SixBitGroupsConsumed) {
+    case 1:
+      //
+      // No octet to spill after consuming the first 6-bit group of the
+      // quantum; advance to the next source character.
+      //
+      continue;
+    case 2:
+      //
+      // 12 bits accumulated (6 pending + 6 new); prepare for spilling an
+      // octet. 4 bits remain pending.
+      //
+      DestinationOctet = (UINT8)(Accumulator >> 4);
+      Accumulator &= 0xF;
+      break;
+    case 3:
+      //
+      // 10 bits accumulated (4 pending + 6 new); prepare for spilling an
+      // octet. 2 bits remain pending.
+      //
+      DestinationOctet = (UINT8)(Accumulator >> 2);
+      Accumulator &= 0x3;
+      break;
+    default:
+      ASSERT (SixBitGroupsConsumed == 4);
+      //
+      // 8 bits accumulated (2 pending + 6 new); prepare for spilling an octet.
+      // The quantum is complete, 0 bits remain pending.
+      //
+      DestinationOctet = (UINT8)Accumulator;
+      Accumulator = 0;
+      SixBitGroupsConsumed = 0;
+      break;
+    }
+
+    //
+    // Store the decoded octet if there's room left. Increment
+    // (*DestinationSize) unconditionally.
+    //
+    if (*DestinationSize < OriginalDestinationSize) {
+      ASSERT (Destination != NULL);
+      Destination[*DestinationSize] = DestinationOctet;
+    }
+    (*DestinationSize)++;
+
+    //
+    // Advance to the next source character.
+    //
+  }
+
+  //
+  // If Source terminates mid-quantum, then Source is invalid.
+  //
+  if (SixBitGroupsConsumed != 0) {
+    return RETURN_INVALID_PARAMETER;
+  }
+
+  //
+  // Done.
+  //
+  if (*DestinationSize <= OriginalDestinationSize) {
+    return RETURN_SUCCESS;
+  }
+  return RETURN_BUFFER_TOO_SMALL;
 }
 
 /**
-- 
2.19.1.3.g30247aa5d201



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling
  2019-07-02 10:28 [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
  2019-07-02 10:28 ` [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl Laszlo Ersek
  2019-07-02 10:28 ` [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode() Laszlo Ersek
@ 2019-07-02 10:28 ` Laszlo Ersek
  2019-07-15 21:58   ` [edk2-devel] " Laszlo Ersek
  2019-07-10  9:20 ` [edk2-devel] [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
  2019-07-16 22:02 ` Laszlo Ersek
  4 siblings, 1 reply; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-02 10:28 UTC (permalink / raw)
  To: edk2-devel-groups-io
  Cc: Ard Biesheuvel, Jordan Justen, Philippe Mathieu-Daudé

Base64Decode() now guarantees that DestinationSize is larger on output
than it was on input if RETURN_BUFFER_TOO_SMALL is returned. Clean up the
retval handling for the first Base64Decode() call in EnrollDefaultKeys,
which used to work around the ambiguity in the previous Base64Decode()
interface contract.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
index f45cb799f726..302b80d97720 100644
--- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
+++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
@@ -154,14 +154,8 @@ GetPkKek1 (
   Status = Base64Decode (Base64Cert, Base64CertLen, NULL, &DecodedCertSize);
   switch (Status) {
   case EFI_BUFFER_TOO_SMALL:
-    if (DecodedCertSize > 0) {
-      break;
-    }
-    //
-    // Fall through: the above Base64Decode() call is ill-specified in BaseLib
-    // if Source decodes to zero bytes (for example if it consists of ignored
-    // whitespace only).
-    //
+    ASSERT (DecodedCertSize > 0);
+    break;
   case EFI_SUCCESS:
     AsciiPrint ("error: empty certificate after app prefix %g\n",
       &gOvmfPkKek1AppPrefixGuid);
-- 
2.19.1.3.g30247aa5d201


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site
  2019-07-02 10:28 [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
                   ` (2 preceding siblings ...)
  2019-07-02 10:28 ` [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling Laszlo Ersek
@ 2019-07-10  9:20 ` Laszlo Ersek
  2019-07-16 22:02 ` Laszlo Ersek
  4 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-10  9:20 UTC (permalink / raw)
  To: edk2-devel-groups-io
  Cc: Ard Biesheuvel, Jordan Justen, Liming Gao, Marvin Häuser,
	Michael D Kinney, Philippe Mathieu-Daudé, Zhichao Gao

On 07/02/19 12:28, Laszlo Ersek wrote:
> Repo:   https://github.com/lersek/edk2.git
> Branch: base64_decode_bz1891
> 
> Base64Decode() has a number of issues; see
> 
> - <https://bugzilla.tianocore.org/show_bug.cgi?id=1891>
> 
> - and the mailing list discussion linked from
>   <https://bugzilla.tianocore.org/show_bug.cgi?id=1891#c6>.
> 
> In my opinion, rewriting Base64Decode() from scratch, using a different
> (state machine-based) approach is safer / more robust than attempting to
> identify and patch up individual problems in the current implementation.
> The emphasis of the proposed implementation is to reject invalid input;
> decoding valid input is kind of secondary. (This is the safe approach
> for all parsers that process untrusted input, in my opinion.)
> 
> My understanding is that unit tests for Base64Decode() already exist in
> some repository. While I tested the new implementation through OvmfPkg's
> EnrollDefaultKeys application -- which makes the sole calls to
> Base64Decode() in the open source edk2 tree -- I didn't run a unit test
> suite. Help with that (pointers to the test suite, or actual unit
> testing) would be highly appreciated.
> 
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
> 
> Thanks
> Laszlo
> 
> Laszlo Ersek (3):
>   MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
>   MdePkg/BaseLib: rewrite Base64Decode()
>   OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling
> 
>  MdePkg/Include/Library/BaseLib.h              |  99 ++++-
>  MdePkg/Library/BaseLib/String.c               | 448 +++++++++++++-------
>  OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c |  10 +-
>  3 files changed, 374 insertions(+), 183 deletions(-)
> 

Ping.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-02 10:28 ` [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode() Laszlo Ersek
@ 2019-07-12  2:31   ` Gao, Zhichao
  2019-07-12 19:31     ` Laszlo Ersek
  2019-07-15 18:44   ` mhaeuser
  2019-07-16 10:05   ` Philippe Mathieu-Daudé
  2 siblings, 1 reply; 26+ messages in thread
From: Gao, Zhichao @ 2019-07-12  2:31 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com
  Cc: Gao, Liming, Marvin Häuser, Kinney, Michael D,
	Philippe Mathieu-Daudé

Sorry for late respond.
The whole code is OK for me. And I write a tiny test for it without the memory address check. See https://github.com/ZhichaoGao/edk2/commit/615356ba32d3147957d215debd844e7709f06849 . It is tested in Emulator environment. If it is OK, I think you can take my Tested-by for this patch. If there are some missing, please let me know.

Base64Decode parameter Source is indicate as OPTIONAL. Although it is OK to be NULL, and return DestinationSize to be zero with success status to indicate no decode occurred . I don't know if it is meaningful to report the result as that. In my opinion, NULL Source is an invalid input. Just my opinion, if the maintainer is OK with that, I am OK too.

Some other minor comment about the block scope variable blow.

> -----Original Message-----
> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of
> Laszlo Ersek
> Sent: Tuesday, July 2, 2019 6:29 PM
> To: edk2-devel-groups-io <devel@edk2.groups.io>
> Cc: Gao, Liming <liming.gao@intel.com>; Marvin Häuser
> <mhaeuser@outlook.de>; Kinney, Michael D <michael.d.kinney@intel.com>;
> Philippe Mathieu-Daudé <philmd@redhat.com>; Gao, Zhichao
> <zhichao.gao@intel.com>
> Subject: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
>
> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> reference below.
>
> Implement Base64Decode() according to the specification added in the
> previous patch. The decoder scans the input buffer once, it has no inner
> loop(s), and it spills each output byte as soon as the output byte is complete.
>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-
> a7149760d19a@redhat.com
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
>  1 file changed, 247 insertions(+), 2 deletions(-)
>
> diff --git a/MdePkg/Library/BaseLib/String.c
> b/MdePkg/Library/BaseLib/String.c index f8397035c32a..6198ccbc9672
> 100644
> --- a/MdePkg/Library/BaseLib/String.c
> +++ b/MdePkg/Library/BaseLib/String.c
> @@ -1973,8 +1973,253 @@ Base64Decode (
>    IN OUT UINTN       *DestinationSize
>    )
>  {
> -  ASSERT (FALSE);
> -  return RETURN_INVALID_PARAMETER;
> +  BOOLEAN PaddingMode;
> +  UINTN   SixBitGroupsConsumed;
> +  UINT32  Accumulator;
> +  UINTN   OriginalDestinationSize;
> +  UINTN   SourceIndex;
> +
> +  if (DestinationSize == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check Source array validity.
> +  //
> +  if (Source == NULL) {
> +    if (SourceSize > 0) {
> +      //
> +      // At least one CHAR8 element at NULL Source.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
> +    //
> +    // Non-NULL Source, but it wraps around.
> +    //
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check Destination array validity.
> +  //
> +  if (Destination == NULL) {
> +    if (*DestinationSize > 0) {
> +      //
> +      // At least one UINT8 element at NULL Destination.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  } else if (*DestinationSize > MAX_ADDRESS - (UINTN)Destination) {
> +    //
> +    // Non-NULL Destination, but it wraps around.
> +    //
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check for overlap.
> +  //
> +  if (Source != NULL && Destination != NULL) {
> +    //
> +    // Both arrays have been provided, and we know from earlier that each
> array
> +    // is valid in itself.
> +    //
> +    if ((UINTN)Source + SourceSize <= (UINTN)Destination) {
> +      //
> +      // Source array precedes Destination array, OK.
> +      //
> +    } else if ((UINTN)Destination + *DestinationSize <= (UINTN)Source) {
> +      //
> +      // Destination array precedes Source array, OK.
> +      //
> +    } else {
> +      //
> +      // Overlap.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  }
> +
> +  //
> +  // Decoding loop setup.
> +  //
> +  PaddingMode             = FALSE;
> +  SixBitGroupsConsumed    = 0;
> +  Accumulator             = 0;
> +  OriginalDestinationSize = *DestinationSize;
> +  *DestinationSize        = 0;
> +
> +  //
> +  // Decoding loop.
> +  //
> +  for (SourceIndex = 0; SourceIndex < SourceSize; SourceIndex++) {
> +    CHAR8  SourceChar;
> +    UINT32 Base64Value;
> +    UINT8  DestinationOctet;

The local variable define in a block scope. For CSS_2_1 Section 5.4.1.1 "Block (local) Scope". It is strongly discouraged. Maybe we should move them to the beginning of the function.

Thanks,
Zhichao

> +
> +    SourceChar = Source[SourceIndex];
> +
> +    //
> +    // Whitespace is ignored at all positions (regardless of padding mode).
> +    //
> +    if (SourceChar == '\t' || SourceChar == '\n' || SourceChar == '\v' ||
> +        SourceChar == '\f' || SourceChar == '\r' || SourceChar == ' ') {
> +      continue;
> +    }
> +


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-12  2:31   ` [edk2-devel] " Gao, Zhichao
@ 2019-07-12 19:31     ` Laszlo Ersek
  2019-07-15 15:22       ` Liming Gao
  0 siblings, 1 reply; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-12 19:31 UTC (permalink / raw)
  To: Gao, Zhichao, devel@edk2.groups.io, Gao, Liming,
	Kinney, Michael D
  Cc: Marvin Häuser, Philippe Mathieu-Daudé

On 07/12/19 04:31, Gao, Zhichao wrote:
> Sorry for late respond.
> The whole code is OK for me. And I write a tiny test for it without the memory address check. See https://github.com/ZhichaoGao/edk2/commit/615356ba32d3147957d215debd844e7709f06849 . It is tested in Emulator environment. If it is OK, I think you can take my Tested-by for this patch. If there are some missing, please let me know.

Thanks for writing that test app. It seems to have pretty good coverage.
I like that it covers the exit points systematically.

Mike, Liming: I intend to pick up Zhichao's T-b, from above. If you are
aware of another test suite (perhaps used in conjunction with the
originally contributed Base64Decode() impl), please let me know.

> Base64Decode parameter Source is indicate as OPTIONAL. Although it is OK to be NULL, and return DestinationSize to be zero with success status to indicate no decode occurred . I don't know if it is meaningful to report the result as that. In my opinion, NULL Source is an invalid input. Just my opinion, if the maintainer is OK with that, I am OK too.

I think that it is helpful. It is similar to the standard C function
free() accepting NULL (it is a no-op). Edk2's FreePool() doesn't accept
NULL, and that is inconvenient sometimes.

Consider the following use case. There is some code that optionally
receives data and then decodes it from base64. By default that data is
represented as a NULL pointer, with size 0. If Base64Decode() does not
accept NULL input with size 0, then the code in question has to
implement a separate check for that. But it keeps the call site simpler
if Base64Decode() explicitly specifies "do nothing", for those parameter
values.

Consider CopyMem() and CompareMem() too. In practice, if you pass
Length=0, they will work just fine, even in case the buffers are NULL.
These functions will exhibit the expected / intuitive behavior -- they
will not dereference NULL. However, because their interface contracts
don't spell out this case as valid (i.e. OPTIONAL), *technically* such
calls are not permitted, and so every CopyMem() and CompareMem() call
site has to be surrounded with if's, if NULL buffers with size 0 are
otherwise possible there. That's a waste -- it duplicates checks and it
makes the call site inconvenient.

NULL input with nonzero size is caught and rejected (as described in the
function's specification). "OPTIONAL" does not mean that the buffer can
be NULL under *any* circumstances -- it means that it can be NULL under
*some* (specified) circumstances.

> 
> Some other minor comment about the block scope variable blow.
> 
>> -----Original Message-----
>> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of
>> Laszlo Ersek
>> Sent: Tuesday, July 2, 2019 6:29 PM
>> To: edk2-devel-groups-io <devel@edk2.groups.io>
>> Cc: Gao, Liming <liming.gao@intel.com>; Marvin Häuser
>> <mhaeuser@outlook.de>; Kinney, Michael D <michael.d.kinney@intel.com>;
>> Philippe Mathieu-Daudé <philmd@redhat.com>; Gao, Zhichao
>> <zhichao.gao@intel.com>
>> Subject: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
>>
>> Rewrite Base64Decode() from scratch, due to reasons listed in the second
>> reference below.
>>
>> Implement Base64Decode() according to the specification added in the
>> previous patch. The decoder scans the input buffer once, it has no inner
>> loop(s), and it spills each output byte as soon as the output byte is complete.
>>
>> Cc: Liming Gao <liming.gao@intel.com>
>> Cc: Marvin Häuser <mhaeuser@outlook.de>
>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Cc: Zhichao Gao <zhichao.gao@intel.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-
>> a7149760d19a@redhat.com
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>  MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
>>  1 file changed, 247 insertions(+), 2 deletions(-)
>>
>> diff --git a/MdePkg/Library/BaseLib/String.c
>> b/MdePkg/Library/BaseLib/String.c index f8397035c32a..6198ccbc9672
>> 100644
>> --- a/MdePkg/Library/BaseLib/String.c
>> +++ b/MdePkg/Library/BaseLib/String.c
>> @@ -1973,8 +1973,253 @@ Base64Decode (
>>    IN OUT UINTN       *DestinationSize
>>    )
>>  {
>> -  ASSERT (FALSE);
>> -  return RETURN_INVALID_PARAMETER;
>> +  BOOLEAN PaddingMode;
>> +  UINTN   SixBitGroupsConsumed;
>> +  UINT32  Accumulator;
>> +  UINTN   OriginalDestinationSize;
>> +  UINTN   SourceIndex;
>> +
>> +  if (DestinationSize == NULL) {
>> +    return RETURN_INVALID_PARAMETER;
>> +  }
>> +
>> +  //
>> +  // Check Source array validity.
>> +  //
>> +  if (Source == NULL) {
>> +    if (SourceSize > 0) {
>> +      //
>> +      // At least one CHAR8 element at NULL Source.
>> +      //
>> +      return RETURN_INVALID_PARAMETER;
>> +    }
>> +  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
>> +    //
>> +    // Non-NULL Source, but it wraps around.
>> +    //
>> +    return RETURN_INVALID_PARAMETER;
>> +  }
>> +
>> +  //
>> +  // Check Destination array validity.
>> +  //
>> +  if (Destination == NULL) {
>> +    if (*DestinationSize > 0) {
>> +      //
>> +      // At least one UINT8 element at NULL Destination.
>> +      //
>> +      return RETURN_INVALID_PARAMETER;
>> +    }
>> +  } else if (*DestinationSize > MAX_ADDRESS - (UINTN)Destination) {
>> +    //
>> +    // Non-NULL Destination, but it wraps around.
>> +    //
>> +    return RETURN_INVALID_PARAMETER;
>> +  }
>> +
>> +  //
>> +  // Check for overlap.
>> +  //
>> +  if (Source != NULL && Destination != NULL) {
>> +    //
>> +    // Both arrays have been provided, and we know from earlier that each
>> array
>> +    // is valid in itself.
>> +    //
>> +    if ((UINTN)Source + SourceSize <= (UINTN)Destination) {
>> +      //
>> +      // Source array precedes Destination array, OK.
>> +      //
>> +    } else if ((UINTN)Destination + *DestinationSize <= (UINTN)Source) {
>> +      //
>> +      // Destination array precedes Source array, OK.
>> +      //
>> +    } else {
>> +      //
>> +      // Overlap.
>> +      //
>> +      return RETURN_INVALID_PARAMETER;
>> +    }
>> +  }
>> +
>> +  //
>> +  // Decoding loop setup.
>> +  //
>> +  PaddingMode             = FALSE;
>> +  SixBitGroupsConsumed    = 0;
>> +  Accumulator             = 0;
>> +  OriginalDestinationSize = *DestinationSize;
>> +  *DestinationSize        = 0;
>> +
>> +  //
>> +  // Decoding loop.
>> +  //
>> +  for (SourceIndex = 0; SourceIndex < SourceSize; SourceIndex++) {
>> +    CHAR8  SourceChar;
>> +    UINT32 Base64Value;
>> +    UINT8  DestinationOctet;
> 
> The local variable define in a block scope. For CSS_2_1 Section 5.4.1.1 "Block (local) Scope". It is strongly discouraged. Maybe we should move them to the beginning of the function.

Yes, I can do that. Thanks for pointing out the exact location in the
CCS. Before I post v2 though, I'd like to hear back from the MdePkg
maintainers as well.

Thank you!
Laszlo

>> +
>> +    SourceChar = Source[SourceIndex];
>> +
>> +    //
>> +    // Whitespace is ignored at all positions (regardless of padding mode).
>> +    //
>> +    if (SourceChar == '\t' || SourceChar == '\n' || SourceChar == '\v' ||
>> +        SourceChar == '\f' || SourceChar == '\r' || SourceChar == ' ') {
>> +      continue;
>> +    }
>> +
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-12 19:31     ` Laszlo Ersek
@ 2019-07-15 15:22       ` Liming Gao
  2019-07-15 21:56         ` Laszlo Ersek
  0 siblings, 1 reply; 26+ messages in thread
From: Liming Gao @ 2019-07-15 15:22 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com, Gao, Zhichao,
	Kinney, Michael D
  Cc: Marvin H?user, Philippe Mathieu-Daudé

Laszlo:

> -----Original Message-----
> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Laszlo Ersek
> Sent: Saturday, July 13, 2019 3:31 AM
> To: Gao, Zhichao <zhichao.gao@intel.com>; devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; Kinney, Michael D
> <michael.d.kinney@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé <philmd@redhat.com>
> Subject: Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
> 
> On 07/12/19 04:31, Gao, Zhichao wrote:
> > Sorry for late respond.
> > The whole code is OK for me. And I write a tiny test for it without the memory address check. See
> https://github.com/ZhichaoGao/edk2/commit/615356ba32d3147957d215debd844e7709f06849 . It is tested in Emulator environment. If
> it is OK, I think you can take my Tested-by for this patch. If there are some missing, please let me know.
> 
> Thanks for writing that test app. It seems to have pretty good coverage.
> I like that it covers the exit points systematically.
> 
> Mike, Liming: I intend to pick up Zhichao's T-b, from above. If you are
> aware of another test suite (perhaps used in conjunction with the
> originally contributed Base64Decode() impl), please let me know.

OK to me.

> 
> > Base64Decode parameter Source is indicate as OPTIONAL. Although it is OK to be NULL, and return DestinationSize to be zero with
> success status to indicate no decode occurred . I don't know if it is meaningful to report the result as that. In my opinion, NULL Source is
> an invalid input. Just my opinion, if the maintainer is OK with that, I am OK too.

Yes. Source is NULL, SourceSize is zero. Code does nothing. I am fine with it. 

I have no other comments for the code logic. Reviewed-by: Liming Gao <liming.gao@intel.com>

> 
> I think that it is helpful. It is similar to the standard C function
> free() accepting NULL (it is a no-op). Edk2's FreePool() doesn't accept
> NULL, and that is inconvenient sometimes.
> 
> Consider the following use case. There is some code that optionally
> receives data and then decodes it from base64. By default that data is
> represented as a NULL pointer, with size 0. If Base64Decode() does not
> accept NULL input with size 0, then the code in question has to
> implement a separate check for that. But it keeps the call site simpler
> if Base64Decode() explicitly specifies "do nothing", for those parameter
> values.
> 
> Consider CopyMem() and CompareMem() too. In practice, if you pass
> Length=0, they will work just fine, even in case the buffers are NULL.
> These functions will exhibit the expected / intuitive behavior -- they
> will not dereference NULL. However, because their interface contracts
> don't spell out this case as valid (i.e. OPTIONAL), *technically* such
> calls are not permitted, and so every CopyMem() and CompareMem() call
> site has to be surrounded with if's, if NULL buffers with size 0 are
> otherwise possible there. That's a waste -- it duplicates checks and it
> makes the call site inconvenient.
> 
> NULL input with nonzero size is caught and rejected (as described in the
> function's specification). "OPTIONAL" does not mean that the buffer can
> be NULL under *any* circumstances -- it means that it can be NULL under
> *some* (specified) circumstances.
> 
> >
> > Some other minor comment about the block scope variable blow.
> >
> >> -----Original Message-----
> >> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of
> >> Laszlo Ersek
> >> Sent: Tuesday, July 2, 2019 6:29 PM
> >> To: edk2-devel-groups-io <devel@edk2.groups.io>
> >> Cc: Gao, Liming <liming.gao@intel.com>; Marvin Häuser
> >> <mhaeuser@outlook.de>; Kinney, Michael D <michael.d.kinney@intel.com>;
> >> Philippe Mathieu-Daudé <philmd@redhat.com>; Gao, Zhichao
> >> <zhichao.gao@intel.com>
> >> Subject: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
> >>
> >> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> >> reference below.
> >>
> >> Implement Base64Decode() according to the specification added in the
> >> previous patch. The decoder scans the input buffer once, it has no inner
> >> loop(s), and it spills each output byte as soon as the output byte is complete.
> >>
> >> Cc: Liming Gao <liming.gao@intel.com>
> >> Cc: Marvin Häuser <mhaeuser@outlook.de>
> >> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> >> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> Cc: Zhichao Gao <zhichao.gao@intel.com>
> >> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> >> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-
> >> a7149760d19a@redhat.com
> >> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> >> ---
> >>  MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
> >>  1 file changed, 247 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/MdePkg/Library/BaseLib/String.c
> >> b/MdePkg/Library/BaseLib/String.c index f8397035c32a..6198ccbc9672
> >> 100644
> >> --- a/MdePkg/Library/BaseLib/String.c
> >> +++ b/MdePkg/Library/BaseLib/String.c
> >> @@ -1973,8 +1973,253 @@ Base64Decode (
> >>    IN OUT UINTN       *DestinationSize
> >>    )
> >>  {
> >> -  ASSERT (FALSE);
> >> -  return RETURN_INVALID_PARAMETER;
> >> +  BOOLEAN PaddingMode;
> >> +  UINTN   SixBitGroupsConsumed;
> >> +  UINT32  Accumulator;
> >> +  UINTN   OriginalDestinationSize;
> >> +  UINTN   SourceIndex;
> >> +
> >> +  if (DestinationSize == NULL) {
> >> +    return RETURN_INVALID_PARAMETER;
> >> +  }
> >> +
> >> +  //
> >> +  // Check Source array validity.
> >> +  //
> >> +  if (Source == NULL) {
> >> +    if (SourceSize > 0) {
> >> +      //
> >> +      // At least one CHAR8 element at NULL Source.
> >> +      //
> >> +      return RETURN_INVALID_PARAMETER;
> >> +    }
> >> +  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
> >> +    //
> >> +    // Non-NULL Source, but it wraps around.
> >> +    //
> >> +    return RETURN_INVALID_PARAMETER;
> >> +  }
> >> +
> >> +  //
> >> +  // Check Destination array validity.
> >> +  //
> >> +  if (Destination == NULL) {
> >> +    if (*DestinationSize > 0) {
> >> +      //
> >> +      // At least one UINT8 element at NULL Destination.
> >> +      //
> >> +      return RETURN_INVALID_PARAMETER;
> >> +    }
> >> +  } else if (*DestinationSize > MAX_ADDRESS - (UINTN)Destination) {
> >> +    //
> >> +    // Non-NULL Destination, but it wraps around.
> >> +    //
> >> +    return RETURN_INVALID_PARAMETER;
> >> +  }
> >> +
> >> +  //
> >> +  // Check for overlap.
> >> +  //
> >> +  if (Source != NULL && Destination != NULL) {
> >> +    //
> >> +    // Both arrays have been provided, and we know from earlier that each
> >> array
> >> +    // is valid in itself.
> >> +    //
> >> +    if ((UINTN)Source + SourceSize <= (UINTN)Destination) {
> >> +      //
> >> +      // Source array precedes Destination array, OK.
> >> +      //
> >> +    } else if ((UINTN)Destination + *DestinationSize <= (UINTN)Source) {
> >> +      //
> >> +      // Destination array precedes Source array, OK.
> >> +      //
> >> +    } else {
> >> +      //
> >> +      // Overlap.
> >> +      //
> >> +      return RETURN_INVALID_PARAMETER;
> >> +    }
> >> +  }
> >> +
> >> +  //
> >> +  // Decoding loop setup.
> >> +  //
> >> +  PaddingMode             = FALSE;
> >> +  SixBitGroupsConsumed    = 0;
> >> +  Accumulator             = 0;
> >> +  OriginalDestinationSize = *DestinationSize;
> >> +  *DestinationSize        = 0;
> >> +
> >> +  //
> >> +  // Decoding loop.
> >> +  //
> >> +  for (SourceIndex = 0; SourceIndex < SourceSize; SourceIndex++) {
> >> +    CHAR8  SourceChar;
> >> +    UINT32 Base64Value;
> >> +    UINT8  DestinationOctet;
> >
> > The local variable define in a block scope. For CSS_2_1 Section 5.4.1.1 "Block (local) Scope". It is strongly discouraged. Maybe we should
> move them to the beginning of the function.
> 
> Yes, I can do that. Thanks for pointing out the exact location in the
> CCS. Before I post v2 though, I'd like to hear back from the MdePkg
> maintainers as well.
> 
> Thank you!
> Laszlo
> 
> >> +
> >> +    SourceChar = Source[SourceIndex];
> >> +
> >> +    //
> >> +    // Whitespace is ignored at all positions (regardless of padding mode).
> >> +    //
> >> +    if (SourceChar == '\t' || SourceChar == '\n' || SourceChar == '\v' ||
> >> +        SourceChar == '\f' || SourceChar == '\r' || SourceChar == ' ') {
> >> +      continue;
> >> +    }
> >> +
> >
> 
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-02 10:28 ` [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode() Laszlo Ersek
  2019-07-12  2:31   ` [edk2-devel] " Gao, Zhichao
@ 2019-07-15 18:44   ` mhaeuser
  2019-07-16  0:45     ` Laszlo Ersek
  2019-07-16 10:05   ` Philippe Mathieu-Daudé
  2 siblings, 1 reply; 26+ messages in thread
From: mhaeuser @ 2019-07-15 18:44 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com
  Cc: Liming Gao, Michael D Kinney, Philippe Mathieu-Daudé,
	Zhichao Gao

Good day,

Please excuse my late reply and thank you very much for your reimplementation effort.

I feel like my rushed message mentioning 'MAX_ADDRESS' was misleading a little - the point with that was a potential index overflow (I may actually have meant 'MAX_UINTN', I am not sure about the details anymore) in the original code, I personally see a lot of sense in *not* checking whether the buffer wraps around (similarly the overlap condition). For one consistency with similar code, where no such checks exist, and then a sanity-trust in the caller (which, as this is a library function, is interior as opposed to an external protocol caller which should naturally be trusted less).

Generally, I am rather confused about the edk2 trust model for static calls. A bunch of libraries verify input parameter sanity via ASSERTs, another bunch by runtime checks and appropriate return statuses. Is there any kind of policy I am unaware of?

Regards,
Marvin

> -----Original Message-----
> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Laszlo
> Ersek
> Sent: Tuesday, July 2, 2019 12:29 PM
> To: edk2-devel-groups-io <devel@edk2.groups.io>
> Cc: Liming Gao <liming.gao@intel.com>; Marvin Häuser
> <mhaeuser@outlook.de>; Michael D Kinney <michael.d.kinney@intel.com>;
> Philippe Mathieu-Daudé <philmd@redhat.com>; Zhichao Gao
> <zhichao.gao@intel.com>
> Subject: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
> 
> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> reference below.
> 
> Implement Base64Decode() according to the specification added in the
> previous patch. The decoder scans the input buffer once, it has no inner
> loop(s), and it spills each output byte as soon as the output byte is complete.
> 
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-
> a7149760d19a@redhat.com
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
>  1 file changed, 247 insertions(+), 2 deletions(-)
> 
> diff --git a/MdePkg/Library/BaseLib/String.c
> b/MdePkg/Library/BaseLib/String.c index f8397035c32a..6198ccbc9672
> 100644
> --- a/MdePkg/Library/BaseLib/String.c
> +++ b/MdePkg/Library/BaseLib/String.c
> @@ -1973,8 +1973,253 @@ Base64Decode (
>    IN OUT UINTN       *DestinationSize
>    )
>  {
> -  ASSERT (FALSE);
> -  return RETURN_INVALID_PARAMETER;
> +  BOOLEAN PaddingMode;
> +  UINTN   SixBitGroupsConsumed;
> +  UINT32  Accumulator;
> +  UINTN   OriginalDestinationSize;
> +  UINTN   SourceIndex;
> +
> +  if (DestinationSize == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check Source array validity.
> +  //
> +  if (Source == NULL) {
> +    if (SourceSize > 0) {
> +      //
> +      // At least one CHAR8 element at NULL Source.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
> +    //
> +    // Non-NULL Source, but it wraps around.
> +    //
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check Destination array validity.
> +  //
> +  if (Destination == NULL) {
> +    if (*DestinationSize > 0) {
> +      //
> +      // At least one UINT8 element at NULL Destination.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  } else if (*DestinationSize > MAX_ADDRESS - (UINTN)Destination) {
> +    //
> +    // Non-NULL Destination, but it wraps around.
> +    //
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check for overlap.
> +  //
> +  if (Source != NULL && Destination != NULL) {
> +    //
> +    // Both arrays have been provided, and we know from earlier that each
> array
> +    // is valid in itself.
> +    //
> +    if ((UINTN)Source + SourceSize <= (UINTN)Destination) {
> +      //
> +      // Source array precedes Destination array, OK.
> +      //
> +    } else if ((UINTN)Destination + *DestinationSize <= (UINTN)Source) {
> +      //
> +      // Destination array precedes Source array, OK.
> +      //
> +    } else {
> +      //
> +      // Overlap.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  }
> +
> +  //
> +  // Decoding loop setup.
> +  //
> +  PaddingMode             = FALSE;
> +  SixBitGroupsConsumed    = 0;
> +  Accumulator             = 0;
> +  OriginalDestinationSize = *DestinationSize;
> +  *DestinationSize        = 0;
> +
> +  //
> +  // Decoding loop.
> +  //
> +  for (SourceIndex = 0; SourceIndex < SourceSize; SourceIndex++) {
> +    CHAR8  SourceChar;
> +    UINT32 Base64Value;
> +    UINT8  DestinationOctet;
> +
> +    SourceChar = Source[SourceIndex];
> +
> +    //
> +    // Whitespace is ignored at all positions (regardless of padding mode).
> +    //
> +    if (SourceChar == '\t' || SourceChar == '\n' || SourceChar == '\v' ||
> +        SourceChar == '\f' || SourceChar == '\r' || SourceChar == ' ') {
> +      continue;
> +    }
> +
> +    //
> +    // If we're in padding mode, accept another padding character, as long as
> +    // that padding character completes the quantum. This completes case (2)
> +    // from RFC4648, Chapter 4. "Base 64 Encoding":
> +    //
> +    // (2) The final quantum of encoding input is exactly 8 bits; here, the
> +    //     final unit of encoded output will be two characters followed by two
> +    //     "=" padding characters.
> +    //
> +    if (PaddingMode) {
> +      if (SourceChar == '=' && SixBitGroupsConsumed == 3) {
> +        SixBitGroupsConsumed = 0;
> +        continue;
> +      }
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +
> +    //
> +    // When not in padding mode, decode Base64Value based on RFC4648,
> "Table 1:
> +    // The Base 64 Alphabet".
> +    //
> +    if ('A' <= SourceChar && SourceChar <= 'Z') {
> +      Base64Value = SourceChar - 'A';
> +    } else if ('a' <= SourceChar && SourceChar <= 'z') {
> +      Base64Value = 26 + (SourceChar - 'a');
> +    } else if ('0' <= SourceChar && SourceChar <= '9') {
> +      Base64Value = 52 + (SourceChar - '0');
> +    } else if (SourceChar == '+') {
> +      Base64Value = 62;
> +    } else if (SourceChar == '/') {
> +      Base64Value = 63;
> +    } else if (SourceChar == '=') {
> +      //
> +      // Enter padding mode.
> +      //
> +      PaddingMode = TRUE;
> +
> +      if (SixBitGroupsConsumed == 2) {
> +        //
> +        // If we have consumed two 6-bit groups from the current quantum
> before
> +        // encountering the first padding character, then this is case (2) from
> +        // RFC4648, Chapter 4. "Base 64 Encoding". Bump
> SixBitGroupsConsumed,
> +        // and we'll enforce another padding character.
> +        //
> +        SixBitGroupsConsumed = 3;
> +      } else if (SixBitGroupsConsumed == 3) {
> +        //
> +        // If we have consumed three 6-bit groups from the current quantum
> +        // before encountering the first padding character, then this is case
> +        // (3) from RFC4648, Chapter 4. "Base 64 Encoding". The quantum is
> now
> +        // complete.
> +        //
> +        SixBitGroupsConsumed = 0;
> +      } else {
> +        //
> +        // Padding characters are not allowed at the first two positions of a
> +        // quantum.
> +        //
> +        return RETURN_INVALID_PARAMETER;
> +      }
> +
> +      //
> +      // Wherever in a quantum we enter padding mode, we enforce the
> padding
> +      // bits pending in the accumulator -- from the last 6-bit group just
> +      // preceding the padding character -- to be zero. Refer to RFC4648,
> +      // Chapter 3.5. "Canonical Encoding".
> +      //
> +      if (Accumulator != 0) {
> +        return RETURN_INVALID_PARAMETER;
> +      }
> +
> +      //
> +      // Advance to the next source character.
> +      //
> +      continue;
> +    } else {
> +      //
> +      // Other characters outside of the encoding alphabet are rejected.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +
> +    //
> +    // Feed the bits of the current 6-bit group of the quantum to the
> +    // accumulator.
> +    //
> +    Accumulator = (Accumulator << 6) | Base64Value;
> +    SixBitGroupsConsumed++;
> +    switch (SixBitGroupsConsumed) {
> +    case 1:
> +      //
> +      // No octet to spill after consuming the first 6-bit group of the
> +      // quantum; advance to the next source character.
> +      //
> +      continue;
> +    case 2:
> +      //
> +      // 12 bits accumulated (6 pending + 6 new); prepare for spilling an
> +      // octet. 4 bits remain pending.
> +      //
> +      DestinationOctet = (UINT8)(Accumulator >> 4);
> +      Accumulator &= 0xF;
> +      break;
> +    case 3:
> +      //
> +      // 10 bits accumulated (4 pending + 6 new); prepare for spilling an
> +      // octet. 2 bits remain pending.
> +      //
> +      DestinationOctet = (UINT8)(Accumulator >> 2);
> +      Accumulator &= 0x3;
> +      break;
> +    default:
> +      ASSERT (SixBitGroupsConsumed == 4);
> +      //
> +      // 8 bits accumulated (2 pending + 6 new); prepare for spilling an octet.
> +      // The quantum is complete, 0 bits remain pending.
> +      //
> +      DestinationOctet = (UINT8)Accumulator;
> +      Accumulator = 0;
> +      SixBitGroupsConsumed = 0;
> +      break;
> +    }
> +
> +    //
> +    // Store the decoded octet if there's room left. Increment
> +    // (*DestinationSize) unconditionally.
> +    //
> +    if (*DestinationSize < OriginalDestinationSize) {
> +      ASSERT (Destination != NULL);
> +      Destination[*DestinationSize] = DestinationOctet;
> +    }
> +    (*DestinationSize)++;
> +
> +    //
> +    // Advance to the next source character.
> +    //
> +  }
> +
> +  //
> +  // If Source terminates mid-quantum, then Source is invalid.
> +  //
> +  if (SixBitGroupsConsumed != 0) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Done.
> +  //
> +  if (*DestinationSize <= OriginalDestinationSize) {
> +    return RETURN_SUCCESS;
> +  }
> +  return RETURN_BUFFER_TOO_SMALL;
>  }
> 
>  /**
> --
> 2.19.1.3.g30247aa5d201
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-15 15:22       ` Liming Gao
@ 2019-07-15 21:56         ` Laszlo Ersek
  2019-07-16  1:18           ` Gao, Zhichao
  0 siblings, 1 reply; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-15 21:56 UTC (permalink / raw)
  To: Gao, Liming, Gao, Zhichao
  Cc: devel@edk2.groups.io, Kinney, Michael D, Marvin Häuser,
	Philippe Mathieu-Daudé

On 07/15/19 17:22, Gao, Liming wrote:
> Laszlo:
> 
>> -----Original Message-----
>> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Laszlo Ersek
>> Sent: Saturday, July 13, 2019 3:31 AM
>> To: Gao, Zhichao <zhichao.gao@intel.com>; devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; Kinney, Michael D
>> <michael.d.kinney@intel.com>
>> Cc: Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé <philmd@redhat.com>
>> Subject: Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
>>
>> On 07/12/19 04:31, Gao, Zhichao wrote:
>>> Sorry for late respond.
>>> The whole code is OK for me. And I write a tiny test for it without the memory address check. See
>> https://github.com/ZhichaoGao/edk2/commit/615356ba32d3147957d215debd844e7709f06849 . It is tested in Emulator environment. If
>> it is OK, I think you can take my Tested-by for this patch. If there are some missing, please let me know.
>>
>> Thanks for writing that test app. It seems to have pretty good coverage.
>> I like that it covers the exit points systematically.
>>
>> Mike, Liming: I intend to pick up Zhichao's T-b, from above. If you are
>> aware of another test suite (perhaps used in conjunction with the
>> originally contributed Base64Decode() impl), please let me know.
> 
> OK to me.
> 
>>
>>> Base64Decode parameter Source is indicate as OPTIONAL. Although it is OK to be NULL, and return DestinationSize to be zero with
>> success status to indicate no decode occurred . I don't know if it is meaningful to report the result as that. In my opinion, NULL Source is
>> an invalid input. Just my opinion, if the maintainer is OK with that, I am OK too.
> 
> Yes. Source is NULL, SourceSize is zero. Code does nothing. I am fine with it. 
> 
> I have no other comments for the code logic. Reviewed-by: Liming Gao <liming.gao@intel.com>

Thank you!

A few questions:


(1) Liming, does your R-b apply to the first patch in the series as
well? (That patch too is for MdePkg.)

Here are two links to patch#1 in the series, for your convenience:

http://mid.mail-archive.com/20190702102836.27589-2-lersek@redhat.com
https://edk2.groups.io/g/devel/message/43162


(2) Zhichao, you made a good remark about block-scoped variables, and
the CCS. You also gave your Tested-by for the present version. So, we
have the following two options:

(2a)

- I push the present patch as-is, including your Tested-by. (Together
with Liming's R-b -- he is OK with the present version.)

- Subsequently, I post a separate (incremental) patch for moving the
variables from the inner block scope to the outer function block scope.
This incremental patch needs another MdePkg maintainer review, but it
doesn't need additional testing from you.

(2b) Alternatively:

- I post v2 of this series, which incorporates the movement of the
variables from the inner block scope to the outermost function block scope.

- I keep Liming's R-b. (The variable definition movement is
straight-forward and I don't think it invalidates Liming's R-b).

- I drop your Tested-by, because the variable movement technically
changes code that your testing exercised, and a Tested-by tag should
only be applied to the *exact* code that was exercised by the testing.

- Therefore, for preserving your testing work in the git history, you
would have to redo the testing, please.


Zhichao, do you prefer (2a) or (2b)?

Personally, I prefer (2a), because (2a) is safe -- importantly, the v1
code is *valid* C --, and it is less work for the community (for you,
Liming and myself, together).

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling
  2019-07-02 10:28 ` [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling Laszlo Ersek
@ 2019-07-15 21:58   ` Laszlo Ersek
  2019-07-16  8:36     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-15 21:58 UTC (permalink / raw)
  To: Jordan Justen
  Cc: edk2-devel-groups-io, Ard Biesheuvel, Philippe Mathieu-Daudé

Jordan, can you please ACK this one patch in the series?

Thanks!
Laszlo

On 07/02/19 12:28, Laszlo Ersek wrote:
> Base64Decode() now guarantees that DestinationSize is larger on output
> than it was on input if RETURN_BUFFER_TOO_SMALL is returned. Clean up the
> retval handling for the first Base64Decode() call in EnrollDefaultKeys,
> which used to work around the ambiguity in the previous Base64Decode()
> interface contract.
> 
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
> index f45cb799f726..302b80d97720 100644
> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
> @@ -154,14 +154,8 @@ GetPkKek1 (
>    Status = Base64Decode (Base64Cert, Base64CertLen, NULL, &DecodedCertSize);
>    switch (Status) {
>    case EFI_BUFFER_TOO_SMALL:
> -    if (DecodedCertSize > 0) {
> -      break;
> -    }
> -    //
> -    // Fall through: the above Base64Decode() call is ill-specified in BaseLib
> -    // if Source decodes to zero bytes (for example if it consists of ignored
> -    // whitespace only).
> -    //
> +    ASSERT (DecodedCertSize > 0);
> +    break;
>    case EFI_SUCCESS:
>      AsciiPrint ("error: empty certificate after app prefix %g\n",
>        &gOvmfPkKek1AppPrefixGuid);
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-15 18:44   ` mhaeuser
@ 2019-07-16  0:45     ` Laszlo Ersek
  0 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16  0:45 UTC (permalink / raw)
  To: Marvin Häuser, devel@edk2.groups.io
  Cc: Liming Gao, Michael D Kinney, Philippe Mathieu-Daudé,
	Zhichao Gao

Hi Marvin,

On 07/15/19 20:44, Marvin Häuser wrote:
> I feel like my rushed message mentioning 'MAX_ADDRESS' was misleading
> a little - the point with that was a potential index overflow (I may
> actually have meant 'MAX_UINTN', I am not sure about the details
> anymore) in the original code, I personally see a lot of sense in
> *not* checking whether the buffer wraps around (similarly the overlap
> condition). For one consistency with similar code, where no such
> checks exist, and then a sanity-trust in the caller (which, as this is
> a library function, is interior as opposed to an external protocol
> caller which should naturally be trusted less).
>
> Generally, I am rather confused about the edk2 trust model for static
> calls. A bunch of libraries verify input parameter sanity via ASSERTs,
> another bunch by runtime checks and appropriate return statuses. Is
> there any kind of policy I am unaware of?

Good points, and you are right, the landscape is not consistent.

I think the edk2 practice can be characterized as follows (again, the
picture is not uniform / consistent across the codebase):

(1) Functions are supposed to have detailed interface contracts, but
    many don't. In some cases, cutting corners appears at least
    moderately defensible, because a function can be really simple, and
    writing documentation could be the larger part of the effort. I
    still prefer if we document all new functions painstakingly.

(2) Caller responsibilities are frequently checked by callees. The
    checks vary between ASSERT()s and explicit conditions / return
    values. The 2nd kind is better, of course, and that is actually
    exemplified by the UEFI spec.

    Namely, a large proportion of EFI_INVALID_PARAMETER return codes
    correspond to cases when the callee catches the caller breaking the
    interface contract. See for example EFI_BOOT_SERVICES.CreateEvent().

    In some other cases, similar efforts / spec requirements look
    dubious (see EFI_BOOT_SERVICES.FreePool() -- "Buffer was invalid").
    If the callee gets a pointer to freed storage, it can't even
    *evaluate* that pointer without depending on undefined behavior. So,
    in theory, there's no way to enforce the contract, beyond trusting
    the caller, and so there can be no *substitute* for a detailed
    contract; see (1).

    I think the approach ("watch your caller") is generally acceptable
    still, if there is not a large runtime cost, because the environment
    is extremely unforgiving, and because the firmware can, in practice,
    recover from *some* undefined behavior this way, even.

(3) Genuine failure conditions are occasionally checked with
    ASSERT(). This should never be done.


Now, considering "wrapped arrays" and such -- obviously such a thing is
not even an object in C (no valid memory allocation would ever produce
it), so normally I would consider checks against it pointless.

However, in the present case, there are three arguments for including
the MAX_ADDRESS checks:

- The original code included similar MAX_ADDRESS checks, and I didn't
  want to "weaken" the implementation in any way. (I simply didn't want
  to defend such choices -- so I didn't make them.)

- The conditions are not costly, and as long as the buffer pointers are
  into valid storage, the conditions do catch -- without invoking
  undefined behavior -- *sizes* that would cause wrap-around. This is in
  line with (2).

- This is MdePkg/BaseLib, so general distrust (with negligible runtime
  cost) cannot hurt. Widely used library -- yes, not a protocol, but
  still linked (statically) into a bunch of 3rd party code --, and
  (again) an unforgiving environment.

I introduced the overlap check myself (IIRC). That was because:

- There is no language-level reason why decoding back into the same
  buffer couldn't work -- it's just that this implementation doesn't aim
  to support that. Hence the corresponding natural language restriction
  in the interface contract. (In edk2, we don't have the "restrict"
  keyword, from C99.) Otherwise a programmer might cleverly say, "aha,
  base64 decoding never *inflates* data, so I'll just decode "in-place".

- And then, although it's a caller responsibility, catch the overlap
  explicitly with RETURN_INVALID_PARAMETER, in line with the above: the
  checks are cheap, and BaseLib is quite central.

Now I'm not trying to present any of this as "policy" -- just my 2
cents. I hope it makes sense.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-15 21:56         ` Laszlo Ersek
@ 2019-07-16  1:18           ` Gao, Zhichao
  2019-07-16 10:48             ` Laszlo Ersek
  0 siblings, 1 reply; 26+ messages in thread
From: Gao, Zhichao @ 2019-07-16  1:18 UTC (permalink / raw)
  To: Laszlo Ersek, Gao, Liming
  Cc: devel@edk2.groups.io, Kinney, Michael D, Marvin Häuser,
	Philippe Mathieu-Daudé



> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Tuesday, July 16, 2019 5:56 AM
> To: Gao, Liming <liming.gao@intel.com>; Gao, Zhichao
> <zhichao.gao@intel.com>
> Cc: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>;
> Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé
> <philmd@redhat.com>
> Subject: Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite
> Base64Decode()
> 
> On 07/15/19 17:22, Gao, Liming wrote:
> > Laszlo:
> >
> >> -----Original Message-----
> >> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of
> >> Laszlo Ersek
> >> Sent: Saturday, July 13, 2019 3:31 AM
> >> To: Gao, Zhichao <zhichao.gao@intel.com>; devel@edk2.groups.io; Gao,
> >> Liming <liming.gao@intel.com>; Kinney, Michael D
> >> <michael.d.kinney@intel.com>
> >> Cc: Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé
> >> <philmd@redhat.com>
> >> Subject: Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite
> >> Base64Decode()
> >>
> >> On 07/12/19 04:31, Gao, Zhichao wrote:
> >>> Sorry for late respond.
> >>> The whole code is OK for me. And I write a tiny test for it without
> >>> the memory address check. See
> >>
> https://github.com/ZhichaoGao/edk2/commit/615356ba32d3147957d215deb
> d8
> >> 44e7709f06849 . It is tested in Emulator environment. If it is OK, I think you
> can take my Tested-by for this patch. If there are some missing, please let
> me know.
> >>
> >> Thanks for writing that test app. It seems to have pretty good coverage.
> >> I like that it covers the exit points systematically.
> >>
> >> Mike, Liming: I intend to pick up Zhichao's T-b, from above. If you
> >> are aware of another test suite (perhaps used in conjunction with the
> >> originally contributed Base64Decode() impl), please let me know.
> >
> > OK to me.
> >
> >>
> >>> Base64Decode parameter Source is indicate as OPTIONAL. Although it
> >>> is OK to be NULL, and return DestinationSize to be zero with
> >> success status to indicate no decode occurred . I don't know if it is
> >> meaningful to report the result as that. In my opinion, NULL Source is an
> invalid input. Just my opinion, if the maintainer is OK with that, I am OK too.
> >
> > Yes. Source is NULL, SourceSize is zero. Code does nothing. I am fine with it.
> >
> > I have no other comments for the code logic. Reviewed-by: Liming Gao
> > <liming.gao@intel.com>
> 
> Thank you!
> 
> A few questions:
> 
> 
> (1) Liming, does your R-b apply to the first patch in the series as well? (That
> patch too is for MdePkg.)
> 
> Here are two links to patch#1 in the series, for your convenience:
> 
> http://mid.mail-archive.com/20190702102836.27589-2-lersek@redhat.com
> https://edk2.groups.io/g/devel/message/43162
> 
> 
> (2) Zhichao, you made a good remark about block-scoped variables, and the
> CCS. You also gave your Tested-by for the present version. So, we have the
> following two options:
> 
> (2a)
> 
> - I push the present patch as-is, including your Tested-by. (Together with
> Liming's R-b -- he is OK with the present version.)
> 
> - Subsequently, I post a separate (incremental) patch for moving the
> variables from the inner block scope to the outer function block scope.
> This incremental patch needs another MdePkg maintainer review, but it
> doesn't need additional testing from you.
> 
> (2b) Alternatively:
> 
> - I post v2 of this series, which incorporates the movement of the variables
> from the inner block scope to the outermost function block scope.
> 
> - I keep Liming's R-b. (The variable definition movement is straight-forward
> and I don't think it invalidates Liming's R-b).
> 
> - I drop your Tested-by, because the variable movement technically changes
> code that your testing exercised, and a Tested-by tag should only be applied
> to the *exact* code that was exercised by the testing.
> 
> - Therefore, for preserving your testing work in the git history, you would
> have to redo the testing, please.
> 
> 
> Zhichao, do you prefer (2a) or (2b)?
> 
> Personally, I prefer (2a), because (2a) is safe -- importantly, the v1 code is
> *valid* C --, and it is less work for the community (for you, Liming and myself,
> together).

(2a) makes less work and makes a good history, (2b) makes a clean commit message. I have no idea on which is better.
For me, I prefer (2a) too because I don't think clean commit message is such important.

It's up to you to decide. If you choose (2b) I am fine to do a test again and provide a T-B.

Thanks,
Zhichao

> 
> Thanks!
> Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling
  2019-07-15 21:58   ` [edk2-devel] " Laszlo Ersek
@ 2019-07-16  8:36     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-07-16  8:36 UTC (permalink / raw)
  To: devel, lersek, Jordan Justen; +Cc: Ard Biesheuvel

On 7/15/19 11:58 PM, Laszlo Ersek wrote:
> Jordan, can you please ACK this one patch in the series?
> 
> Thanks!
> Laszlo
> 
> On 07/02/19 12:28, Laszlo Ersek wrote:
>> Base64Decode() now guarantees that DestinationSize is larger on output
>> than it was on input if RETURN_BUFFER_TOO_SMALL is returned. Clean up the
>> retval handling for the first Base64Decode() call in EnrollDefaultKeys,
>> which used to work around the ambiguity in the previous Base64Decode()
>> interface contract.
>>
>> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>  OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c | 10 ++--------
>>  1 file changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>> index f45cb799f726..302b80d97720 100644
>> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
>> @@ -154,14 +154,8 @@ GetPkKek1 (
>>    Status = Base64Decode (Base64Cert, Base64CertLen, NULL, &DecodedCertSize);
>>    switch (Status) {
>>    case EFI_BUFFER_TOO_SMALL:
>> -    if (DecodedCertSize > 0) {
>> -      break;
>> -    }
>> -    //
>> -    // Fall through: the above Base64Decode() call is ill-specified in BaseLib
>> -    // if Source decodes to zero bytes (for example if it consists of ignored
>> -    // whitespace only).
>> -    //
>> +    ASSERT (DecodedCertSize > 0);
>> +    break;
>>    case EFI_SUCCESS:
>>      AsciiPrint ("error: empty certificate after app prefix %g\n",
>>        &gOvmfPkKek1AppPrefixGuid);
>>

Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-02 10:28 ` [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl Laszlo Ersek
@ 2019-07-16  8:38   ` Philippe Mathieu-Daudé
  2019-07-16  9:41     ` Philippe Mathieu-Daudé
  2019-07-16 10:49   ` Laszlo Ersek
  1 sibling, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-07-16  8:38 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

On 7/2/19 12:28 PM, Laszlo Ersek wrote:
> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> reference below.
> 
> As first step, redo the interface contract, and replace the current
> implementation with a stub that asserts FALSE, then fails.
> 
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>  2 files changed, 168 insertions(+), 216 deletions(-)
> 
> diff --git a/MdePkg/Include/Library/BaseLib.h b/MdePkg/Include/Library/BaseLib.h
> index ebd7dd274cf4..5ef03e24edb1 100644
> --- a/MdePkg/Include/Library/BaseLib.h
> +++ b/MdePkg/Include/Library/BaseLib.h
> @@ -2785,31 +2785,94 @@ Base64Encode (
>    );
>  
>  /**
> -  Convert Base64 ascii string to binary data based on RFC4648.
> +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> +  RFC4648.
>  
> -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
>  
> -  @param Source          Input ASCII characters
> -  @param SourceLength    Number of ASCII characters
> -  @param Destination     Pointer to output buffer
> -  @param DestinationSize Caller is responsible for passing in buffer of at least DestinationSize.
> -                         Set 0 to get the size needed. Set to bytes stored on return.
> +  Whitespace is ignored at all positions:
> +  - 0x09 ('\t') horizontal tab
> +  - 0x0A ('\n') new line
> +  - 0x0B ('\v') vertical tab
> +  - 0x0C ('\f') form feed
> +  - 0x0D ('\r') carriage return
> +  - 0x20 (' ')  space
>  
> -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> +  and enforced at the end of the Base64 ASCII encoded data, and only there.
>  
> - **/
> +  Other characters outside of the encoding alphabet cause the function to
> +  reject the Base64 ASCII encoded data.
> +
> +  @param[in] Source               Array of CHAR8 elements containing the Base64
> +                                  ASCII encoding. May be NULL if SourceSize is
> +                                  zero.
> +
> +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> +
> +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> +                                  8-bit binary representation. Allocated by the
> +                                  caller. May be NULL if DestinationSize is
> +                                  zero on input. If NULL, decoding is
> +                                  performed, but the 8-bit binary
> +                                  representation is not stored. If non-NULL and
> +                                  the function returns an error, the contents
> +                                  of Destination are indeterminate.
> +
> +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> +                                  the caller allocated for Destination. On
> +                                  output, if the function returns
> +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> +                                  the number of UINT8 elements that are
> +                                  required for decoding the Base64 ASCII
> +                                  representation. If the function returns a
> +                                  value different from both RETURN_SUCCESS and
> +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> +                                  is indeterminate on output.
> +
> +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> +                                    been decoded to on-output DestinationSize
> +                                    UINT8 elements at Destination. Note that
> +                                    RETURN_SUCCESS covers the case when
> +                                    DestinationSize is zero on input, and
> +                                    Source decodes to zero bytes (due to
> +                                    containing at most ignored whitespace).
> +
> +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> +                                    large enough for decoding SourceSize CHAR8
> +                                    elements at Source. The required number of
> +                                    UINT8 elements has been stored to
> +                                    DestinationSize.
> +
> +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> +                                    not zero on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> +                                    SourceSize) would wrap around MAX_ADDRESS.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> +                                    DestinationSize) would wrap around
> +                                    MAX_ADDRESS, as specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> +                                    and CHAR8[SourceSize] at Source overlaps
> +                                    UINT8[DestinationSize] at Destination, as
> +                                    specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> +                                    Source.
> +**/
>  RETURN_STATUS
>  EFIAPI
>  Base64Decode (
> -  IN  CONST CHAR8  *Source,
> -  IN        UINTN   SourceLength,
> -  OUT       UINT8  *Destination  OPTIONAL,
> -  IN OUT    UINTN  *DestinationSize
> +  IN     CONST CHAR8 *Source          OPTIONAL,
> +  IN     UINTN       SourceSize,
> +  OUT    UINT8       *Destination     OPTIONAL,
> +  IN OUT UINTN       *DestinationSize
>    );
>  
>  /**
> diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
> index 32e189791cb8..f8397035c32a 100644
> --- a/MdePkg/Library/BaseLib/String.c
> +++ b/MdePkg/Library/BaseLib/String.c
> @@ -1757,45 +1757,10 @@ AsciiStrToUnicodeStr (
>  
>  #endif
>  
> -//
> -// The basis for Base64 encoding is RFC 4686 https://tools.ietf.org/html/rfc4648
> -//
> -// RFC 4686 has a number of MAY and SHOULD cases.  This implementation chooses
> -// the more restrictive versions for security concerns (see RFC 4686 section 3.3).
> -//
> -// A invalid character, if encountered during the decode operation, causes the data
> -// to be rejected. In addition, the '=' padding character is only allowed at the end
> -// of the Base64 encoded string.
> -//
> -#define BAD_V  99
> -
>  STATIC CHAR8 EncodingTable[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>                                  "abcdefghijklmnopqrstuvwxyz"
>                                  "0123456789+/";
>  
> -STATIC UINT8 DecodingTable[] = {
> -  //
> -  // Valid characters ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
> -  // Also, set '=' as a zero for decoding
> -  // 0  ,            1,           2,           3,            4,           5,            6,           7,           8,            9,           a,            b,            c,           d,            e,            f
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //   0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  10
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,     62,  BAD_V,  BAD_V,  BAD_V,     63,   //  20
> -     52,     53,     54,     55,     56,     57,     58,     59,     60,     61,  BAD_V,  BAD_V,  BAD_V,      0,  BAD_V,  BAD_V,   //  30
> -  BAD_V,      0,      1,      2,      3,      4,      5,      6,      7,      8,      9,     10,     11,     12,     13,     14,   //  40
> -     15,     16,     17,     18,     19,     20,     21,     22,     23,     24,     25,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  50
> -  BAD_V,     26,     27,     28,     29,     30,     31,     32,     33,     34,     35,     36,     37,     38,     39,     40,   //  60
> -     41,     42,     43,     44,     45,     46,     47,     48,     49,     50,     51,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  70
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  80
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  90
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  a0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  b0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  c0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  d0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  d0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V    //  f0
> -};
> -
>  /**
>    Convert binary data to a Base64 encoded ascii string based on RFC4648.
>  
> @@ -1918,174 +1883,98 @@ Base64Encode (
>  }
>  
>  /**
> -  Convert Base64 ascii string to binary data based on RFC4648.
> -
> -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> -
> -  @param Source            Input ASCII characters
> -  @param SourceLength      Number of ASCII characters
> -  @param Destination       Pointer to output buffer
> -  @param DestinationSize   Caller is responsible for passing in buffer of at least DestinationSize.
> -                           Set 0 to get the size needed. Set to bytes stored on return.
> -
> -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> - **/
> +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> +  RFC4648.
> +
> +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
> +
> +  Whitespace is ignored at all positions:
> +  - 0x09 ('\t') horizontal tab
> +  - 0x0A ('\n') new line
> +  - 0x0B ('\v') vertical tab
> +  - 0x0C ('\f') form feed
> +  - 0x0D ('\r') carriage return
> +  - 0x20 (' ')  space
> +
> +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> +  and enforced at the end of the Base64 ASCII encoded data, and only there.
> +
> +  Other characters outside of the encoding alphabet cause the function to
> +  reject the Base64 ASCII encoded data.
> +
> +  @param[in] Source               Array of CHAR8 elements containing the Base64
> +                                  ASCII encoding. May be NULL if SourceSize is
> +                                  zero.
> +
> +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> +
> +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> +                                  8-bit binary representation. Allocated by the
> +                                  caller. May be NULL if DestinationSize is
> +                                  zero on input. If NULL, decoding is
> +                                  performed, but the 8-bit binary
> +                                  representation is not stored. If non-NULL and
> +                                  the function returns an error, the contents
> +                                  of Destination are indeterminate.
> +
> +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> +                                  the caller allocated for Destination. On
> +                                  output, if the function returns
> +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> +                                  the number of UINT8 elements that are
> +                                  required for decoding the Base64 ASCII
> +                                  representation. If the function returns a
> +                                  value different from both RETURN_SUCCESS and
> +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> +                                  is indeterminate on output.
> +
> +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> +                                    been decoded to on-output DestinationSize
> +                                    UINT8 elements at Destination. Note that
> +                                    RETURN_SUCCESS covers the case when
> +                                    DestinationSize is zero on input, and
> +                                    Source decodes to zero bytes (due to
> +                                    containing at most ignored whitespace).
> +
> +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> +                                    large enough for decoding SourceSize CHAR8
> +                                    elements at Source. The required number of
> +                                    UINT8 elements has been stored to
> +                                    DestinationSize.
> +
> +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> +                                    not zero on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> +                                    SourceSize) would wrap around MAX_ADDRESS.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> +                                    DestinationSize) would wrap around
> +                                    MAX_ADDRESS, as specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> +                                    and CHAR8[SourceSize] at Source overlaps
> +                                    UINT8[DestinationSize] at Destination, as
> +                                    specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> +                                    Source.
> +**/
>  RETURN_STATUS
>  EFIAPI
>  Base64Decode (
> -  IN  CONST CHAR8  *Source,
> -  IN        UINTN   SourceLength,
> -  OUT       UINT8  *Destination   OPTIONAL,
> -  IN OUT    UINTN  *DestinationSize
> +  IN     CONST CHAR8 *Source          OPTIONAL,
> +  IN     UINTN       SourceSize,
> +  OUT    UINT8       *Destination     OPTIONAL,
> +  IN OUT UINTN       *DestinationSize
>    )
>  {
> -
> -  UINT32   Value;
> -  CHAR8    Chr;
> -  INTN     BufferSize;
> -  UINTN    SourceIndex;
> -  UINTN    DestinationIndex;
> -  UINTN    Index;
> -  UINTN    ActualSourceLength;
> -
> -  //
> -  // Check pointers are not NULL
> -  //
> -  if ((Source == NULL) || (DestinationSize == NULL)) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  //
> -  // Check if SourceLength or  DestinationSize is valid
> -  //
> -  if ((SourceLength >= (MAX_ADDRESS - (UINTN)Source)) || (*DestinationSize >= (MAX_ADDRESS - (UINTN)Destination))){
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  ActualSourceLength = 0;
> -  BufferSize = 0;
> -
> -  //
> -  // Determine the actual number of valid characters in the string.
> -  // All invalid characters except selected white space characters,
> -  // will cause the Base64 string to be rejected. White space to allow
> -  // properly formatted XML will be ignored.
> -  //
> -  // See section 3.3 of RFC 4648.
> -  //
> -  for (SourceIndex = 0; SourceIndex < SourceLength; SourceIndex++) {
> -
> -    //
> -    // '=' is part of the quantum
> -    //
> -    if (Source[SourceIndex] == '=') {
> -      ActualSourceLength++;
> -      BufferSize--;
> -
> -      //
> -      // Only two '=' characters can be valid.
> -      //
> -      if (BufferSize < -2) {
> -        return RETURN_INVALID_PARAMETER;
> -      }
> -    }
> -    else {
> -      Chr = Source[SourceIndex];
> -      if (BAD_V != DecodingTable[(UINT8) Chr]) {
> -
> -        //
> -        // The '=' characters are only valid at the end, so any
> -        // valid character after an '=', will be flagged as an error.
> -        //
> -        if (BufferSize < 0) {
> -          return RETURN_INVALID_PARAMETER;
> -        }
> -          ActualSourceLength++;
> -      }
> -        else {
> -
> -        //
> -        // The reset of the decoder will ignore all invalid characters allowed here.
> -        // Ignoring selected white space is useful.  In this case, the decoder will
> -        // ignore ' ', '\t', '\n', and '\r'.
> -        //
> -        if ((Chr != ' ') &&(Chr != '\t') &&(Chr != '\n') &&(Chr != '\r')) {
> -          return RETURN_INVALID_PARAMETER;
> -        }
> -      }
> -    }
> -  }
> -
> -  //
> -  // The Base64 character string must be a multiple of 4 character quantums.
> -  //
> -  if (ActualSourceLength % 4 != 0) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  BufferSize += ActualSourceLength / 4 * 3;
> -    if (BufferSize < 0) {
> -      return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  //
> -  // BufferSize is >= 0
> -  //
> -  if ((Destination == NULL) || (*DestinationSize < (UINTN) BufferSize)) {
> -    *DestinationSize = BufferSize;
> -    return RETURN_BUFFER_TOO_SMALL;
> -  }
> -
> -  //
> -  // If no decodable characters, return a size of zero. RFC 4686 test vector 1.
> -  //
> -  if (ActualSourceLength == 0) {
> -    *DestinationSize = 0;
> -    return RETURN_SUCCESS;
> -  }
> -
> -  //
> -  // Input data is verified to be a multiple of 4 valid charcters.  Process four
> -  // characters at a time. Uncounted (ie. invalid)  characters will be ignored.
> -  //
> -  for (SourceIndex = 0, DestinationIndex = 0; (SourceIndex < SourceLength) && (DestinationIndex < *DestinationSize); ) {
> -    Value = 0;
> -
> -    //
> -    // Get 24 bits of data from 4 input characters, each character representing 6 bits
> -    //
> -    for (Index = 0; Index < 4; Index++) {
> -      do {
> -      Chr = DecodingTable[(UINT8) Source[SourceIndex++]];
> -      } while (Chr == BAD_V);
> -      Value <<= 6;
> -      Value |= (UINT32)Chr;
> -    }
> -
> -    //
> -    // Store 3 bytes of binary data (24 bits)
> -    //
> -    *Destination++ = (UINT8) (Value >> 16);
> -    DestinationIndex++;
> -
> -    //
> -    // Due to the '=' special cases for the two bytes at the end,
> -    // we have to check the length and not store the padding data
> -    //
> -    if (DestinationIndex++ < *DestinationSize) {
> -      *Destination++ = (UINT8) (Value >>  8);
> -    }
> -    if (DestinationIndex++ < *DestinationSize) {
> -      *Destination++ = (UINT8) Value;
> -    }
> -  }
> -
> -  return RETURN_SUCCESS;
> +  ASSERT (FALSE);
> +  return RETURN_INVALID_PARAMETER;
>  }
>  
>  /**
> 

Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-16  8:38   ` Philippe Mathieu-Daudé
@ 2019-07-16  9:41     ` Philippe Mathieu-Daudé
  2019-07-16 14:14       ` Laszlo Ersek
  0 siblings, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-07-16  9:41 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

Hi Laszlo,

On 7/16/19 10:38 AM, Philippe Mathieu-Daudé wrote:
> On 7/2/19 12:28 PM, Laszlo Ersek wrote:
>> Rewrite Base64Decode() from scratch, due to reasons listed in the second
>> reference below.
>>
>> As first step, redo the interface contract, and replace the current
>> implementation with a stub that asserts FALSE, then fails.
>>
>> Cc: Liming Gao <liming.gao@intel.com>
>> Cc: Marvin Häuser <mhaeuser@outlook.de>
>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Cc: Zhichao Gao <zhichao.gao@intel.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>>  2 files changed, 168 insertions(+), 216 deletions(-)

You forgot to update the copyright in both files.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-02 10:28 ` [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode() Laszlo Ersek
  2019-07-12  2:31   ` [edk2-devel] " Gao, Zhichao
  2019-07-15 18:44   ` mhaeuser
@ 2019-07-16 10:05   ` Philippe Mathieu-Daudé
  2019-07-16 14:17     ` Laszlo Ersek
  2 siblings, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-07-16 10:05 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

Hi Laszlo,

On 7/2/19 12:28 PM, Laszlo Ersek wrote:
> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> reference below.
> 
> Implement Base64Decode() according to the specification added in the
> previous patch. The decoder scans the input buffer once, it has no inner
> loop(s), and it spills each output byte as soon as the output byte is
> complete.

Sorry it took me so long, I was reluctant to review this at first,
because reimplementing a piece of code to fix a bug often introduce new
bugs. However your implementation is very clean to follow (well
described) and certainly safer.

> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
>  1 file changed, 247 insertions(+), 2 deletions(-)
> 
> diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
> index f8397035c32a..6198ccbc9672 100644
> --- a/MdePkg/Library/BaseLib/String.c
> +++ b/MdePkg/Library/BaseLib/String.c
> @@ -1973,8 +1973,253 @@ Base64Decode (
>    IN OUT UINTN       *DestinationSize
>    )
>  {
> -  ASSERT (FALSE);
> -  return RETURN_INVALID_PARAMETER;
> +  BOOLEAN PaddingMode;
> +  UINTN   SixBitGroupsConsumed;
> +  UINT32  Accumulator;
> +  UINTN   OriginalDestinationSize;
> +  UINTN   SourceIndex;
> +
> +  if (DestinationSize == NULL) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check Source array validity.
> +  //
> +  if (Source == NULL) {
> +    if (SourceSize > 0) {
> +      //
> +      // At least one CHAR8 element at NULL Source.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
> +    //
> +    // Non-NULL Source, but it wraps around.

You might want to add a comment in the commit description from your
reply to Marvin regarding keeping MAX_ADDRESS, mostly "The original code
included similar MAX_ADDRESS checks".

No more comments :)
Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>

> +    //
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check Destination array validity.
> +  //
> +  if (Destination == NULL) {
> +    if (*DestinationSize > 0) {
> +      //
> +      // At least one UINT8 element at NULL Destination.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  } else if (*DestinationSize > MAX_ADDRESS - (UINTN)Destination) {
> +    //
> +    // Non-NULL Destination, but it wraps around.
> +    //
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Check for overlap.
> +  //
> +  if (Source != NULL && Destination != NULL) {
> +    //
> +    // Both arrays have been provided, and we know from earlier that each array
> +    // is valid in itself.
> +    //
> +    if ((UINTN)Source + SourceSize <= (UINTN)Destination) {
> +      //
> +      // Source array precedes Destination array, OK.
> +      //
> +    } else if ((UINTN)Destination + *DestinationSize <= (UINTN)Source) {
> +      //
> +      // Destination array precedes Source array, OK.
> +      //
> +    } else {
> +      //
> +      // Overlap.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +  }
> +
> +  //
> +  // Decoding loop setup.
> +  //
> +  PaddingMode             = FALSE;
> +  SixBitGroupsConsumed    = 0;
> +  Accumulator             = 0;
> +  OriginalDestinationSize = *DestinationSize;
> +  *DestinationSize        = 0;
> +
> +  //
> +  // Decoding loop.
> +  //
> +  for (SourceIndex = 0; SourceIndex < SourceSize; SourceIndex++) {
> +    CHAR8  SourceChar;
> +    UINT32 Base64Value;
> +    UINT8  DestinationOctet;
> +
> +    SourceChar = Source[SourceIndex];
> +
> +    //
> +    // Whitespace is ignored at all positions (regardless of padding mode).
> +    //
> +    if (SourceChar == '\t' || SourceChar == '\n' || SourceChar == '\v' ||
> +        SourceChar == '\f' || SourceChar == '\r' || SourceChar == ' ') {
> +      continue;
> +    }
> +
> +    //
> +    // If we're in padding mode, accept another padding character, as long as
> +    // that padding character completes the quantum. This completes case (2)
> +    // from RFC4648, Chapter 4. "Base 64 Encoding":
> +    //
> +    // (2) The final quantum of encoding input is exactly 8 bits; here, the
> +    //     final unit of encoded output will be two characters followed by two
> +    //     "=" padding characters.
> +    //
> +    if (PaddingMode) {
> +      if (SourceChar == '=' && SixBitGroupsConsumed == 3) {
> +        SixBitGroupsConsumed = 0;
> +        continue;
> +      }
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +
> +    //
> +    // When not in padding mode, decode Base64Value based on RFC4648, "Table 1:
> +    // The Base 64 Alphabet".
> +    //
> +    if ('A' <= SourceChar && SourceChar <= 'Z') {
> +      Base64Value = SourceChar - 'A';
> +    } else if ('a' <= SourceChar && SourceChar <= 'z') {
> +      Base64Value = 26 + (SourceChar - 'a');
> +    } else if ('0' <= SourceChar && SourceChar <= '9') {
> +      Base64Value = 52 + (SourceChar - '0');
> +    } else if (SourceChar == '+') {
> +      Base64Value = 62;
> +    } else if (SourceChar == '/') {
> +      Base64Value = 63;
> +    } else if (SourceChar == '=') {
> +      //
> +      // Enter padding mode.
> +      //
> +      PaddingMode = TRUE;
> +
> +      if (SixBitGroupsConsumed == 2) {
> +        //
> +        // If we have consumed two 6-bit groups from the current quantum before
> +        // encountering the first padding character, then this is case (2) from
> +        // RFC4648, Chapter 4. "Base 64 Encoding". Bump SixBitGroupsConsumed,
> +        // and we'll enforce another padding character.
> +        //
> +        SixBitGroupsConsumed = 3;
> +      } else if (SixBitGroupsConsumed == 3) {
> +        //
> +        // If we have consumed three 6-bit groups from the current quantum
> +        // before encountering the first padding character, then this is case
> +        // (3) from RFC4648, Chapter 4. "Base 64 Encoding". The quantum is now
> +        // complete.
> +        //
> +        SixBitGroupsConsumed = 0;
> +      } else {
> +        //
> +        // Padding characters are not allowed at the first two positions of a
> +        // quantum.
> +        //
> +        return RETURN_INVALID_PARAMETER;
> +      }
> +
> +      //
> +      // Wherever in a quantum we enter padding mode, we enforce the padding
> +      // bits pending in the accumulator -- from the last 6-bit group just
> +      // preceding the padding character -- to be zero. Refer to RFC4648,
> +      // Chapter 3.5. "Canonical Encoding".
> +      //
> +      if (Accumulator != 0) {
> +        return RETURN_INVALID_PARAMETER;
> +      }
> +
> +      //
> +      // Advance to the next source character.
> +      //
> +      continue;
> +    } else {
> +      //
> +      // Other characters outside of the encoding alphabet are rejected.
> +      //
> +      return RETURN_INVALID_PARAMETER;
> +    }
> +
> +    //
> +    // Feed the bits of the current 6-bit group of the quantum to the
> +    // accumulator.
> +    //
> +    Accumulator = (Accumulator << 6) | Base64Value;
> +    SixBitGroupsConsumed++;
> +    switch (SixBitGroupsConsumed) {
> +    case 1:
> +      //
> +      // No octet to spill after consuming the first 6-bit group of the
> +      // quantum; advance to the next source character.
> +      //
> +      continue;
> +    case 2:
> +      //
> +      // 12 bits accumulated (6 pending + 6 new); prepare for spilling an
> +      // octet. 4 bits remain pending.
> +      //
> +      DestinationOctet = (UINT8)(Accumulator >> 4);
> +      Accumulator &= 0xF;
> +      break;
> +    case 3:
> +      //
> +      // 10 bits accumulated (4 pending + 6 new); prepare for spilling an
> +      // octet. 2 bits remain pending.
> +      //
> +      DestinationOctet = (UINT8)(Accumulator >> 2);
> +      Accumulator &= 0x3;
> +      break;
> +    default:
> +      ASSERT (SixBitGroupsConsumed == 4);
> +      //
> +      // 8 bits accumulated (2 pending + 6 new); prepare for spilling an octet.
> +      // The quantum is complete, 0 bits remain pending.
> +      //
> +      DestinationOctet = (UINT8)Accumulator;
> +      Accumulator = 0;
> +      SixBitGroupsConsumed = 0;
> +      break;
> +    }
> +
> +    //
> +    // Store the decoded octet if there's room left. Increment
> +    // (*DestinationSize) unconditionally.
> +    //
> +    if (*DestinationSize < OriginalDestinationSize) {
> +      ASSERT (Destination != NULL);
> +      Destination[*DestinationSize] = DestinationOctet;
> +    }
> +    (*DestinationSize)++;
> +
> +    //
> +    // Advance to the next source character.
> +    //
> +  }
> +
> +  //
> +  // If Source terminates mid-quantum, then Source is invalid.
> +  //
> +  if (SixBitGroupsConsumed != 0) {
> +    return RETURN_INVALID_PARAMETER;
> +  }
> +
> +  //
> +  // Done.
> +  //
> +  if (*DestinationSize <= OriginalDestinationSize) {
> +    return RETURN_SUCCESS;
> +  }
> +  return RETURN_BUFFER_TOO_SMALL;
>  }
>  
>  /**
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-16  1:18           ` Gao, Zhichao
@ 2019-07-16 10:48             ` Laszlo Ersek
  0 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 10:48 UTC (permalink / raw)
  To: Gao, Zhichao, Gao, Liming
  Cc: devel@edk2.groups.io, Kinney, Michael D, Marvin Häuser,
	Philippe Mathieu-Daudé

On 07/16/19 03:18, Gao, Zhichao wrote:
> 
> 
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Tuesday, July 16, 2019 5:56 AM
>> To: Gao, Liming <liming.gao@intel.com>; Gao, Zhichao
>> <zhichao.gao@intel.com>
>> Cc: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>;
>> Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé
>> <philmd@redhat.com>
>> Subject: Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite
>> Base64Decode()
>>
>> On 07/15/19 17:22, Gao, Liming wrote:
>>> Laszlo:
>>>
>>>> -----Original Message-----
>>>> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of
>>>> Laszlo Ersek
>>>> Sent: Saturday, July 13, 2019 3:31 AM
>>>> To: Gao, Zhichao <zhichao.gao@intel.com>; devel@edk2.groups.io; Gao,
>>>> Liming <liming.gao@intel.com>; Kinney, Michael D
>>>> <michael.d.kinney@intel.com>
>>>> Cc: Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé
>>>> <philmd@redhat.com>
>>>> Subject: Re: [edk2-devel] [PATCH 2/3] MdePkg/BaseLib: rewrite
>>>> Base64Decode()
>>>>
>>>> On 07/12/19 04:31, Gao, Zhichao wrote:
>>>>> Sorry for late respond.
>>>>> The whole code is OK for me. And I write a tiny test for it without
>>>>> the memory address check. See
>>>>
>> https://github.com/ZhichaoGao/edk2/commit/615356ba32d3147957d215deb
>> d8
>>>> 44e7709f06849 . It is tested in Emulator environment. If it is OK, I think you
>> can take my Tested-by for this patch. If there are some missing, please let
>> me know.
>>>>
>>>> Thanks for writing that test app. It seems to have pretty good coverage.
>>>> I like that it covers the exit points systematically.
>>>>
>>>> Mike, Liming: I intend to pick up Zhichao's T-b, from above. If you
>>>> are aware of another test suite (perhaps used in conjunction with the
>>>> originally contributed Base64Decode() impl), please let me know.
>>>
>>> OK to me.
>>>
>>>>
>>>>> Base64Decode parameter Source is indicate as OPTIONAL. Although it
>>>>> is OK to be NULL, and return DestinationSize to be zero with
>>>> success status to indicate no decode occurred . I don't know if it is
>>>> meaningful to report the result as that. In my opinion, NULL Source is an
>> invalid input. Just my opinion, if the maintainer is OK with that, I am OK too.
>>>
>>> Yes. Source is NULL, SourceSize is zero. Code does nothing. I am fine with it.
>>>
>>> I have no other comments for the code logic. Reviewed-by: Liming Gao
>>> <liming.gao@intel.com>
>>
>> Thank you!
>>
>> A few questions:
>>
>>
>> (1) Liming, does your R-b apply to the first patch in the series as well? (That
>> patch too is for MdePkg.)
>>
>> Here are two links to patch#1 in the series, for your convenience:
>>
>> http://mid.mail-archive.com/20190702102836.27589-2-lersek@redhat.com
>> https://edk2.groups.io/g/devel/message/43162
>>
>>
>> (2) Zhichao, you made a good remark about block-scoped variables, and the
>> CCS. You also gave your Tested-by for the present version. So, we have the
>> following two options:
>>
>> (2a)
>>
>> - I push the present patch as-is, including your Tested-by. (Together with
>> Liming's R-b -- he is OK with the present version.)
>>
>> - Subsequently, I post a separate (incremental) patch for moving the
>> variables from the inner block scope to the outer function block scope.
>> This incremental patch needs another MdePkg maintainer review, but it
>> doesn't need additional testing from you.
>>
>> (2b) Alternatively:
>>
>> - I post v2 of this series, which incorporates the movement of the variables
>> from the inner block scope to the outermost function block scope.
>>
>> - I keep Liming's R-b. (The variable definition movement is straight-forward
>> and I don't think it invalidates Liming's R-b).
>>
>> - I drop your Tested-by, because the variable movement technically changes
>> code that your testing exercised, and a Tested-by tag should only be applied
>> to the *exact* code that was exercised by the testing.
>>
>> - Therefore, for preserving your testing work in the git history, you would
>> have to redo the testing, please.
>>
>>
>> Zhichao, do you prefer (2a) or (2b)?
>>
>> Personally, I prefer (2a), because (2a) is safe -- importantly, the v1 code is
>> *valid* C --, and it is less work for the community (for you, Liming and myself,
>> together).
> 
> (2a) makes less work and makes a good history, (2b) makes a clean commit message. I have no idea on which is better.
> For me, I prefer (2a) too because I don't think clean commit message is such important.

Wait, I think we have a misunderstanding here. *Both* (2a) and (2b) are
"fully clean" with regard to commit messages.

Under (2a), we will have three patches for MdePkg, in the end:

- specification update & present code removal (patch v1 1/3)
- reimplementation (patch v1 2/3), carrying Liming's R-b and your T-b
- separate update to clean up the location of the variable definitions.
  This needs Liming's R-b, but not your T-b.

Note that the "separate update" in the end does *not* invalidate your
T-b on the reimplementation patch. You *did* test that version, and your
T-b is attached to *that* version. Your T-b would not be attached to the
final (separate) small patch at all -- and that is alright. So it's all
faithful to reality.

Clean commit messages are extremely important to me. The point is that
(2a) and (2b) *both* satisfy that. In other words, it's not a
distinguishing factor between (2a) and (2b).

> It's up to you to decide. If you choose (2b) I am fine to do a test again and provide a T-B.

OK, I think I will go with (2a) then. But, I still need Liming's R-b (or
Mike's) on the first patch in the present series.

And Jordan's on the last one (but maybe I'll just defer the last patch,
for OvmfPkg, to a separate TianoCore BZ).

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-02 10:28 ` [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl Laszlo Ersek
  2019-07-16  8:38   ` Philippe Mathieu-Daudé
@ 2019-07-16 10:49   ` Laszlo Ersek
  2019-07-16 14:56     ` Liming Gao
  1 sibling, 1 reply; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 10:49 UTC (permalink / raw)
  To: Liming Gao, Michael D Kinney
  Cc: edk2-devel-groups-io, Marvin Häuser,
	Philippe Mathieu-Daudé, Zhichao Gao

sending a separate ping here as well, for clarity -- Liming, Mike, can
one of you please R-b or A-b this specific patch too?

Thanks!
Laszlo

On 07/02/19 12:28, Laszlo Ersek wrote:
> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> reference below.
> 
> As first step, redo the interface contract, and replace the current
> implementation with a stub that asserts FALSE, then fails.
> 
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>  2 files changed, 168 insertions(+), 216 deletions(-)
> 
> diff --git a/MdePkg/Include/Library/BaseLib.h b/MdePkg/Include/Library/BaseLib.h
> index ebd7dd274cf4..5ef03e24edb1 100644
> --- a/MdePkg/Include/Library/BaseLib.h
> +++ b/MdePkg/Include/Library/BaseLib.h
> @@ -2785,31 +2785,94 @@ Base64Encode (
>    );
>  
>  /**
> -  Convert Base64 ascii string to binary data based on RFC4648.
> +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> +  RFC4648.
>  
> -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
>  
> -  @param Source          Input ASCII characters
> -  @param SourceLength    Number of ASCII characters
> -  @param Destination     Pointer to output buffer
> -  @param DestinationSize Caller is responsible for passing in buffer of at least DestinationSize.
> -                         Set 0 to get the size needed. Set to bytes stored on return.
> +  Whitespace is ignored at all positions:
> +  - 0x09 ('\t') horizontal tab
> +  - 0x0A ('\n') new line
> +  - 0x0B ('\v') vertical tab
> +  - 0x0C ('\f') form feed
> +  - 0x0D ('\r') carriage return
> +  - 0x20 (' ')  space
>  
> -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> +  and enforced at the end of the Base64 ASCII encoded data, and only there.
>  
> - **/
> +  Other characters outside of the encoding alphabet cause the function to
> +  reject the Base64 ASCII encoded data.
> +
> +  @param[in] Source               Array of CHAR8 elements containing the Base64
> +                                  ASCII encoding. May be NULL if SourceSize is
> +                                  zero.
> +
> +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> +
> +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> +                                  8-bit binary representation. Allocated by the
> +                                  caller. May be NULL if DestinationSize is
> +                                  zero on input. If NULL, decoding is
> +                                  performed, but the 8-bit binary
> +                                  representation is not stored. If non-NULL and
> +                                  the function returns an error, the contents
> +                                  of Destination are indeterminate.
> +
> +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> +                                  the caller allocated for Destination. On
> +                                  output, if the function returns
> +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> +                                  the number of UINT8 elements that are
> +                                  required for decoding the Base64 ASCII
> +                                  representation. If the function returns a
> +                                  value different from both RETURN_SUCCESS and
> +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> +                                  is indeterminate on output.
> +
> +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> +                                    been decoded to on-output DestinationSize
> +                                    UINT8 elements at Destination. Note that
> +                                    RETURN_SUCCESS covers the case when
> +                                    DestinationSize is zero on input, and
> +                                    Source decodes to zero bytes (due to
> +                                    containing at most ignored whitespace).
> +
> +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> +                                    large enough for decoding SourceSize CHAR8
> +                                    elements at Source. The required number of
> +                                    UINT8 elements has been stored to
> +                                    DestinationSize.
> +
> +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> +                                    not zero on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> +                                    SourceSize) would wrap around MAX_ADDRESS.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> +                                    DestinationSize) would wrap around
> +                                    MAX_ADDRESS, as specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> +                                    and CHAR8[SourceSize] at Source overlaps
> +                                    UINT8[DestinationSize] at Destination, as
> +                                    specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> +                                    Source.
> +**/
>  RETURN_STATUS
>  EFIAPI
>  Base64Decode (
> -  IN  CONST CHAR8  *Source,
> -  IN        UINTN   SourceLength,
> -  OUT       UINT8  *Destination  OPTIONAL,
> -  IN OUT    UINTN  *DestinationSize
> +  IN     CONST CHAR8 *Source          OPTIONAL,
> +  IN     UINTN       SourceSize,
> +  OUT    UINT8       *Destination     OPTIONAL,
> +  IN OUT UINTN       *DestinationSize
>    );
>  
>  /**
> diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
> index 32e189791cb8..f8397035c32a 100644
> --- a/MdePkg/Library/BaseLib/String.c
> +++ b/MdePkg/Library/BaseLib/String.c
> @@ -1757,45 +1757,10 @@ AsciiStrToUnicodeStr (
>  
>  #endif
>  
> -//
> -// The basis for Base64 encoding is RFC 4686 https://tools.ietf.org/html/rfc4648
> -//
> -// RFC 4686 has a number of MAY and SHOULD cases.  This implementation chooses
> -// the more restrictive versions for security concerns (see RFC 4686 section 3.3).
> -//
> -// A invalid character, if encountered during the decode operation, causes the data
> -// to be rejected. In addition, the '=' padding character is only allowed at the end
> -// of the Base64 encoded string.
> -//
> -#define BAD_V  99
> -
>  STATIC CHAR8 EncodingTable[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>                                  "abcdefghijklmnopqrstuvwxyz"
>                                  "0123456789+/";
>  
> -STATIC UINT8 DecodingTable[] = {
> -  //
> -  // Valid characters ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
> -  // Also, set '=' as a zero for decoding
> -  // 0  ,            1,           2,           3,            4,           5,            6,           7,           8,            9,           a,            b,            c,           d,            e,            f
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //   0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  10
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,     62,  BAD_V,  BAD_V,  BAD_V,     63,   //  20
> -     52,     53,     54,     55,     56,     57,     58,     59,     60,     61,  BAD_V,  BAD_V,  BAD_V,      0,  BAD_V,  BAD_V,   //  30
> -  BAD_V,      0,      1,      2,      3,      4,      5,      6,      7,      8,      9,     10,     11,     12,     13,     14,   //  40
> -     15,     16,     17,     18,     19,     20,     21,     22,     23,     24,     25,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  50
> -  BAD_V,     26,     27,     28,     29,     30,     31,     32,     33,     34,     35,     36,     37,     38,     39,     40,   //  60
> -     41,     42,     43,     44,     45,     46,     47,     48,     49,     50,     51,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  70
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  80
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  90
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  a0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  b0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  c0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  d0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,   //  d0
> -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V    //  f0
> -};
> -
>  /**
>    Convert binary data to a Base64 encoded ascii string based on RFC4648.
>  
> @@ -1918,174 +1883,98 @@ Base64Encode (
>  }
>  
>  /**
> -  Convert Base64 ascii string to binary data based on RFC4648.
> -
> -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> -
> -  @param Source            Input ASCII characters
> -  @param SourceLength      Number of ASCII characters
> -  @param Destination       Pointer to output buffer
> -  @param DestinationSize   Caller is responsible for passing in buffer of at least DestinationSize.
> -                           Set 0 to get the size needed. Set to bytes stored on return.
> -
> -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> - **/
> +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> +  RFC4648.
> +
> +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
> +
> +  Whitespace is ignored at all positions:
> +  - 0x09 ('\t') horizontal tab
> +  - 0x0A ('\n') new line
> +  - 0x0B ('\v') vertical tab
> +  - 0x0C ('\f') form feed
> +  - 0x0D ('\r') carriage return
> +  - 0x20 (' ')  space
> +
> +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> +  and enforced at the end of the Base64 ASCII encoded data, and only there.
> +
> +  Other characters outside of the encoding alphabet cause the function to
> +  reject the Base64 ASCII encoded data.
> +
> +  @param[in] Source               Array of CHAR8 elements containing the Base64
> +                                  ASCII encoding. May be NULL if SourceSize is
> +                                  zero.
> +
> +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> +
> +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> +                                  8-bit binary representation. Allocated by the
> +                                  caller. May be NULL if DestinationSize is
> +                                  zero on input. If NULL, decoding is
> +                                  performed, but the 8-bit binary
> +                                  representation is not stored. If non-NULL and
> +                                  the function returns an error, the contents
> +                                  of Destination are indeterminate.
> +
> +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> +                                  the caller allocated for Destination. On
> +                                  output, if the function returns
> +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> +                                  the number of UINT8 elements that are
> +                                  required for decoding the Base64 ASCII
> +                                  representation. If the function returns a
> +                                  value different from both RETURN_SUCCESS and
> +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> +                                  is indeterminate on output.
> +
> +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> +                                    been decoded to on-output DestinationSize
> +                                    UINT8 elements at Destination. Note that
> +                                    RETURN_SUCCESS covers the case when
> +                                    DestinationSize is zero on input, and
> +                                    Source decodes to zero bytes (due to
> +                                    containing at most ignored whitespace).
> +
> +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> +                                    large enough for decoding SourceSize CHAR8
> +                                    elements at Source. The required number of
> +                                    UINT8 elements has been stored to
> +                                    DestinationSize.
> +
> +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> +                                    not zero on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> +                                    SourceSize) would wrap around MAX_ADDRESS.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> +                                    DestinationSize) would wrap around
> +                                    MAX_ADDRESS, as specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> +                                    and CHAR8[SourceSize] at Source overlaps
> +                                    UINT8[DestinationSize] at Destination, as
> +                                    specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> +                                    Source.
> +**/
>  RETURN_STATUS
>  EFIAPI
>  Base64Decode (
> -  IN  CONST CHAR8  *Source,
> -  IN        UINTN   SourceLength,
> -  OUT       UINT8  *Destination   OPTIONAL,
> -  IN OUT    UINTN  *DestinationSize
> +  IN     CONST CHAR8 *Source          OPTIONAL,
> +  IN     UINTN       SourceSize,
> +  OUT    UINT8       *Destination     OPTIONAL,
> +  IN OUT UINTN       *DestinationSize
>    )
>  {
> -
> -  UINT32   Value;
> -  CHAR8    Chr;
> -  INTN     BufferSize;
> -  UINTN    SourceIndex;
> -  UINTN    DestinationIndex;
> -  UINTN    Index;
> -  UINTN    ActualSourceLength;
> -
> -  //
> -  // Check pointers are not NULL
> -  //
> -  if ((Source == NULL) || (DestinationSize == NULL)) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  //
> -  // Check if SourceLength or  DestinationSize is valid
> -  //
> -  if ((SourceLength >= (MAX_ADDRESS - (UINTN)Source)) || (*DestinationSize >= (MAX_ADDRESS - (UINTN)Destination))){
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  ActualSourceLength = 0;
> -  BufferSize = 0;
> -
> -  //
> -  // Determine the actual number of valid characters in the string.
> -  // All invalid characters except selected white space characters,
> -  // will cause the Base64 string to be rejected. White space to allow
> -  // properly formatted XML will be ignored.
> -  //
> -  // See section 3.3 of RFC 4648.
> -  //
> -  for (SourceIndex = 0; SourceIndex < SourceLength; SourceIndex++) {
> -
> -    //
> -    // '=' is part of the quantum
> -    //
> -    if (Source[SourceIndex] == '=') {
> -      ActualSourceLength++;
> -      BufferSize--;
> -
> -      //
> -      // Only two '=' characters can be valid.
> -      //
> -      if (BufferSize < -2) {
> -        return RETURN_INVALID_PARAMETER;
> -      }
> -    }
> -    else {
> -      Chr = Source[SourceIndex];
> -      if (BAD_V != DecodingTable[(UINT8) Chr]) {
> -
> -        //
> -        // The '=' characters are only valid at the end, so any
> -        // valid character after an '=', will be flagged as an error.
> -        //
> -        if (BufferSize < 0) {
> -          return RETURN_INVALID_PARAMETER;
> -        }
> -          ActualSourceLength++;
> -      }
> -        else {
> -
> -        //
> -        // The reset of the decoder will ignore all invalid characters allowed here.
> -        // Ignoring selected white space is useful.  In this case, the decoder will
> -        // ignore ' ', '\t', '\n', and '\r'.
> -        //
> -        if ((Chr != ' ') &&(Chr != '\t') &&(Chr != '\n') &&(Chr != '\r')) {
> -          return RETURN_INVALID_PARAMETER;
> -        }
> -      }
> -    }
> -  }
> -
> -  //
> -  // The Base64 character string must be a multiple of 4 character quantums.
> -  //
> -  if (ActualSourceLength % 4 != 0) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  BufferSize += ActualSourceLength / 4 * 3;
> -    if (BufferSize < 0) {
> -      return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  //
> -  // BufferSize is >= 0
> -  //
> -  if ((Destination == NULL) || (*DestinationSize < (UINTN) BufferSize)) {
> -    *DestinationSize = BufferSize;
> -    return RETURN_BUFFER_TOO_SMALL;
> -  }
> -
> -  //
> -  // If no decodable characters, return a size of zero. RFC 4686 test vector 1.
> -  //
> -  if (ActualSourceLength == 0) {
> -    *DestinationSize = 0;
> -    return RETURN_SUCCESS;
> -  }
> -
> -  //
> -  // Input data is verified to be a multiple of 4 valid charcters.  Process four
> -  // characters at a time. Uncounted (ie. invalid)  characters will be ignored.
> -  //
> -  for (SourceIndex = 0, DestinationIndex = 0; (SourceIndex < SourceLength) && (DestinationIndex < *DestinationSize); ) {
> -    Value = 0;
> -
> -    //
> -    // Get 24 bits of data from 4 input characters, each character representing 6 bits
> -    //
> -    for (Index = 0; Index < 4; Index++) {
> -      do {
> -      Chr = DecodingTable[(UINT8) Source[SourceIndex++]];
> -      } while (Chr == BAD_V);
> -      Value <<= 6;
> -      Value |= (UINT32)Chr;
> -    }
> -
> -    //
> -    // Store 3 bytes of binary data (24 bits)
> -    //
> -    *Destination++ = (UINT8) (Value >> 16);
> -    DestinationIndex++;
> -
> -    //
> -    // Due to the '=' special cases for the two bytes at the end,
> -    // we have to check the length and not store the padding data
> -    //
> -    if (DestinationIndex++ < *DestinationSize) {
> -      *Destination++ = (UINT8) (Value >>  8);
> -    }
> -    if (DestinationIndex++ < *DestinationSize) {
> -      *Destination++ = (UINT8) Value;
> -    }
> -  }
> -
> -  return RETURN_SUCCESS;
> +  ASSERT (FALSE);
> +  return RETURN_INVALID_PARAMETER;
>  }
>  
>  /**
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-16  9:41     ` Philippe Mathieu-Daudé
@ 2019-07-16 14:14       ` Laszlo Ersek
  2019-07-16 14:59         ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 14:14 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

On 07/16/19 11:41, Philippe Mathieu-Daudé wrote:
> Hi Laszlo,
> 
> On 7/16/19 10:38 AM, Philippe Mathieu-Daudé wrote:
>> On 7/2/19 12:28 PM, Laszlo Ersek wrote:
>>> Rewrite Base64Decode() from scratch, due to reasons listed in the second
>>> reference below.
>>>
>>> As first step, redo the interface contract, and replace the current
>>> implementation with a stub that asserts FALSE, then fails.
>>>
>>> Cc: Liming Gao <liming.gao@intel.com>
>>> Cc: Marvin Häuser <mhaeuser@outlook.de>
>>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>>> Cc: Zhichao Gao <zhichao.gao@intel.com>
>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>>> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>> ---
>>>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>>>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>>>  2 files changed, 168 insertions(+), 216 deletions(-)
> 
> You forgot to update the copyright in both files.

I didn't: I never intended to.

Updating or extending *existing* copyright notices is not a general edk2
requirement. Some companies insist that their associates do that, when
they contribute patches. Red Hat doesn't (we don't extend copyright
notices like that in QEMU either).

Thanks,
Laszlo


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode()
  2019-07-16 10:05   ` Philippe Mathieu-Daudé
@ 2019-07-16 14:17     ` Laszlo Ersek
  0 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 14:17 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

On 07/16/19 12:05, Philippe Mathieu-Daudé wrote:
> Hi Laszlo,
> 
> On 7/2/19 12:28 PM, Laszlo Ersek wrote:
>> Rewrite Base64Decode() from scratch, due to reasons listed in the second
>> reference below.
>>
>> Implement Base64Decode() according to the specification added in the
>> previous patch. The decoder scans the input buffer once, it has no inner
>> loop(s), and it spills each output byte as soon as the output byte is
>> complete.
> 
> Sorry it took me so long, I was reluctant to review this at first,
> because reimplementing a piece of code to fix a bug often introduce new
> bugs. However your implementation is very clean to follow (well
> described) and certainly safer.
> 
>> Cc: Liming Gao <liming.gao@intel.com>
>> Cc: Marvin Häuser <mhaeuser@outlook.de>
>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Cc: Zhichao Gao <zhichao.gao@intel.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>  MdePkg/Library/BaseLib/String.c | 249 +++++++++++++++++++-
>>  1 file changed, 247 insertions(+), 2 deletions(-)
>>
>> diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
>> index f8397035c32a..6198ccbc9672 100644
>> --- a/MdePkg/Library/BaseLib/String.c
>> +++ b/MdePkg/Library/BaseLib/String.c
>> @@ -1973,8 +1973,253 @@ Base64Decode (
>>    IN OUT UINTN       *DestinationSize
>>    )
>>  {
>> -  ASSERT (FALSE);
>> -  return RETURN_INVALID_PARAMETER;
>> +  BOOLEAN PaddingMode;
>> +  UINTN   SixBitGroupsConsumed;
>> +  UINT32  Accumulator;
>> +  UINTN   OriginalDestinationSize;
>> +  UINTN   SourceIndex;
>> +
>> +  if (DestinationSize == NULL) {
>> +    return RETURN_INVALID_PARAMETER;
>> +  }
>> +
>> +  //
>> +  // Check Source array validity.
>> +  //
>> +  if (Source == NULL) {
>> +    if (SourceSize > 0) {
>> +      //
>> +      // At least one CHAR8 element at NULL Source.
>> +      //
>> +      return RETURN_INVALID_PARAMETER;
>> +    }
>> +  } else if (SourceSize > MAX_ADDRESS - (UINTN)Source) {
>> +    //
>> +    // Non-NULL Source, but it wraps around.
> 
> You might want to add a comment in the commit description from your
> reply to Marvin regarding keeping MAX_ADDRESS, mostly "The original code
> included similar MAX_ADDRESS checks".

Good point -- I'll say that the intent is to only strengthen the sanity
checks, and hence e.g. the MAX_ADDRESS checks are preserved.

> 
> No more comments :)
> Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>

Thank you!
Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-16 10:49   ` Laszlo Ersek
@ 2019-07-16 14:56     ` Liming Gao
  2019-07-16 17:15       ` Laszlo Ersek
  0 siblings, 1 reply; 26+ messages in thread
From: Liming Gao @ 2019-07-16 14:56 UTC (permalink / raw)
  To: devel@edk2.groups.io, lersek@redhat.com, Kinney, Michael D
  Cc: Marvin H?user, Philippe Mathieu-Daudé, Gao, Zhichao

Laszlo:
  Yes. Patch 1 & Patch 2 in MdePkg are both good to me. I have no other comments. Reviewed-by: Liming Gao <liming.gao@intel.com>

Thanks
Liming
> -----Original Message-----
> From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Laszlo Ersek
> Sent: Tuesday, July 16, 2019 6:49 PM
> To: Gao, Liming <liming.gao@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marvin Häuser <mhaeuser@outlook.de>; Philippe Mathieu-Daudé
> <philmd@redhat.com>; Gao, Zhichao <zhichao.gao@intel.com>
> Subject: Re: [edk2-devel] [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
> 
> sending a separate ping here as well, for clarity -- Liming, Mike, can
> one of you please R-b or A-b this specific patch too?
> 
> Thanks!
> Laszlo
> 
> On 07/02/19 12:28, Laszlo Ersek wrote:
> > Rewrite Base64Decode() from scratch, due to reasons listed in the second
> > reference below.
> >
> > As first step, redo the interface contract, and replace the current
> > implementation with a stub that asserts FALSE, then fails.
> >
> > Cc: Liming Gao <liming.gao@intel.com>
> > Cc: Marvin Häuser <mhaeuser@outlook.de>
> > Cc: Michael D Kinney <michael.d.kinney@intel.com>
> > Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Cc: Zhichao Gao <zhichao.gao@intel.com>
> > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> > Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > ---
> >  MdePkg/Include/Library/BaseLib.h |  99 +++++--
> >  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
> >  2 files changed, 168 insertions(+), 216 deletions(-)
> >
> > diff --git a/MdePkg/Include/Library/BaseLib.h b/MdePkg/Include/Library/BaseLib.h
> > index ebd7dd274cf4..5ef03e24edb1 100644
> > --- a/MdePkg/Include/Library/BaseLib.h
> > +++ b/MdePkg/Include/Library/BaseLib.h
> > @@ -2785,31 +2785,94 @@ Base64Encode (
> >    );
> >
> >  /**
> > -  Convert Base64 ascii string to binary data based on RFC4648.
> > +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> > +  RFC4648.
> >
> > -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> > -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> > +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
> >
> > -  @param Source          Input ASCII characters
> > -  @param SourceLength    Number of ASCII characters
> > -  @param Destination     Pointer to output buffer
> > -  @param DestinationSize Caller is responsible for passing in buffer of at least DestinationSize.
> > -                         Set 0 to get the size needed. Set to bytes stored on return.
> > +  Whitespace is ignored at all positions:
> > +  - 0x09 ('\t') horizontal tab
> > +  - 0x0A ('\n') new line
> > +  - 0x0B ('\v') vertical tab
> > +  - 0x0C ('\f') form feed
> > +  - 0x0D ('\r') carriage return
> > +  - 0x20 (' ')  space
> >
> > -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> > -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> > -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> > -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> > -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> > +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> > +  and enforced at the end of the Base64 ASCII encoded data, and only there.
> >
> > - **/
> > +  Other characters outside of the encoding alphabet cause the function to
> > +  reject the Base64 ASCII encoded data.
> > +
> > +  @param[in] Source               Array of CHAR8 elements containing the Base64
> > +                                  ASCII encoding. May be NULL if SourceSize is
> > +                                  zero.
> > +
> > +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> > +
> > +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> > +                                  8-bit binary representation. Allocated by the
> > +                                  caller. May be NULL if DestinationSize is
> > +                                  zero on input. If NULL, decoding is
> > +                                  performed, but the 8-bit binary
> > +                                  representation is not stored. If non-NULL and
> > +                                  the function returns an error, the contents
> > +                                  of Destination are indeterminate.
> > +
> > +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> > +                                  the caller allocated for Destination. On
> > +                                  output, if the function returns
> > +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> > +                                  the number of UINT8 elements that are
> > +                                  required for decoding the Base64 ASCII
> > +                                  representation. If the function returns a
> > +                                  value different from both RETURN_SUCCESS and
> > +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> > +                                  is indeterminate on output.
> > +
> > +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> > +                                    been decoded to on-output DestinationSize
> > +                                    UINT8 elements at Destination. Note that
> > +                                    RETURN_SUCCESS covers the case when
> > +                                    DestinationSize is zero on input, and
> > +                                    Source decodes to zero bytes (due to
> > +                                    containing at most ignored whitespace).
> > +
> > +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> > +                                    large enough for decoding SourceSize CHAR8
> > +                                    elements at Source. The required number of
> > +                                    UINT8 elements has been stored to
> > +                                    DestinationSize.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> > +                                    not zero on input.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> > +                                    SourceSize) would wrap around MAX_ADDRESS.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> > +                                    DestinationSize) would wrap around
> > +                                    MAX_ADDRESS, as specified on input.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> > +                                    and CHAR8[SourceSize] at Source overlaps
> > +                                    UINT8[DestinationSize] at Destination, as
> > +                                    specified on input.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> > +                                    Source.
> > +**/
> >  RETURN_STATUS
> >  EFIAPI
> >  Base64Decode (
> > -  IN  CONST CHAR8  *Source,
> > -  IN        UINTN   SourceLength,
> > -  OUT       UINT8  *Destination  OPTIONAL,
> > -  IN OUT    UINTN  *DestinationSize
> > +  IN     CONST CHAR8 *Source          OPTIONAL,
> > +  IN     UINTN       SourceSize,
> > +  OUT    UINT8       *Destination     OPTIONAL,
> > +  IN OUT UINTN       *DestinationSize
> >    );
> >
> >  /**
> > diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
> > index 32e189791cb8..f8397035c32a 100644
> > --- a/MdePkg/Library/BaseLib/String.c
> > +++ b/MdePkg/Library/BaseLib/String.c
> > @@ -1757,45 +1757,10 @@ AsciiStrToUnicodeStr (
> >
> >  #endif
> >
> > -//
> > -// The basis for Base64 encoding is RFC 4686 https://tools.ietf.org/html/rfc4648
> > -//
> > -// RFC 4686 has a number of MAY and SHOULD cases.  This implementation chooses
> > -// the more restrictive versions for security concerns (see RFC 4686 section 3.3).
> > -//
> > -// A invalid character, if encountered during the decode operation, causes the data
> > -// to be rejected. In addition, the '=' padding character is only allowed at the end
> > -// of the Base64 encoded string.
> > -//
> > -#define BAD_V  99
> > -
> >  STATIC CHAR8 EncodingTable[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
> >                                  "abcdefghijklmnopqrstuvwxyz"
> >                                  "0123456789+/";
> >
> > -STATIC UINT8 DecodingTable[] = {
> > -  //
> > -  // Valid characters ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
> > -  // Also, set '=' as a zero for decoding
> > -  // 0  ,            1,           2,           3,            4,           5,            6,           7,           8,
> 9,           a,            b,            c,           d,            e,            f
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //   0
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  10
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,     62,  BAD_V,  BAD_V,
> BAD_V,     63,   //  20
> > -     52,     53,     54,     55,     56,     57,     58,     59,     60,     61,  BAD_V,  BAD_V,  BAD_V,      0,
> BAD_V,  BAD_V,   //  30
> > -  BAD_V,      0,      1,      2,      3,      4,      5,      6,      7,      8,      9,     10,     11,     12,     13,
> 14,   //  40
> > -     15,     16,     17,     18,     19,     20,     21,     22,     23,     24,     25,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  50
> > -  BAD_V,     26,     27,     28,     29,     30,     31,     32,     33,     34,     35,     36,     37,     38,     39,
> 40,   //  60
> > -     41,     42,     43,     44,     45,     46,     47,     48,     49,     50,     51,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  70
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  80
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  90
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  a0
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  b0
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  c0
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  d0
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V,   //  d0
> > -  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,  BAD_V,
> BAD_V,  BAD_V    //  f0
> > -};
> > -
> >  /**
> >    Convert binary data to a Base64 encoded ascii string based on RFC4648.
> >
> > @@ -1918,174 +1883,98 @@ Base64Encode (
> >  }
> >
> >  /**
> > -  Convert Base64 ascii string to binary data based on RFC4648.
> > -
> > -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> > -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> > -
> > -  @param Source            Input ASCII characters
> > -  @param SourceLength      Number of ASCII characters
> > -  @param Destination       Pointer to output buffer
> > -  @param DestinationSize   Caller is responsible for passing in buffer of at least DestinationSize.
> > -                           Set 0 to get the size needed. Set to bytes stored on return.
> > -
> > -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> > -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> > -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> > -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> > -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> > - **/
> > +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> > +  RFC4648.
> > +
> > +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
> > +
> > +  Whitespace is ignored at all positions:
> > +  - 0x09 ('\t') horizontal tab
> > +  - 0x0A ('\n') new line
> > +  - 0x0B ('\v') vertical tab
> > +  - 0x0C ('\f') form feed
> > +  - 0x0D ('\r') carriage return
> > +  - 0x20 (' ')  space
> > +
> > +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> > +  and enforced at the end of the Base64 ASCII encoded data, and only there.
> > +
> > +  Other characters outside of the encoding alphabet cause the function to
> > +  reject the Base64 ASCII encoded data.
> > +
> > +  @param[in] Source               Array of CHAR8 elements containing the Base64
> > +                                  ASCII encoding. May be NULL if SourceSize is
> > +                                  zero.
> > +
> > +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> > +
> > +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> > +                                  8-bit binary representation. Allocated by the
> > +                                  caller. May be NULL if DestinationSize is
> > +                                  zero on input. If NULL, decoding is
> > +                                  performed, but the 8-bit binary
> > +                                  representation is not stored. If non-NULL and
> > +                                  the function returns an error, the contents
> > +                                  of Destination are indeterminate.
> > +
> > +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> > +                                  the caller allocated for Destination. On
> > +                                  output, if the function returns
> > +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> > +                                  the number of UINT8 elements that are
> > +                                  required for decoding the Base64 ASCII
> > +                                  representation. If the function returns a
> > +                                  value different from both RETURN_SUCCESS and
> > +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> > +                                  is indeterminate on output.
> > +
> > +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> > +                                    been decoded to on-output DestinationSize
> > +                                    UINT8 elements at Destination. Note that
> > +                                    RETURN_SUCCESS covers the case when
> > +                                    DestinationSize is zero on input, and
> > +                                    Source decodes to zero bytes (due to
> > +                                    containing at most ignored whitespace).
> > +
> > +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> > +                                    large enough for decoding SourceSize CHAR8
> > +                                    elements at Source. The required number of
> > +                                    UINT8 elements has been stored to
> > +                                    DestinationSize.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> > +                                    not zero on input.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> > +                                    SourceSize) would wrap around MAX_ADDRESS.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> > +                                    DestinationSize) would wrap around
> > +                                    MAX_ADDRESS, as specified on input.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> > +                                    and CHAR8[SourceSize] at Source overlaps
> > +                                    UINT8[DestinationSize] at Destination, as
> > +                                    specified on input.
> > +
> > +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> > +                                    Source.
> > +**/
> >  RETURN_STATUS
> >  EFIAPI
> >  Base64Decode (
> > -  IN  CONST CHAR8  *Source,
> > -  IN        UINTN   SourceLength,
> > -  OUT       UINT8  *Destination   OPTIONAL,
> > -  IN OUT    UINTN  *DestinationSize
> > +  IN     CONST CHAR8 *Source          OPTIONAL,
> > +  IN     UINTN       SourceSize,
> > +  OUT    UINT8       *Destination     OPTIONAL,
> > +  IN OUT UINTN       *DestinationSize
> >    )
> >  {
> > -
> > -  UINT32   Value;
> > -  CHAR8    Chr;
> > -  INTN     BufferSize;
> > -  UINTN    SourceIndex;
> > -  UINTN    DestinationIndex;
> > -  UINTN    Index;
> > -  UINTN    ActualSourceLength;
> > -
> > -  //
> > -  // Check pointers are not NULL
> > -  //
> > -  if ((Source == NULL) || (DestinationSize == NULL)) {
> > -    return RETURN_INVALID_PARAMETER;
> > -  }
> > -
> > -  //
> > -  // Check if SourceLength or  DestinationSize is valid
> > -  //
> > -  if ((SourceLength >= (MAX_ADDRESS - (UINTN)Source)) || (*DestinationSize >= (MAX_ADDRESS - (UINTN)Destination))){
> > -    return RETURN_INVALID_PARAMETER;
> > -  }
> > -
> > -  ActualSourceLength = 0;
> > -  BufferSize = 0;
> > -
> > -  //
> > -  // Determine the actual number of valid characters in the string.
> > -  // All invalid characters except selected white space characters,
> > -  // will cause the Base64 string to be rejected. White space to allow
> > -  // properly formatted XML will be ignored.
> > -  //
> > -  // See section 3.3 of RFC 4648.
> > -  //
> > -  for (SourceIndex = 0; SourceIndex < SourceLength; SourceIndex++) {
> > -
> > -    //
> > -    // '=' is part of the quantum
> > -    //
> > -    if (Source[SourceIndex] == '=') {
> > -      ActualSourceLength++;
> > -      BufferSize--;
> > -
> > -      //
> > -      // Only two '=' characters can be valid.
> > -      //
> > -      if (BufferSize < -2) {
> > -        return RETURN_INVALID_PARAMETER;
> > -      }
> > -    }
> > -    else {
> > -      Chr = Source[SourceIndex];
> > -      if (BAD_V != DecodingTable[(UINT8) Chr]) {
> > -
> > -        //
> > -        // The '=' characters are only valid at the end, so any
> > -        // valid character after an '=', will be flagged as an error.
> > -        //
> > -        if (BufferSize < 0) {
> > -          return RETURN_INVALID_PARAMETER;
> > -        }
> > -          ActualSourceLength++;
> > -      }
> > -        else {
> > -
> > -        //
> > -        // The reset of the decoder will ignore all invalid characters allowed here.
> > -        // Ignoring selected white space is useful.  In this case, the decoder will
> > -        // ignore ' ', '\t', '\n', and '\r'.
> > -        //
> > -        if ((Chr != ' ') &&(Chr != '\t') &&(Chr != '\n') &&(Chr != '\r')) {
> > -          return RETURN_INVALID_PARAMETER;
> > -        }
> > -      }
> > -    }
> > -  }
> > -
> > -  //
> > -  // The Base64 character string must be a multiple of 4 character quantums.
> > -  //
> > -  if (ActualSourceLength % 4 != 0) {
> > -    return RETURN_INVALID_PARAMETER;
> > -  }
> > -
> > -  BufferSize += ActualSourceLength / 4 * 3;
> > -    if (BufferSize < 0) {
> > -      return RETURN_INVALID_PARAMETER;
> > -  }
> > -
> > -  //
> > -  // BufferSize is >= 0
> > -  //
> > -  if ((Destination == NULL) || (*DestinationSize < (UINTN) BufferSize)) {
> > -    *DestinationSize = BufferSize;
> > -    return RETURN_BUFFER_TOO_SMALL;
> > -  }
> > -
> > -  //
> > -  // If no decodable characters, return a size of zero. RFC 4686 test vector 1.
> > -  //
> > -  if (ActualSourceLength == 0) {
> > -    *DestinationSize = 0;
> > -    return RETURN_SUCCESS;
> > -  }
> > -
> > -  //
> > -  // Input data is verified to be a multiple of 4 valid charcters.  Process four
> > -  // characters at a time. Uncounted (ie. invalid)  characters will be ignored.
> > -  //
> > -  for (SourceIndex = 0, DestinationIndex = 0; (SourceIndex < SourceLength) && (DestinationIndex < *DestinationSize); ) {
> > -    Value = 0;
> > -
> > -    //
> > -    // Get 24 bits of data from 4 input characters, each character representing 6 bits
> > -    //
> > -    for (Index = 0; Index < 4; Index++) {
> > -      do {
> > -      Chr = DecodingTable[(UINT8) Source[SourceIndex++]];
> > -      } while (Chr == BAD_V);
> > -      Value <<= 6;
> > -      Value |= (UINT32)Chr;
> > -    }
> > -
> > -    //
> > -    // Store 3 bytes of binary data (24 bits)
> > -    //
> > -    *Destination++ = (UINT8) (Value >> 16);
> > -    DestinationIndex++;
> > -
> > -    //
> > -    // Due to the '=' special cases for the two bytes at the end,
> > -    // we have to check the length and not store the padding data
> > -    //
> > -    if (DestinationIndex++ < *DestinationSize) {
> > -      *Destination++ = (UINT8) (Value >>  8);
> > -    }
> > -    if (DestinationIndex++ < *DestinationSize) {
> > -      *Destination++ = (UINT8) Value;
> > -    }
> > -  }
> > -
> > -  return RETURN_SUCCESS;
> > +  ASSERT (FALSE);
> > +  return RETURN_INVALID_PARAMETER;
> >  }
> >
> >  /**
> >
> 
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-16 14:14       ` Laszlo Ersek
@ 2019-07-16 14:59         ` Philippe Mathieu-Daudé
  2019-07-16 18:53           ` [edk2-devel] " Laszlo Ersek
  0 siblings, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-07-16 14:59 UTC (permalink / raw)
  To: Laszlo Ersek, edk2-devel-groups-io
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

On 7/16/19 4:14 PM, Laszlo Ersek wrote:
> On 07/16/19 11:41, Philippe Mathieu-Daudé wrote:
>> Hi Laszlo,
>>
>> On 7/16/19 10:38 AM, Philippe Mathieu-Daudé wrote:
>>> On 7/2/19 12:28 PM, Laszlo Ersek wrote:
>>>> Rewrite Base64Decode() from scratch, due to reasons listed in the second
>>>> reference below.
>>>>
>>>> As first step, redo the interface contract, and replace the current
>>>> implementation with a stub that asserts FALSE, then fails.
>>>>
>>>> Cc: Liming Gao <liming.gao@intel.com>
>>>> Cc: Marvin Häuser <mhaeuser@outlook.de>
>>>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>>>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>> Cc: Zhichao Gao <zhichao.gao@intel.com>
>>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>>>> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
>>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>>> ---
>>>>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>>>>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>>>>  2 files changed, 168 insertions(+), 216 deletions(-)
>>
>> You forgot to update the copyright in both files.
> 
> I didn't: I never intended to.
> 
> Updating or extending *existing* copyright notices is not a general edk2
> requirement. Some companies insist that their associates do that, when
> they contribute patches. Red Hat doesn't (we don't extend copyright
> notices like that in QEMU either).

Oh OK, I did not know :S

Thanks for telling me,

Phil.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-16 14:56     ` Liming Gao
@ 2019-07-16 17:15       ` Laszlo Ersek
  0 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 17:15 UTC (permalink / raw)
  To: Gao, Liming, devel@edk2.groups.io, Kinney, Michael D
  Cc: Marvin H?user, Philippe Mathieu-Daudé, Gao, Zhichao

On 07/16/19 16:56, Gao, Liming wrote:
> Laszlo:
>   Yes. Patch 1 & Patch 2 in MdePkg are both good to me. I have no other comments. Reviewed-by: Liming Gao <liming.gao@intel.com>

Many thanks!
Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
  2019-07-16 14:59         ` Philippe Mathieu-Daudé
@ 2019-07-16 18:53           ` Laszlo Ersek
  0 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 18:53 UTC (permalink / raw)
  To: devel, philmd
  Cc: Liming Gao, Marvin Häuser, Michael D Kinney, Zhichao Gao

On 07/16/19 16:59, Philippe Mathieu-Daudé wrote:
> On 7/16/19 4:14 PM, Laszlo Ersek wrote:
>> On 07/16/19 11:41, Philippe Mathieu-Daudé wrote:
>>> Hi Laszlo,
>>>
>>> On 7/16/19 10:38 AM, Philippe Mathieu-Daudé wrote:
>>>> On 7/2/19 12:28 PM, Laszlo Ersek wrote:
>>>>> Rewrite Base64Decode() from scratch, due to reasons listed in the second
>>>>> reference below.
>>>>>
>>>>> As first step, redo the interface contract, and replace the current
>>>>> implementation with a stub that asserts FALSE, then fails.
>>>>>
>>>>> Cc: Liming Gao <liming.gao@intel.com>
>>>>> Cc: Marvin Häuser <mhaeuser@outlook.de>
>>>>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>>>>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>>> Cc: Zhichao Gao <zhichao.gao@intel.com>
>>>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
>>>>> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
>>>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>>>> ---
>>>>>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>>>>>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>>>>>  2 files changed, 168 insertions(+), 216 deletions(-)
>>>
>>> You forgot to update the copyright in both files.
>>
>> I didn't: I never intended to.
>>
>> Updating or extending *existing* copyright notices is not a general edk2
>> requirement. Some companies insist that their associates do that, when
>> they contribute patches. Red Hat doesn't (we don't extend copyright
>> notices like that in QEMU either).
> 
> Oh OK, I did not know :S

In retrospect, I may have mislead you, unwittingly -- perhaps you recall
me pointing out the same issue to other contributors.

But, in those cases, I must have raised the topic only for one of two
reasons:

- I had known from experience that the contributor's employer would
expect an update to the (C) notice

- there was an update already in the patch, but it looked incorrect
(extending a (C) notice to a year different from the current year,
inadvertently deleting a different (C) notice, and so on).

Thanks!
Laszlo

> Thanks for telling me,
> 
> Phil.
> 
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [edk2-devel] [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site
  2019-07-02 10:28 [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
                   ` (3 preceding siblings ...)
  2019-07-10  9:20 ` [edk2-devel] [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
@ 2019-07-16 22:02 ` Laszlo Ersek
  4 siblings, 0 replies; 26+ messages in thread
From: Laszlo Ersek @ 2019-07-16 22:02 UTC (permalink / raw)
  To: edk2-devel-groups-io
  Cc: Ard Biesheuvel, Jordan Justen, Liming Gao, Marvin Häuser,
	Michael D Kinney, Philippe Mathieu-Daudé, Zhichao Gao

On 07/02/19 12:28, Laszlo Ersek wrote:
> Repo:   https://github.com/lersek/edk2.git
> Branch: base64_decode_bz1891
>
> Base64Decode() has a number of issues; see
>
> - <https://bugzilla.tianocore.org/show_bug.cgi?id=1891>
>
> - and the mailing list discussion linked from
>   <https://bugzilla.tianocore.org/show_bug.cgi?id=1891#c6>.
>
> In my opinion, rewriting Base64Decode() from scratch, using a different
> (state machine-based) approach is safer / more robust than attempting to
> identify and patch up individual problems in the current implementation.
> The emphasis of the proposed implementation is to reject invalid input;
> decoding valid input is kind of secondary. (This is the safe approach
> for all parsers that process untrusted input, in my opinion.)
>
> My understanding is that unit tests for Base64Decode() already exist in
> some repository. While I tested the new implementation through OvmfPkg's
> EnrollDefaultKeys application -- which makes the sole calls to
> Base64Decode() in the open source edk2 tree -- I didn't run a unit test
> suite. Help with that (pointers to the test suite, or actual unit
> testing) would be highly appreciated.
>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Marvin Häuser <mhaeuser@outlook.de>
> Cc: Michael D Kinney <michael.d.kinney@intel.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Zhichao Gao <zhichao.gao@intel.com>
>
> Thanks
> Laszlo
>
> Laszlo Ersek (3):
>   MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
>   MdePkg/BaseLib: rewrite Base64Decode()
>   OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling
>
>  MdePkg/Include/Library/BaseLib.h              |  99 ++++-
>  MdePkg/Library/BaseLib/String.c               | 448 +++++++++++++-------
>  OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c |  10 +-
>  3 files changed, 374 insertions(+), 183 deletions(-)
>

Thank you all for the feedback!

* Added a paragraph to the commit message of patch#2, as requested by
  Phil, based on the discussion with Marvin:

    +    The intent is to only strengthen the checks (sanity and input) relative to
    +    the previous implementation, hence the MAX_ADDRESS checks are reinstated.
...
    +    [lersek@redhat.com: add last para to commit msg per talks w/ Marvin & Phil]

* Pushed the first two patches in the series as commit range
  84a459472075..35e242b698cd; closing
  <https://bugzilla.tianocore.org/show_bug.cgi?id=1891>.

* Filed <https://bugzilla.tianocore.org/show_bug.cgi?id=1980> to track
  the CCS non-conformance issue, under approach "2a" (i.e.,
  incrementally), as agreed upon with Zhichao. Assigned the BZ to
  myself; will post a patch soon.

* Split off the last patch in the series to a separate TianoCore BZ,
  namely <https://bugzilla.tianocore.org/show_bug.cgi?id=1981>. Assigned
  that BZ to myself too.

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2019-07-16 22:02 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-07-02 10:28 [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
2019-07-02 10:28 ` [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl Laszlo Ersek
2019-07-16  8:38   ` Philippe Mathieu-Daudé
2019-07-16  9:41     ` Philippe Mathieu-Daudé
2019-07-16 14:14       ` Laszlo Ersek
2019-07-16 14:59         ` Philippe Mathieu-Daudé
2019-07-16 18:53           ` [edk2-devel] " Laszlo Ersek
2019-07-16 10:49   ` Laszlo Ersek
2019-07-16 14:56     ` Liming Gao
2019-07-16 17:15       ` Laszlo Ersek
2019-07-02 10:28 ` [PATCH 2/3] MdePkg/BaseLib: rewrite Base64Decode() Laszlo Ersek
2019-07-12  2:31   ` [edk2-devel] " Gao, Zhichao
2019-07-12 19:31     ` Laszlo Ersek
2019-07-15 15:22       ` Liming Gao
2019-07-15 21:56         ` Laszlo Ersek
2019-07-16  1:18           ` Gao, Zhichao
2019-07-16 10:48             ` Laszlo Ersek
2019-07-15 18:44   ` mhaeuser
2019-07-16  0:45     ` Laszlo Ersek
2019-07-16 10:05   ` Philippe Mathieu-Daudé
2019-07-16 14:17     ` Laszlo Ersek
2019-07-02 10:28 ` [PATCH 3/3] OvmfPkg/EnrollDefaultKeys: clean up Base64Decode() retval handling Laszlo Ersek
2019-07-15 21:58   ` [edk2-devel] " Laszlo Ersek
2019-07-16  8:36     ` Philippe Mathieu-Daudé
2019-07-10  9:20 ` [edk2-devel] [PATCH 0/3] MdePkg, OvmfPkg: rewrite Base64Decode(), clean up call site Laszlo Ersek
2019-07-16 22:02 ` Laszlo Ersek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox