Subject: Re: [edk2-devel] [PATCH 1/3] MdePkg/BaseLib: re-specify Base64Decode(), and add temporary stub impl
From: "Laszlo Ersek"
To: Liming Gao, Michael D Kinney
Cc: edk2-devel-groups-io, Marvin Häuser, Philippe Mathieu-Daudé, Zhichao Gao
Reply-To: devel@edk2.groups.io, lersek@redhat.com
References: <20190702102836.27589-1-lersek@redhat.com> <20190702102836.27589-2-lersek@redhat.com>
In-Reply-To: <20190702102836.27589-2-lersek@redhat.com>
Message-ID: <7013183a-31b7-d2a4-4656-d226e166e9da@redhat.com>
Date: Tue, 16 Jul 2019 12:49:19 +0200

sending a separate ping here as well, for clarity -- Liming, Mike, can
one of you please R-b or A-b this specific patch too?

Thanks!
Laszlo

On 07/02/19 12:28, Laszlo Ersek wrote:
> Rewrite Base64Decode() from scratch, due to reasons listed in the second
> reference below.
>
> As first step, redo the interface contract, and replace the current
> implementation with a stub that asserts FALSE, then fails.
>
> Cc: Liming Gao
> Cc: Marvin Häuser
> Cc: Michael D Kinney
> Cc: Philippe Mathieu-Daudé
> Cc: Zhichao Gao
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1891
> Ref: http://mid.mail-archive.com/c495bd0b-ea4d-7206-8a4f-a7149760d19a@redhat.com
> Signed-off-by: Laszlo Ersek
> ---
>  MdePkg/Include/Library/BaseLib.h |  99 +++++--
>  MdePkg/Library/BaseLib/String.c  | 285 ++++++--------------
>  2 files changed, 168 insertions(+), 216 deletions(-)
>
> diff --git a/MdePkg/Include/Library/BaseLib.h b/MdePkg/Include/Library/BaseLib.h
> index ebd7dd274cf4..5ef03e24edb1 100644
> --- a/MdePkg/Include/Library/BaseLib.h
> +++ b/MdePkg/Include/Library/BaseLib.h
> @@ -2785,31 +2785,94 @@ Base64Encode (
>    );
>
>  /**
> -  Convert Base64 ascii string to binary data based on RFC4648.
> +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> +  RFC4648.
>
> -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
>
> -  @param Source          Input ASCII characters
> -  @param SourceLength    Number of ASCII characters
> -  @param Destination     Pointer to output buffer
> -  @param DestinationSize Caller is responsible for passing in buffer of at least DestinationSize.
> -                         Set 0 to get the size needed. Set to bytes stored on return.
> +  Whitespace is ignored at all positions:
> +  - 0x09 ('\t') horizontal tab
> +  - 0x0A ('\n') new line
> +  - 0x0B ('\v') vertical tab
> +  - 0x0C ('\f') form feed
> +  - 0x0D ('\r') carriage return
> +  - 0x20 (' ')  space
>
> -  @retval RETURN_SUCCESS           When binary buffer is filled in.
> -  @retval RETURN_INVALID_PARAMETER If Source is NULL or DestinationSize is NULL.
> -  @retval RETURN_INVALID_PARAMETER If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> -  @retval RETURN_INVALID_PARAMETER If there is any invalid character in input stream.
> -  @retval RETURN_BUFFER_TOO_SMALL  If buffer length is smaller than required buffer size.
> +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> +  and enforced at the end of the Base64 ASCII encoded data, and only there.
>
> - **/
> +  Other characters outside of the encoding alphabet cause the function to
> +  reject the Base64 ASCII encoded data.
> +
> +  @param[in] Source               Array of CHAR8 elements containing the Base64
> +                                  ASCII encoding. May be NULL if SourceSize is
> +                                  zero.
> +
> +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> +
> +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> +                                  8-bit binary representation. Allocated by the
> +                                  caller. May be NULL if DestinationSize is
> +                                  zero on input. If NULL, decoding is
> +                                  performed, but the 8-bit binary
> +                                  representation is not stored. If non-NULL and
> +                                  the function returns an error, the contents
> +                                  of Destination are indeterminate.
> +
> +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> +                                  the caller allocated for Destination. On
> +                                  output, if the function returns
> +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> +                                  the number of UINT8 elements that are
> +                                  required for decoding the Base64 ASCII
> +                                  representation. If the function returns a
> +                                  value different from both RETURN_SUCCESS and
> +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> +                                  is indeterminate on output.
> +
> +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> +                                    been decoded to on-output DestinationSize
> +                                    UINT8 elements at Destination. Note that
> +                                    RETURN_SUCCESS covers the case when
> +                                    DestinationSize is zero on input, and
> +                                    Source decodes to zero bytes (due to
> +                                    containing at most ignored whitespace).
> +
> +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> +                                    large enough for decoding SourceSize CHAR8
> +                                    elements at Source. The required number of
> +                                    UINT8 elements has been stored to
> +                                    DestinationSize.
> +
> +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> +                                    not zero on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> +                                    SourceSize) would wrap around MAX_ADDRESS.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> +                                    DestinationSize) would wrap around
> +                                    MAX_ADDRESS, as specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> +                                    and CHAR8[SourceSize] at Source overlaps
> +                                    UINT8[DestinationSize] at Destination, as
> +                                    specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> +                                    Source.
> +**/
>  RETURN_STATUS
>  EFIAPI
>  Base64Decode (
> -  IN  CONST CHAR8  *Source,
> -  IN  UINTN        SourceLength,
> -  OUT UINT8        *Destination OPTIONAL,
> -  IN OUT UINTN     *DestinationSize
> +  IN     CONST CHAR8  *Source          OPTIONAL,
> +  IN     UINTN        SourceSize,
> +  OUT    UINT8        *Destination     OPTIONAL,
> +  IN OUT UINTN        *DestinationSize
>    );
>
>  /**
> diff --git a/MdePkg/Library/BaseLib/String.c b/MdePkg/Library/BaseLib/String.c
> index 32e189791cb8..f8397035c32a 100644
> --- a/MdePkg/Library/BaseLib/String.c
> +++ b/MdePkg/Library/BaseLib/String.c
> @@ -1757,45 +1757,10 @@ AsciiStrToUnicodeStr (
>
>  #endif
>
> -//
> -// The basis for Base64 encoding is RFC 4686 https://tools.ietf.org/html/rfc4648
> -//
> -// RFC 4686 has a number of MAY and SHOULD cases. This implementation chooses
> -// the more restrictive versions for security concerns (see RFC 4686 section 3.3).
> -//
> -// A invalid character, if encountered during the decode operation, causes the data
> -// to be rejected. In addition, the '=' padding character is only allowed at the end
> -// of the Base64 encoded string.
> -//
> -#define BAD_V  99
> -
>  STATIC CHAR8 EncodingTable[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>                                 "abcdefghijklmnopqrstuvwxyz"
>                                 "0123456789+/";
>
> -STATIC UINT8 DecodingTable[] = {
> -  //
> -  // Valid characters ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
> -  // Also, set '=' as a zero for decoding
> -  // 0  ,     1,     2,     3,     4,     5,     6,     7,     8,     9,     a,     b,     c,     d,     e,     f
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //   0
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  10
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,    62, BAD_V, BAD_V, BAD_V,    63,   //  20
> -     52,    53,    54,    55,    56,    57,    58,    59,    60,    61, BAD_V, BAD_V, BAD_V,     0, BAD_V, BAD_V,   //  30
> -  BAD_V,     0,     1,     2,     3,     4,     5,     6,     7,     8,     9,    10,    11,    12,    13,    14,   //  40
> -     15,    16,    17,    18,    19,    20,    21,    22,    23,    24,    25, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  50
> -  BAD_V,    26,    27,    28,    29,    30,    31,    32,    33,    34,    35,    36,    37,    38,    39,    40,   //  60
> -     41,    42,    43,    44,    45,    46,    47,    48,    49,    50,    51, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  70
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  80
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  90
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  a0
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  b0
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  c0
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  d0
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V,   //  d0
> -  BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V, BAD_V    //  f0
> -};
> -
>  /**
>    Convert binary data to a Base64 encoded ascii string based on RFC4648.
>
> @@ -1918,174 +1883,98 @@ Base64Encode (
>  }
>
>  /**
> -  Convert Base64 ascii string to binary data based on RFC4648.
> -
> -  Produce Null-terminated binary data in the output buffer specified by Destination and DestinationSize.
> -  The binary data is produced by converting the Base64 ascii string specified by Source and SourceLength.
> -
> -  @param Source            Input ASCII characters
> -  @param SourceLength      Number of ASCII characters
> -  @param Destination       Pointer to output buffer
> -  @param DestinationSize   Caller is responsible for passing in buffer of at least DestinationSize.
> -                           Set 0 to get the size needed. Set to bytes stored on return.
> -
> -  @retval RETURN_SUCCESS             When binary buffer is filled in.
> -  @retval RETURN_INVALID_PARAMETER   If Source is NULL or DestinationSize is NULL.
> -  @retval RETURN_INVALID_PARAMETER   If SourceLength or DestinationSize is bigger than (MAX_ADDRESS -(UINTN)Destination ).
> -  @retval RETURN_INVALID_PARAMETER   If there is any invalid character in input stream.
> -  @retval RETURN_BUFFER_TOO_SMALL    If buffer length is smaller than required buffer size.
> - **/
> +  Decode Base64 ASCII encoded data to 8-bit binary representation, based on
> +  RFC4648.
> +
> +  Decoding occurs according to "Table 1: The Base 64 Alphabet" in RFC4648.
> +
> +  Whitespace is ignored at all positions:
> +  - 0x09 ('\t') horizontal tab
> +  - 0x0A ('\n') new line
> +  - 0x0B ('\v') vertical tab
> +  - 0x0C ('\f') form feed
> +  - 0x0D ('\r') carriage return
> +  - 0x20 (' ')  space
> +
> +  The minimum amount of required padding (with ASCII 0x3D, '=') is tolerated
> +  and enforced at the end of the Base64 ASCII encoded data, and only there.
> +
> +  Other characters outside of the encoding alphabet cause the function to
> +  reject the Base64 ASCII encoded data.
> +
> +  @param[in] Source               Array of CHAR8 elements containing the Base64
> +                                  ASCII encoding. May be NULL if SourceSize is
> +                                  zero.
> +
> +  @param[in] SourceSize           Number of CHAR8 elements in Source.
> +
> +  @param[out] Destination         Array of UINT8 elements receiving the decoded
> +                                  8-bit binary representation. Allocated by the
> +                                  caller. May be NULL if DestinationSize is
> +                                  zero on input. If NULL, decoding is
> +                                  performed, but the 8-bit binary
> +                                  representation is not stored. If non-NULL and
> +                                  the function returns an error, the contents
> +                                  of Destination are indeterminate.
> +
> +  @param[in,out] DestinationSize  On input, the number of UINT8 elements that
> +                                  the caller allocated for Destination. On
> +                                  output, if the function returns
> +                                  RETURN_SUCCESS or RETURN_BUFFER_TOO_SMALL,
> +                                  the number of UINT8 elements that are
> +                                  required for decoding the Base64 ASCII
> +                                  representation. If the function returns a
> +                                  value different from both RETURN_SUCCESS and
> +                                  RETURN_BUFFER_TOO_SMALL, then DestinationSize
> +                                  is indeterminate on output.
> +
> +  @retval RETURN_SUCCESS            SourceSize CHAR8 elements at Source have
> +                                    been decoded to on-output DestinationSize
> +                                    UINT8 elements at Destination. Note that
> +                                    RETURN_SUCCESS covers the case when
> +                                    DestinationSize is zero on input, and
> +                                    Source decodes to zero bytes (due to
> +                                    containing at most ignored whitespace).
> +
> +  @retval RETURN_BUFFER_TOO_SMALL   The input value of DestinationSize is not
> +                                    large enough for decoding SourceSize CHAR8
> +                                    elements at Source. The required number of
> +                                    UINT8 elements has been stored to
> +                                    DestinationSize.
> +
> +  @retval RETURN_INVALID_PARAMETER  DestinationSize is NULL.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is NULL, but SourceSize is not zero.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is NULL, but DestinationSize is
> +                                    not zero on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Source is non-NULL, and (Source +
> +                                    SourceSize) would wrap around MAX_ADDRESS.
> +
> +  @retval RETURN_INVALID_PARAMETER  Destination is non-NULL, and (Destination +
> +                                    DestinationSize) would wrap around
> +                                    MAX_ADDRESS, as specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  None of Source and Destination are NULL,
> +                                    and CHAR8[SourceSize] at Source overlaps
> +                                    UINT8[DestinationSize] at Destination, as
> +                                    specified on input.
> +
> +  @retval RETURN_INVALID_PARAMETER  Invalid CHAR8 element encountered in
> +                                    Source.
> +**/
>  RETURN_STATUS
>  EFIAPI
>  Base64Decode (
> -  IN  CONST CHAR8  *Source,
> -  IN  UINTN        SourceLength,
> -  OUT UINT8        *Destination OPTIONAL,
> -  IN OUT UINTN     *DestinationSize
> +  IN     CONST CHAR8  *Source          OPTIONAL,
> +  IN     UINTN        SourceSize,
> +  OUT    UINT8        *Destination     OPTIONAL,
> +  IN OUT UINTN        *DestinationSize
>    )
>  {
> -
> -  UINT32  Value;
> -  CHAR8   Chr;
> -  INTN    BufferSize;
> -  UINTN   SourceIndex;
> -  UINTN   DestinationIndex;
> -  UINTN   Index;
> -  UINTN   ActualSourceLength;
> -
> -  //
> -  // Check pointers are not NULL
> -  //
> -  if ((Source == NULL) || (DestinationSize == NULL)) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  //
> -  // Check if SourceLength or DestinationSize is valid
> -  //
> -  if ((SourceLength >= (MAX_ADDRESS - (UINTN)Source)) || (*DestinationSize >= (MAX_ADDRESS - (UINTN)Destination))){
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  ActualSourceLength = 0;
> -  BufferSize = 0;
> -
> -  //
> -  // Determine the actual number of valid characters in the string.
> -  // All invalid characters except selected white space characters,
> -  // will cause the Base64 string to be rejected. White space to allow
> -  // properly formatted XML will be ignored.
> -  //
> -  // See section 3.3 of RFC 4648.
> -  //
> -  for (SourceIndex = 0; SourceIndex < SourceLength; SourceIndex++) {
> -
> -    //
> -    // '=' is part of the quantum
> -    //
> -    if (Source[SourceIndex] == '=') {
> -      ActualSourceLength++;
> -      BufferSize--;
> -
> -      //
> -      // Only two '=' characters can be valid.
> -      //
> -      if (BufferSize < -2) {
> -        return RETURN_INVALID_PARAMETER;
> -      }
> -    }
> -    else {
> -      Chr = Source[SourceIndex];
> -      if (BAD_V != DecodingTable[(UINT8) Chr]) {
> -
> -        //
> -        // The '=' characters are only valid at the end, so any
> -        // valid character after an '=', will be flagged as an error.
> -        //
> -        if (BufferSize < 0) {
> -          return RETURN_INVALID_PARAMETER;
> -        }
> -        ActualSourceLength++;
> -      }
> -      else {
> -
> -        //
> -        // The reset of the decoder will ignore all invalid characters allowed here.
> -        // Ignoring selected white space is useful. In this case, the decoder will
> -        // ignore ' ', '\t', '\n', and '\r'.
> -        //
> -        if ((Chr != ' ') &&(Chr != '\t') &&(Chr != '\n') &&(Chr != '\r')) {
> -          return RETURN_INVALID_PARAMETER;
> -        }
> -      }
> -    }
> -  }
> -
> -  //
> -  // The Base64 character string must be a multiple of 4 character quantums.
> -  //
> -  if (ActualSourceLength % 4 != 0) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  BufferSize += ActualSourceLength / 4 * 3;
> -  if (BufferSize < 0) {
> -    return RETURN_INVALID_PARAMETER;
> -  }
> -
> -  //
> -  // BufferSize is >= 0
> -  //
> -  if ((Destination == NULL) || (*DestinationSize < (UINTN) BufferSize)) {
> -    *DestinationSize = BufferSize;
> -    return RETURN_BUFFER_TOO_SMALL;
> -  }
> -
> -  //
> -  // If no decodable characters, return a size of zero. RFC 4686 test vector 1.
> -  //
> -  if (ActualSourceLength == 0) {
> -    *DestinationSize = 0;
> -    return RETURN_SUCCESS;
> -  }
> -
> -  //
> -  // Input data is verified to be a multiple of 4 valid charcters. Process four
> -  // characters at a time. Uncounted (ie. invalid) characters will be ignored.
> -  //
> -  for (SourceIndex = 0, DestinationIndex = 0; (SourceIndex < SourceLength) && (DestinationIndex < *DestinationSize); ) {
> -    Value = 0;
> -
> -    //
> -    // Get 24 bits of data from 4 input characters, each character representing 6 bits
> -    //
> -    for (Index = 0; Index < 4; Index++) {
> -      do {
> -        Chr = DecodingTable[(UINT8) Source[SourceIndex++]];
> -      } while (Chr == BAD_V);
> -      Value <<= 6;
> -      Value |= (UINT32)Chr;
> -    }
> -
> -    //
> -    // Store 3 bytes of binary data (24 bits)
> -    //
> -    *Destination++ = (UINT8) (Value >> 16);
> -    DestinationIndex++;
> -
> -    //
> -    // Due to the '=' special cases for the two bytes at the end,
> -    // we have to check the length and not store the padding data
> -    //
> -    if (DestinationIndex++ < *DestinationSize) {
> -      *Destination++ = (UINT8) (Value >> 8);
> -    }
> -    if (DestinationIndex++ < *DestinationSize) {
> -      *Destination++ = (UINT8) Value;
> -    }
> -  }
> -
> -  return RETURN_SUCCESS;
> +  ASSERT (FALSE);
> +  return RETURN_INVALID_PARAMETER;
>  }
>
>  /**
>
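
For readers skimming the re-specified contract quoted above, here is a minimal sketch in plain C of how a caller might use the size query (NULL Destination with a zero DestinationSize, then a second call with an allocated buffer). It is not the edk2 implementation: this patch only installs a stub, and the real decoder arrives later in the series. Every name below (DEMO_STATUS, DemoBase64Decode, DemoDecodeAlloc, SixBitValue, Emit) is hypothetical. The toy decoder assumes only what the comment block states (ignored whitespace, minimum padding enforced at the end, other characters rejected) and deliberately omits the MAX_ADDRESS wrap-around and overlap checks.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the RETURN_* status codes; not edk2 definitions. */
typedef enum {
  DEMO_SUCCESS, DEMO_BUFFER_TOO_SMALL, DEMO_INVALID_PARAMETER, DEMO_OUT_OF_RESOURCES
} DEMO_STATUS;

/* Map a character of the RFC 4648 "Table 1" alphabet to its 6-bit value,
   or return -1 for anything else. */
static int SixBitValue (char Chr)
{
  if (Chr >= 'A' && Chr <= 'Z') { return Chr - 'A'; }
  if (Chr >= 'a' && Chr <= 'z') { return Chr - 'a' + 26; }
  if (Chr >= '0' && Chr <= '9') { return Chr - '0' + 52; }
  if (Chr == '+') { return 62; }
  if (Chr == '/') { return 63; }
  return -1;
}

/* Count one decoded byte; store it only while it still fits the buffer. */
static void Emit (uint8_t *Dst, size_t Capacity, size_t *Needed, uint32_t Byte)
{
  if (*Needed < Capacity) {
    Dst[*Needed] = (uint8_t)Byte;
  }
  (*Needed)++;
}

/* Simplified decoder following the quoted contract. Omitted on purpose:
   the MAX_ADDRESS wrap-around checks and the Source/Destination overlap check. */
DEMO_STATUS DemoBase64Decode (const char *Source, size_t SourceSize,
                              uint8_t *Destination, size_t *DestinationSize)
{
  size_t   Capacity, Needed = 0, Index;
  uint32_t Accumulator = 0;
  unsigned Groups = 0, Padding = 0;
  int      Value;

  if (DestinationSize == NULL ||
      (Source == NULL && SourceSize != 0) ||
      (Destination == NULL && *DestinationSize != 0)) {
    return DEMO_INVALID_PARAMETER;
  }
  Capacity = *DestinationSize;

  for (Index = 0; Index < SourceSize; Index++) {
    char Chr = Source[Index];
    if (Chr == '\t' || Chr == '\n' || Chr == '\v' ||
        Chr == '\f' || Chr == '\r' || Chr == ' ') {
      continue;                              /* whitespace is ignored anywhere */
    }
    if (Chr == '=') {
      if (++Padding > 2) { return DEMO_INVALID_PARAMETER; }
      continue;                              /* exact count is checked after the loop */
    }
    Value = SixBitValue (Chr);
    if (Value < 0 || Padding != 0) {
      return DEMO_INVALID_PARAMETER;         /* bad character, or data after '=' */
    }
    Accumulator = (Accumulator << 6) | (uint32_t)Value;
    if (++Groups == 4) {                     /* full quantum: 24 bits, 3 bytes */
      Emit (Destination, Capacity, &Needed, Accumulator >> 16);
      Emit (Destination, Capacity, &Needed, Accumulator >> 8);
      Emit (Destination, Capacity, &Needed, Accumulator);
      Accumulator = 0;
      Groups = 0;
    }
  }

  /* Enforce exactly the minimum padding for a trailing partial quantum. */
  if (Groups == 2 && Padding == 2) {
    Emit (Destination, Capacity, &Needed, Accumulator >> 4);   /* 12 bits, 1 byte */
  } else if (Groups == 3 && Padding == 1) {
    Emit (Destination, Capacity, &Needed, Accumulator >> 10);  /* 18 bits, 2 bytes */
    Emit (Destination, Capacity, &Needed, Accumulator >> 2);
  } else if (Groups != 0 || Padding != 0) {
    return DEMO_INVALID_PARAMETER;
  }

  *DestinationSize = Needed;
  return (Needed <= Capacity) ? DEMO_SUCCESS : DEMO_BUFFER_TOO_SMALL;
}

/* Two-call pattern: query the size with a NULL buffer, then decode for real. */
DEMO_STATUS DemoDecodeAlloc (const char *Source, size_t SourceSize,
                             uint8_t **Buffer, size_t *BufferSize)
{
  DEMO_STATUS Status;

  *Buffer = NULL;
  *BufferSize = 0;
  Status = DemoBase64Decode (Source, SourceSize, NULL, BufferSize);
  if (Status != DEMO_BUFFER_TOO_SMALL) {
    return Status;                           /* zero-byte success, or an error */
  }
  *Buffer = malloc (*BufferSize);
  if (*Buffer == NULL) {
    return DEMO_OUT_OF_RESOURCES;
  }
  Status = DemoBase64Decode (Source, SourceSize, *Buffer, BufferSize);
  if (Status != DEMO_SUCCESS) {
    free (*Buffer);
    *Buffer = NULL;
  }
  return Status;
}
```

A caller would pass the encoded text and its length to DemoDecodeAlloc() and free() the returned buffer when done. Input that decodes to zero bytes returns success on the first call, so no allocation happens, which mirrors the RETURN_SUCCESS note in the comment block.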