Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Laszlo Ersek <lersek@redhat.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>,
	 "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Subject: Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
Date: Tue, 19 Jun 2018 08:37:06 +0200
Message-ID: <CAKv+Gu-vFVQHu+6PcOuoj13hPx9JomAejUB4rwC=cphS5Os8vA@mail.gmail.com>
In-Reply-To: <96210413-f241-63d1-3e96-a77757d3b0e4@redhat.com>

On 19 June 2018 at 00:51, Laszlo Ersek <lersek@redhat.com> wrote:
> On 06/18/18 23:57, Leif Lindholm wrote:
>> On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
>>> Clang complains about left shifting a negative value being undefined.
>>
>> As well it should.
>>
>>>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
>>>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
>>>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
>>>
>>> Redefine all bit pattern constants as unsigned to work around this.
>>
>> So, I'm totally OK with this and
>> Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
>> but ...
>> would it be worth fixing up BIT0-31 in Base.h and use those?
>
> If we started with the BITxx macros now, I'd agree. Given the current
> tree, I don't :) I've made an argument against the suggestion earlier:
>
>   UINT64 Value64;
>
>   Value64 = ~0x1;
>   Value64 = ~0x1U;
>
> The two assignments produce different results.
>
> In the first case, the integer constant 0x1 has type "int". Our C
> language implementation uses, for type "int", two's complement
> representation, 1 sign bit, 31 value bits, and no padding bits. So after
> the bitwise complement, we get the bit pattern 0xFFFF_FFFE, also with type
> int, and with value (-2). Taking UINT64 for "unsigned long int" or
> "unsigned long long int", after the conversion implied by the assignment
> we get ((UINT64_MAX + 1) + (-2)), aka (UINT64_MAX - 1).
>
> In the second case, the constant 0x1U has type "unsigned int". After the
> bitwise complement, we get (UINT32_MAX - 1), also of type "unsigned
> int". Taking UINT64 for "unsigned long int" or "unsigned long long int",
> after the conversion implied by assignment we get the exact same value,
> (UINT32_MAX - 1).
>
> In assembly parlance this is called "sign-extended" vs. "zero-extended".
> I dislike those terms when speaking about C; they are not necessary to
> explain what happens. (The direction is the opposite -- the compiler
> uses sign-extension / zero-extension, if the ISA supports those, for
> implementing the C semantics.)
>
> So, I don't recommend changing BIT0 through BIT30 in Base.h, unless we'd
> like to audit all their uses :)
>

To be honest, I can't say that I had all of this on my radar, but I
agree that redefining the BITn macros may create more problems than it
solves.

Thanks for the elaborate write-up.
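
For reference, here is a minimal C sketch of the two assignments quoted
above. It assumes a 32-bit "int" and uses uint64_t from <stdint.h> as a
stand-in for UINT64; the function names are hypothetical and exist only
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: int and unsigned int are 32 bits wide, as in the quoted text. */

/* ~0x1 has type int and value -2; converting -2 to uint64_t wraps modulo 2^64. */
uint64_t complement_signed(void)
{
    return ~0x1;
}

/* ~0x1U has type unsigned int and value UINT32_MAX - 1; the conversion keeps it. */
uint64_t complement_unsigned(void)
{
    return ~0x1U;
}

/* Checks the two results against the values given in the quoted explanation. */
void check_complements(void)
{
    assert(complement_signed() == UINT64_MAX - 1);
    assert(complement_unsigned() == (uint64_t)UINT32_MAX - 1);
}
```

Only the suffix differs between the two functions, yet the upper 32 bits
of the result are all ones in the first case and all zeros in the second.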

> Note: "through BIT30" is not a typo above. BIT31 already has type
> "unsigned int". That's because of how the "type ladder" for integer
> constants works in C. It's easy to look up in the standard (or, well, in
> the final draft), but here's the mental model I like to use for it:
>
> - The ladder starts with "int", and works towards integer types with
>   higher conversion ranks, until the constant fits.
>
> - Normally only signed types are considered; however, when using the 0x
>   (hex) or 0 (octal) prefixes, we add unsigned types to the ladder. Each
>   of those will be considered right after the corresponding signed
>   integer type, with equal conversion rank. The lesson here is that the
>   0x and 0 prefixes *extend* the set of candidate types.
>
> - The suffix "u" (or equivalently "U") *restricts* the ladder to
>   unsigned types, however. (Regardless of prefix.)
>
> - The suffixes "l" and "ll" (or equivalently, "L" and "LL", resp.) don't
>   affect signedness, instead they affect how high we set our foot on the
>   ladder at first. And, we climb up from there.
>
> Given our "signed int" and "unsigned int" representations (see above),
> BIT30 (0x40000000) fits in "int", so it gets the type "int". However,
> BIT31 (0x80000000) does not fit in "int". Because we use the 0x prefix
> with it, it gets the type "unsigned int", because there it fits. Because
> BIT31 already gets type "unsigned int", we could append the "u" suffix
> to BIT31 (and BIT31 only), without any change in behavior.
>
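
The ladder can be observed directly with C11 _Generic. This is only a
sketch: it assumes a C11 compiler and a 32-bit "int", and the TYPE_NAME
macro and the function names are hypothetical helpers, not anything from
edk2.

```c
#include <assert.h>
#include <string.h>

/* Maps the type of an expression to a string; only the ladder types are listed. */
#define TYPE_NAME(x) _Generic((x),            \
    int:                "int",                \
    unsigned int:       "unsigned int",       \
    long:               "long",               \
    unsigned long:      "unsigned long",      \
    long long:          "long long",          \
    unsigned long long: "unsigned long long")

/* Hex constant that fits in int: stays int. */
const char *type_of_bit30(void)      { return TYPE_NAME(0x40000000); }
/* Hex constant that does not fit in int: the next rung is unsigned int. */
const char *type_of_bit31(void)      { return TYPE_NAME(0x80000000); }
/* The U suffix restricts the ladder to unsigned types. */
const char *type_of_bit30_u(void)    { return TYPE_NAME(0x40000000U); }
/* Same value as BIT31 in decimal: unsigned types are skipped, so this is
   long or long long depending on the platform (C99 and later). */
const char *type_of_decimal_2g(void) { return TYPE_NAME(2147483648); }
/* The L suffix starts the climb at long. */
const char *type_of_one_l(void)      { return TYPE_NAME(0x1L); }

/* Checks the types claimed in the comments above. */
void check_ladder(void)
{
    assert(strcmp(type_of_bit30(), "int") == 0);
    assert(strcmp(type_of_bit31(), "unsigned int") == 0);
    assert(strcmp(type_of_bit30_u(), "unsigned int") == 0);
    assert(strcmp(type_of_one_l(), "long") == 0);
}
```
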
> This also means that you already get very different results for the
> following two assignments:
>
>   Value64 = ~BIT30;
>   Value64 = ~BIT31;
>
> Now, in an ideal world:
> - all BIT0..BIT31 macros would carry the U suffix,
> - we'd *never* apply bitwise complement to signed integers (even though
>   the result of that is implementation-defined, not undefined or
>   unspecified),
> - we'd write all expressions similar to the above as
>
>   Value64 = ~(UINT64)BIT30;
>   Value64 = ~(UINT64)BIT31;
>
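
A sketch of the four assignments, for comparison. It assumes a 32-bit
"int", defines BIT30 and BIT31 locally without a suffix as the quoted
text describes them, and uses uint64_t in place of UINT64; the function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local, unsuffixed definitions matching the values given in the quoted text. */
#define BIT30 0x40000000
#define BIT31 0x80000000

/* BIT30 is int: the complement is negative and wraps modulo 2^64 on conversion. */
uint64_t not_bit30(void)      { return ~BIT30; }
/* BIT31 is unsigned int: the complement is 0x7FFFFFFF and converts unchanged. */
uint64_t not_bit31(void)      { return ~BIT31; }
/* Widening first makes the complement operate on all 64 bits in both cases. */
uint64_t not_bit30_wide(void) { return ~(uint64_t)BIT30; }
uint64_t not_bit31_wide(void) { return ~(uint64_t)BIT31; }

/* Checks the values each variant produces under the stated assumptions. */
void check_bit_complements(void)
{
    assert(not_bit30()      == 0xFFFFFFFFBFFFFFFFULL);
    assert(not_bit31()      == 0x000000007FFFFFFFULL);
    assert(not_bit30_wide() == 0xFFFFFFFFBFFFFFFFULL);
    assert(not_bit31_wide() == 0xFFFFFFFF7FFFFFFFULL);
}
```

With the cast, both results have exactly one bit cleared; without it,
only the BIT30 case happens to match, and it does so by way of a negative
intermediate value.
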
> I don't think we can audit all such uses now, however.
>
> The present patch differs because it's -- probably -- not hard to review
> all uses of the macros being modified here.
>
> Thanks
> Laszlo
>
>>
>> /
>>     Leif
>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>> ---
>>>  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>>> index 069d87ca780d..7931d1ac4e2b 100644
>>> --- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>>> +++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>>> @@ -40,11 +40,11 @@
>>>  //---------------------------------------------
>>>  // UART Register Bit Defines
>>>  //---------------------------------------------
>>> -#define LSR_TXRDY               0x20
>>> -#define LSR_RXDA                0x01
>>> -#define DLAB                    0x01
>>> -#define ENABLE_FIFO             0x01
>>> -#define CLEAR_FIFOS             0x06
>>> +#define LSR_TXRDY               0x20U
>>> +#define LSR_RXDA                0x01U
>>> +#define DLAB                    0x01U
>>> +#define ENABLE_FIFO             0x01U
>>> +#define CLEAR_FIFOS             0x06U
>>>
>>>
>>>
>>> --
>>> 2.17.1
>>>
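
Coming back to the expression Clang flagged: with DLAB defined as 0x01U,
~DLAB is an "unsigned int" and the left shift is well-defined. Below is a
minimal sketch of that expression, assuming a 32-bit "int". The helper
name, the parameter types, and the use of uint8_t in place of UINT8 are
assumptions for illustration, not the actual GdbSerialLib code.

```c
#include <assert.h>
#include <stdint.h>

#define DLAB 0x01U  /* unsigned, as in the patch */

/* Hypothetical helper mirroring the flagged expression. */
uint8_t line_control(uint8_t break_set, uint8_t parity,
                     uint8_t stop_bits, uint8_t data)
{
    /* ~DLAB is 0xFFFFFFFE as unsigned int; shifting it left by 7 is
       computed modulo 2^32 and leaves bit 7 clear after truncation. */
    return (uint8_t)((~DLAB << 7) |
                     ((break_set << 6) |
                      ((parity << 3) |
                       ((stop_bits << 2) | data))));
}

/* Checks two sample inputs; the values are arbitrary illustrations. */
void check_line_control(void)
{
    assert(line_control(0, 0, 0, 3) == 0x03);
    assert(line_control(1, 0, 1, 3) == 0x47);
}
```

With the original signed definition, ~DLAB would be the "int" value -2,
and left-shifting a negative value is what the warning is about.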