[PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity

public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed

* [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
@ 2018-06-18 20:49 Ard Biesheuvel
  2018-06-18 21:57 ` Leif Lindholm
  0 siblings, 1 reply; 8+ messages in thread
From: Ard Biesheuvel @ 2018-06-18 20:49 UTC (permalink / raw)
  To: edk2-devel; +Cc: leif.lindholm, Ard Biesheuvel

Clang complains about left shifting a negative value being undefined.

  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
  error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
  OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));

Redefine all bit pattern constants as unsigned to work around this.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
index 069d87ca780d..7931d1ac4e2b 100644
--- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
+++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
@@ -40,11 +40,11 @@
 //---------------------------------------------
 // UART Register Bit Defines
 //---------------------------------------------
-#define LSR_TXRDY               0x20
-#define LSR_RXDA                0x01
-#define DLAB                    0x01
-#define ENABLE_FIFO             0x01
-#define CLEAR_FIFOS             0x06
+#define LSR_TXRDY               0x20U
+#define LSR_RXDA                0x01U
+#define DLAB                    0x01U
+#define ENABLE_FIFO             0x01U
+#define CLEAR_FIFOS             0x06U
 
 
 
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-18 20:49 [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity Ard Biesheuvel
@ 2018-06-18 21:57 ` Leif Lindholm
  2018-06-18 22:07   ` Ard Biesheuvel
  2018-06-18 22:51   ` Laszlo Ersek
  0 siblings, 2 replies; 8+ messages in thread
From: Leif Lindholm @ 2018-06-18 21:57 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: edk2-devel

On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
> Clang complains about left shifting a negative value being undefined.

As well it should.

>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
> 
> Redefine all bit pattern constants as unsigned to work around this.

So, I'm totally OK with this and
Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
but ...
would it be worth fixing up BIT0-31 in Base.h and use those?

/
    Leif

> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
> index 069d87ca780d..7931d1ac4e2b 100644
> --- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
> +++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
> @@ -40,11 +40,11 @@
>  //---------------------------------------------
>  // UART Register Bit Defines
>  //---------------------------------------------
> -#define LSR_TXRDY               0x20
> -#define LSR_RXDA                0x01
> -#define DLAB                    0x01
> -#define ENABLE_FIFO             0x01
> -#define CLEAR_FIFOS             0x06
> +#define LSR_TXRDY               0x20U
> +#define LSR_RXDA                0x01U
> +#define DLAB                    0x01U
> +#define ENABLE_FIFO             0x01U
> +#define CLEAR_FIFOS             0x06U
>  
>  
>  
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-18 21:57 ` Leif Lindholm
@ 2018-06-18 22:07   ` Ard Biesheuvel
  2018-06-18 22:51   ` Laszlo Ersek
  1 sibling, 0 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2018-06-18 22:07 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: edk2-devel@lists.01.org

On 18 June 2018 at 23:57, Leif Lindholm <leif.lindholm@linaro.org> wrote:
> On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
>> Clang complains about left shifting a negative value being undefined.
>
> As well it should.
>
>>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
>>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
>>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
>>
>> Redefine all bit pattern constants as unsigned to work around this.
>
> So, I'm totally OK with this and
> Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>

Thanks. Pushed as 02ec23abebcb271a16fae94f8e3659bb9282880d

> but ...
> would it be worth fixing up BIT0-31 in Base.h and use those?
>

Going forward, I think it makes sense for the BITn constants to be
typed as unsigned explicitly. I'm not sure what kind of toolchain
pathologies we may hit, though, so I'd rather not depend on that.

>> Contributed-under: TianoCore Contribution Agreement 1.1
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>> index 069d87ca780d..7931d1ac4e2b 100644
>> --- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>> +++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>> @@ -40,11 +40,11 @@
>>  //---------------------------------------------
>>  // UART Register Bit Defines
>>  //---------------------------------------------
>> -#define LSR_TXRDY               0x20
>> -#define LSR_RXDA                0x01
>> -#define DLAB                    0x01
>> -#define ENABLE_FIFO             0x01
>> -#define CLEAR_FIFOS             0x06
>> +#define LSR_TXRDY               0x20U
>> +#define LSR_RXDA                0x01U
>> +#define DLAB                    0x01U
>> +#define ENABLE_FIFO             0x01U
>> +#define CLEAR_FIFOS             0x06U
>>
>>
>>
>> --
>> 2.17.1
>>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-18 21:57 ` Leif Lindholm
  2018-06-18 22:07   ` Ard Biesheuvel
@ 2018-06-18 22:51   ` Laszlo Ersek
  2018-06-19  6:37     ` Ard Biesheuvel
  2018-06-19  9:54     ` Leif Lindholm
  1 sibling, 2 replies; 8+ messages in thread
From: Laszlo Ersek @ 2018-06-18 22:51 UTC (permalink / raw)
  To: Leif Lindholm, Ard Biesheuvel; +Cc: edk2-devel

On 06/18/18 23:57, Leif Lindholm wrote:
> On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
>> Clang complains about left shifting a negative value being undefined.
> 
> As well it should.
> 
>>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
>>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
>>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
>>
>> Redefine all bit pattern constants as unsigned to work around this.
> 
> So, I'm totally OK with this and
> Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
> but ...
> would it be worth fixing up BIT0-31 in Base.h and use those?

If we started with the BITxx macros now, I'd agree. Given the current
tree, I don't :) I've made an argument against the suggestion earlier:

  UINT64 Value64;

  Value64 = ~0x1;
  Value64 = ~0x1U;

The two assignments produce different results.

In the first case, the integer constant 0x1 has type "int". Our C
language implementation uses, for type "int", two's complement
representation, 1 sign bit, 31 value bits, and no padding bits. So after
the bitwise complement, we get 0x1111_1111_1111_1110, also with type
int, and with value (-2). Taking UINT64 for "unsigned long int" or
"unsigned long long int", after the conversion implied by the assignment
we get ((UINT64_MAX + 1) + (-2)), aka (UINT64_MAX - 1).

In the second case, the constant 0x1U has type "unsigned int". After the
bitwise complement, we get (UINT32_MAX - 1), also of type "unsigned
int". Taking UINT64 for "unsigned long int" or "unsigned long long int",
after the conversion implied by assignment we get the exact same value,
(UINT32_MAX - 1).

In assembly parlance this is called "sign-extended" vs. "zero-extended".
I dislike those terms when speaking about C; they are not necessary to
explain what happens. (The direction is the opposite -- the compiler
uses sign-extension / zero-extension, if the ISA supports those, for
implementing the C semantics.)

So, I don't recommend changing BIT0 through BIT30 in Base.h, unless we'd
like to audit all their uses :)

Note: "through BIT30" is not a typo above. BIT31 already has type
"unsigned int". That's because of how the "type ladder" for integer
constants works in C. It's easy to look up in the standard (or, well, in
the final draft), but here's the mental model I like to use for it:

- The ladder starts with "int", and works towards integer types with
  higher conversion ranks, until the constant fits.

- Normally only signed types are considered; however, when using the 0x
  (hex) or 0 (octal) prefixes, we add unsigned types to the ladder. Each
  of those will be considered right after the corresponding signed
  integer type, with equal conversion rank. The lesson here is that the
  0x and 0 prefixes *extend* the set of candidate types.

- The suffix "u" (or equivalently "U") *restricts* the ladder to
  unsigned types, however. (Regardless of prefix.)

- The suffixes "l" and "ll" (or equivalently, "L" and "LL", resp.) don't
  affect signedness, instead they affect how high we set our foot on the
  ladder at first. And, we climb up from there.

Given our "signed int" and "unsigned int" representations (see above),
BIT30 (0x40000000) fits in "int", so it gets the type "int". However,
BIT31 (0x80000000) does not fit in "int". Because we use the 0x prefix
with it, it gets the type "unsigned int", because there it fits. Because
BIT31 already gets type "unsigned int", we could append the "u" suffix
to BIT31 (and BIT31 only), without any change in behavior.

This also means that you already get very different results for the
following two assignments:

  Value64 = ~BIT30;
  Value64 = ~BIT31;

Now, in an ideal world:
- all BIT0..BIT31 macros would carry the U suffix,
- we'd *never* apply bitwise complement to signed integers (even though
  the result of that is implementation-defined, not undefined or
  unspecified),
- we'd write all expressions similar to the above as

  Value64 = ~(UINT64)BIT30;
  Value64 = ~(UINT64)BIT31;

I don't think we can audit all such uses now, however.

The present patch differs because it's -- probably -- not hard to review
all uses of the macros being modified here.

Thanks
Laszlo

> 
> /
>     Leif
> 
>> Contributed-under: TianoCore Contribution Agreement 1.1
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>> index 069d87ca780d..7931d1ac4e2b 100644
>> --- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>> +++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>> @@ -40,11 +40,11 @@
>>  //---------------------------------------------
>>  // UART Register Bit Defines
>>  //---------------------------------------------
>> -#define LSR_TXRDY               0x20
>> -#define LSR_RXDA                0x01
>> -#define DLAB                    0x01
>> -#define ENABLE_FIFO             0x01
>> -#define CLEAR_FIFOS             0x06
>> +#define LSR_TXRDY               0x20U
>> +#define LSR_RXDA                0x01U
>> +#define DLAB                    0x01U
>> +#define ENABLE_FIFO             0x01U
>> +#define CLEAR_FIFOS             0x06U
>>  
>>  
>>  
>> -- 
>> 2.17.1
>>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-18 22:51   ` Laszlo Ersek
@ 2018-06-19  6:37     ` Ard Biesheuvel
  2018-06-19  9:54     ` Leif Lindholm
  1 sibling, 0 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2018-06-19  6:37 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Leif Lindholm, edk2-devel@lists.01.org

On 19 June 2018 at 00:51, Laszlo Ersek <lersek@redhat.com> wrote:
> On 06/18/18 23:57, Leif Lindholm wrote:
>> On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
>>> Clang complains about left shifting a negative value being undefined.
>>
>> As well it should.
>>
>>>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
>>>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
>>>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
>>>
>>> Redefine all bit pattern constants as unsigned to work around this.
>>
>> So, I'm totally OK with this and
>> Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
>> but ...
>> would it be worth fixing up BIT0-31 in Base.h and use those?
>
> If we started with the BITxx macros now, I'd agree. Given the current
> tree, I don't :) I've made an argument against the suggestion earlier:
>
>   UINT64 Value64;
>
>   Value64 = ~0x1;
>   Value64 = ~0x1U;
>
> The two assignments produce different results.
>
> In the first case, the integer constant 0x1 has type "int". Our C
> language implementation uses, for type "int", two's complement
> representation, 1 sign bit, 31 value bits, and no padding bits. So after
> the bitwise complement, we get 0x1111_1111_1111_1110, also with type
> int, and with value (-2). Taking UINT64 for "unsigned long int" or
> "unsigned long long int", after the conversion implied by the assignment
> we get ((UINT64_MAX + 1) + (-2)), aka (UINT64_MAX - 1).
>
> In the second case, the constant 0x1U has type "unsigned int". After the
> bitwise complement, we get (UINT32_MAX - 1), also of type "unsigned
> int". Taking UINT64 for "unsigned long int" or "unsigned long long int",
> after the conversion implied by assignment we get the exact same value,
> (UINT32_MAX - 1).
>
> In assembly parlance this is called "sign-extended" vs. "zero-extended".
> I dislike those terms when speaking about C; they are not necessary to
> explain what happens. (The direction is the opposite -- the compiler
> uses sign-extension / zero-extension, if the ISA supports those, for
> implementing the C semantics.)
>
> So, I don't recommend changing BIT0 through BIT30 in Base.h, unless we'd
> like to audit all their uses :)
>

To be honest, I can't say that I had all of this on my radar, but I
agree that redefining the BITn macros may create more problems than it
solves.

Thanks for the elaborate write up.

> Note: "through BIT30" is not a typo above. BIT31 already has type
> "unsigned int". That's because of how the "type ladder" for integer
> constants works in C. It's easy to look up in the standard (or, well, in
> the final draft), but here's the mental model I like to use for it:
>
> - The ladder starts with "int", and works towards integer types with
>   higher conversion ranks, until the constant fits.
>
> - Normally only signed types are considered; however, when using the 0x
>   (hex) or 0 (octal) prefixes, we add unsigned types to the ladder. Each
>   of those will be considered right after the corresponding signed
>   integer type, with equal conversion rank. The lesson here is that the
>   0x and 0 prefixes *extend* the set of candidate types.
>
> - The suffix "u" (or equivalently "U") *restricts* the ladder to
>   unsigned types, however. (Regardless of prefix.)
>
> - The suffixes "l" and "ll" (or equivalently, "L" and "LL", resp.) don't
>   affect signedness, instead they affect how high we set our foot on the
>   ladder at first. And, we climb up from there.
>
> Given our "signed int" and "unsigned int" representations (see above),
> BIT30 (0x40000000) fits in "int", so it gets the type "int". However,
> BIT31 (0x80000000) does not fit in "int". Because we use the 0x prefix
> with it, it gets the type "unsigned int", because there it fits. Because
> BIT31 already gets type "unsigned int", we could append the "u" suffix
> to BIT31 (and BIT31 only), without any change in behavior.
>
> This also means that you already get very different results for the
> following two assignments:
>
>   Value64 = ~BIT30;
>   Value64 = ~BIT31;
>
> Now, in an ideal world:
> - all BIT0..BIT31 macros would carry the U suffix,
> - we'd *never* apply bitwise complement to signed integers (even though
>   the result of that is implementation-defined, not undefined or
>   unspecified),
> - we'd write all expressions similar to the above as
>
>   Value64 = ~(UINT64)BIT30;
>   Value64 = ~(UINT64)BIT31;
>
> I don't think we can audit all such uses now, however.
>
> The present patch differs because it's -- probably -- not hard to review
> all uses of the macros being modified here.
>
> Thanks
> Laszlo
>
>>
>> /
>>     Leif
>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>> ---
>>>  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>>> index 069d87ca780d..7931d1ac4e2b 100644
>>> --- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>>> +++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
>>> @@ -40,11 +40,11 @@
>>>  //---------------------------------------------
>>>  // UART Register Bit Defines
>>>  //---------------------------------------------
>>> -#define LSR_TXRDY               0x20
>>> -#define LSR_RXDA                0x01
>>> -#define DLAB                    0x01
>>> -#define ENABLE_FIFO             0x01
>>> -#define CLEAR_FIFOS             0x06
>>> +#define LSR_TXRDY               0x20U
>>> +#define LSR_RXDA                0x01U
>>> +#define DLAB                    0x01U
>>> +#define ENABLE_FIFO             0x01U
>>> +#define CLEAR_FIFOS             0x06U
>>>
>>>
>>>
>>> --
>>> 2.17.1
>>>
>> _______________________________________________
>> edk2-devel mailing list
>> edk2-devel@lists.01.org
>> https://lists.01.org/mailman/listinfo/edk2-devel
>>
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-18 22:51   ` Laszlo Ersek
  2018-06-19  6:37     ` Ard Biesheuvel
@ 2018-06-19  9:54     ` Leif Lindholm
  2018-06-19 12:51       ` Laszlo Ersek
  1 sibling, 1 reply; 8+ messages in thread
From: Leif Lindholm @ 2018-06-19  9:54 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Ard Biesheuvel, edk2-devel

On Tue, Jun 19, 2018 at 12:51:49AM +0200, Laszlo Ersek wrote:
> On 06/18/18 23:57, Leif Lindholm wrote:
> > On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
> >> Clang complains about left shifting a negative value being undefined.
> > 
> > As well it should.
> > 
> >>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
> >>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
> >>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
> >>
> >> Redefine all bit pattern constants as unsigned to work around this.
> > 
> > So, I'm totally OK with this and
> > Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
> > but ...
> > would it be worth fixing up BIT0-31 in Base.h and use those?
> 
> If we started with the BITxx macros now, I'd agree. Given the current
> tree, I don't :) I've made an argument against the suggestion earlier:
> 
>   UINT64 Value64;
> 
>   Value64 = ~0x1;
>   Value64 = ~0x1U;
> 
> The two assignments produce different results.
> 
> In the first case, the integer constant 0x1 has type "int". Our C
> language implementation uses, for type "int", two's complement
> representation, 1 sign bit, 31 value bits, and no padding bits. So after
> the bitwise complement, we get 0x1111_1111_1111_1110, also with type
> int, and with value (-2). Taking UINT64 for "unsigned long int" or
> "unsigned long long int", after the conversion implied by the assignment
> we get ((UINT64_MAX + 1) + (-2)), aka (UINT64_MAX - 1).

Err, if by UINT64_MAX + 1 you mean 0, as a way of introducing the
second part, I'm with you. Otherwise I think I need to go back to
school.

> In the second case, the constant 0x1U has type "unsigned int". After the
> bitwise complement, we get (UINT32_MAX - 1), also of type "unsigned
> int". Taking UINT64 for "unsigned long int" or "unsigned long long int",
> after the conversion implied by assignment we get the exact same value,
> (UINT32_MAX - 1).

But these examples make it look like the differences are an
unobservable internal side effect. That's not the case.

> In assembly parlance this is called "sign-extended" vs. "zero-extended".
> I dislike those terms when speaking about C; they are not necessary to
> explain what happens. (The direction is the opposite -- the compiler
> uses sign-extension / zero-extension, if the ISA supports those, for
> implementing the C semantics.)

I always saw it as the C language emulates a machine that supports
sign-/zero-extension.

> So, I don't recommend changing BIT0 through BIT30 in Base.h, unless we'd
> like to audit all their uses :)

I would actually still argue for this if we expected the vast majority
of users to be in the open source trees. I don't think that's the case.

> Note: "through BIT30" is not a typo above. BIT31 already has type
> "unsigned int". That's because of how the "type ladder" for integer
> constants works in C. It's easy to look up in the standard (or, well, in
> the final draft), but here's the mental model I like to use for it:
> 
> - The ladder starts with "int", and works towards integer types with
>   higher conversion ranks, until the constant fits.
> 
> - Normally only signed types are considered; however, when using the 0x
>   (hex) or 0 (octal) prefixes, we add unsigned types to the ladder. Each
>   of those will be considered right after the corresponding signed
>   integer type, with equal conversion rank. The lesson here is that the
>   0x and 0 prefixes *extend* the set of candidate types.
> 
> - The suffix "u" (or equivalently "U") *restricts* the ladder to
>   unsigned types, however. (Regardless of prefix.)
> 
> - The suffixes "l" and "ll" (or equivalently, "L" and "LL", resp.) don't
>   affect signedness, instead they affect how high we set our foot on the
>   ladder at first. And, we climb up from there.

Yeah, this bit I did know, but it's arcane enough (and _really_ shows
the origins of C) that I mainly actively try to avoid it by always
being explicit where it matters.
Certainly the most fun I tend to come across with it is with enums.

> Given our "signed int" and "unsigned int" representations (see above),
> BIT30 (0x40000000) fits in "int", so it gets the type "int". However,
> BIT31 (0x80000000) does not fit in "int". Because we use the 0x prefix
> with it, it gets the type "unsigned int", because there it fits. Because
> BIT31 already gets type "unsigned int", we could append the "u" suffix
> to BIT31 (and BIT31 only), without any change in behavior.
> 
> This also means that you already get very different results for the
> following two assignments:
> 
>   Value64 = ~BIT30;
>   Value64 = ~BIT31;
> 
> Now, in an ideal world:
> - all BIT0..BIT31 macros would carry the U suffix,
> - we'd *never* apply bitwise complement to signed integers (even though
>   the result of that is implementation-defined, not undefined or
>   unspecified),

That would certainly be desirable to implement anyway.
I'll have a look and see what -Weverything does with that.
It should certainly be purged from all Tianocore trees.

> - we'd write all expressions similar to the above as
> 
>   Value64 = ~(UINT64)BIT30;
>   Value64 = ~(UINT64)BIT31;
> 
> I don't think we can audit all such uses now, however.
> 
> The present patch differs because it's -- probably -- not hard to review
> all uses of the macros being modified here.

Really not.

Right, so thanks for the persuasive and educational argument.
I now agree we shouldn't change bits 0-31 (including 31 - what would
be the point?).

I shall commence scheming on replacing them instead.

Regards,

Leif

> Thanks
> Laszlo
> 
> > 
> > /
> >     Leif
> > 
> >> Contributed-under: TianoCore Contribution Agreement 1.1
> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >> ---
> >>  EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c | 10 +++++-----
> >>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
> >> index 069d87ca780d..7931d1ac4e2b 100644
> >> --- a/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
> >> +++ b/EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c
> >> @@ -40,11 +40,11 @@
> >>  //---------------------------------------------
> >>  // UART Register Bit Defines
> >>  //---------------------------------------------
> >> -#define LSR_TXRDY               0x20
> >> -#define LSR_RXDA                0x01
> >> -#define DLAB                    0x01
> >> -#define ENABLE_FIFO             0x01
> >> -#define CLEAR_FIFOS             0x06
> >> +#define LSR_TXRDY               0x20U
> >> +#define LSR_RXDA                0x01U
> >> +#define DLAB                    0x01U
> >> +#define ENABLE_FIFO             0x01U
> >> +#define CLEAR_FIFOS             0x06U
> >>  
> >>  
> >>  
> >> -- 
> >> 2.17.1
> >>
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel
> > 
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-19  9:54     ` Leif Lindholm
@ 2018-06-19 12:51       ` Laszlo Ersek
  2018-06-19 13:06         ` Leif Lindholm
  0 siblings, 1 reply; 8+ messages in thread
From: Laszlo Ersek @ 2018-06-19 12:51 UTC (permalink / raw)
  To: Leif Lindholm; +Cc: Ard Biesheuvel, edk2-devel

On 06/19/18 11:54, Leif Lindholm wrote:
> On Tue, Jun 19, 2018 at 12:51:49AM +0200, Laszlo Ersek wrote:
>> On 06/18/18 23:57, Leif Lindholm wrote:
>>> On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
>>>> Clang complains about left shifting a negative value being undefined.
>>>
>>> As well it should.
>>>
>>>>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
>>>>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
>>>>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
>>>>
>>>> Redefine all bit pattern constants as unsigned to work around this.
>>>
>>> So, I'm totally OK with this and
>>> Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
>>> but ...
>>> would it be worth fixing up BIT0-31 in Base.h and use those?
>>
>> If we started with the BITxx macros now, I'd agree. Given the current
>> tree, I don't :) I've made an argument against the suggestion earlier:
>>
>>   UINT64 Value64;
>>
>>   Value64 = ~0x1;
>>   Value64 = ~0x1U;
>>
>> The two assignments produce different results.
>>
>> In the first case, the integer constant 0x1 has type "int". Our C
>> language implementation uses, for type "int", two's complement
>> representation, 1 sign bit, 31 value bits, and no padding bits. So after
>> the bitwise complement, we get 0x1111_1111_1111_1110, also with type
>> int, and with value (-2). Taking UINT64 for "unsigned long int" or
>> "unsigned long long int", after the conversion implied by the assignment
>> we get ((UINT64_MAX + 1) + (-2)), aka (UINT64_MAX - 1).
> 
> Err, if by UINT64_MAX + 1 you mean 0,

No, I don't.

> as a way of introducing the
> second part, I'm with you. Otherwise I think I need to go back to
> school.

I really meant (UINT64_MAX + 1) as the mathematical integer
18,446,744,073,709,551,616; without the modular wrap-around that occurs
in C. The modular wrap-around of C is defined in terms of the
mathematical values. This is how the ISO C99 standard defines the behavior:

------
6.3.1.3 Signed and unsigned integers

1 When a value with integer type is converted to another integer type
  other than _Bool, if the value can be represented by the new type, it
  is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by
  repeatedly adding or subtracting one more than the maximum value that
  can be represented in the new type until the value is in the range of
  the new type. [footnote 49]

3 Otherwise, the new type is signed and the value cannot be represented
  in it; either the result is implementation-defined or an
  implementation-defined signal is raised.

[footnote 49] The rules describe arithmetic on the mathematical value,
not the value of a given type of expression.
------

Therefore we really have (18,446,744,073,709,551,616 + (-2)) in the
above reasoning.

> 
>> In the second case, the constant 0x1U has type "unsigned int". After the
>> bitwise complement, we get (UINT32_MAX - 1), also of type "unsigned
>> int". Taking UINT64 for "unsigned long int" or "unsigned long long int",
>> after the conversion implied by assignment we get the exact same value,
>> (UINT32_MAX - 1).
> 
> But these examples make it look like the differences are an
> unobservable internal side effect. That's not the case.

Hm, sorry, I don't understand. My point was that the difference was
extremely observable, because in the first assignment, we end up with
value 18,446,744,073,709,551,614; in the second, we end up with
4,294,967,294.

Perhaps my expression "exact same value" was misleading. There I
referred to 6.3.1.3p1, which I've now quoted above as well -- I meant
that converting 4,294,967,294 from "unsigned int" to either of "unsigned
long int" or "unsigned long long int" didn't change the value.

Thanks!
Laszlo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity
  2018-06-19 12:51       ` Laszlo Ersek
@ 2018-06-19 13:06         ` Leif Lindholm
  0 siblings, 0 replies; 8+ messages in thread
From: Leif Lindholm @ 2018-06-19 13:06 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: Ard Biesheuvel, edk2-devel

On Tue, Jun 19, 2018 at 02:51:41PM +0200, Laszlo Ersek wrote:
> On 06/19/18 11:54, Leif Lindholm wrote:
> > On Tue, Jun 19, 2018 at 12:51:49AM +0200, Laszlo Ersek wrote:
> >> On 06/18/18 23:57, Leif Lindholm wrote:
> >>> On Mon, Jun 18, 2018 at 10:49:18PM +0200, Ard Biesheuvel wrote:
> >>>> Clang complains about left shifting a negative value being undefined.
> >>>
> >>> As well it should.
> >>>
> >>>>   EmbeddedPkg/Library/GdbSerialLib/GdbSerialLib.c:151:30:
> >>>>   error: shifting a negative signed value is undefined [-Werror,-Wshift-negative-value]
> >>>>   OutputData = (UINT8)((~DLAB<<7)|((BreakSet<<6)|((Parity<<3)|((StopBits<<2)| Data))));
> >>>>
> >>>> Redefine all bit pattern constants as unsigned to work around this.
> >>>
> >>> So, I'm totally OK with this and
> >>> Reviewed-by: Leif Lindholm <leif.lindholm@linaro.org>
> >>> but ...
> >>> would it be worth fixing up BIT0-31 in Base.h and use those?
> >>
> >> If we started with the BITxx macros now, I'd agree. Given the current
> >> tree, I don't :) I've made an argument against the suggestion earlier:
> >>
> >>   UINT64 Value64;
> >>
> >>   Value64 = ~0x1;
> >>   Value64 = ~0x1U;
> >>
> >> The two assignments produce different results.
> >>
> >> In the first case, the integer constant 0x1 has type "int". Our C
> >> language implementation uses, for type "int", two's complement
> >> representation, 1 sign bit, 31 value bits, and no padding bits. So after
> >> the bitwise complement, we get 0x1111_1111_1111_1110, also with type
> >> int, and with value (-2). Taking UINT64 for "unsigned long int" or
> >> "unsigned long long int", after the conversion implied by the assignment
> >> we get ((UINT64_MAX + 1) + (-2)), aka (UINT64_MAX - 1).
> > 
> > Err, if by UINT64_MAX + 1 you mean 0,
> 
> No, I don't.
> 
> > as a way of introducing the
> > second part, I'm with you. Otherwise I think I need to go back to
> > school.
> 
> I really meant (UINT64_MAX + 1) as the mathematical integer
> 18,446,744,073,709,551,616; without the modular wrap-around that occurs
> in C. The modular wrap-around of C is defined in terms of the
> mathematical values. This is how the ISO C99 standard defines the behavior:
> 
> ------
> 6.3.1.3 Signed and unsigned integers
> 
> 1 When a value with integer type is converted to another integer type
>   other than _Bool, if the value can be represented by the new type, it
>   is unchanged.
> 
> 2 Otherwise, if the new type is unsigned, the value is converted by
>   repeatedly adding or subtracting one more than the maximum value that
>   can be represented in the new type until the value is in the range of
>   the new type. [footnote 49]

So for negative add, for positive subtract? I find the language in the
spec is indicative of the people who write it. And those people spend
a lot of time working on compiler optimisations.

> 3 Otherwise, the new type is signed and the value cannot be represented
>   in it; either the result is implementation-defined or an
>   implementation-defined signal is raised.

*twitch*

> [footnote 49] The rules describe arithmetic on the mathematical value,
> not the value of a given type of expression.
> ------
> 
> Therefore we really have (18,446,744,073,709,551,616 + (-2)) in the
> above reasoning.

Blimey. Live and learn.
Thanks for that - at least the period in school was shorter than I feared.

> >> In the second case, the constant 0x1U has type "unsigned int". After the
> >> bitwise complement, we get (UINT32_MAX - 1), also of type "unsigned
> >> int". Taking UINT64 for "unsigned long int" or "unsigned long long int",
> >> after the conversion implied by assignment we get the exact same value,
> >> (UINT32_MAX - 1).
> > 
> > But these examples make it look like the differences are an
> > unobservable internal side effect. That's not the case.
> 
> Hm, sorry, I don't understand. My point was that the difference was
> extremely observable, because in the first assignment, we end up with
> value 18,446,744,073,709,551,614; in the second, we end up with
> 4,294,967,294.
> 
> Perhaps my expression "exact same value" was misleading.

Yes, that threw me.
Combined with your detestation for sign-extension (which does
help describe the situation without going full spec).

> There I
> referred to 6.3.1.3p1, which I've now quoted above as well -- I meant
> that converting 4,294,967,294 from "unsigned int" to either of "unsigned
> long int" or "unsigned long long int" didn't change the value.

Understood. Thanks a lot!

/
    Leif


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-06-19 13:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-06-18 20:49 [PATCH] EmbeddedPkg/GdbSerialLib: avoid left shift of negative quantity Ard Biesheuvel
2018-06-18 21:57 ` Leif Lindholm
2018-06-18 22:07   ` Ard Biesheuvel
2018-06-18 22:51   ` Laszlo Ersek
2018-06-19  6:37     ` Ard Biesheuvel
2018-06-19  9:54     ` Leif Lindholm
2018-06-19 12:51       ` Laszlo Ersek
2018-06-19 13:06         ` Leif Lindholm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox