From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <lersek@redhat.com>
Received-SPF: Pass (sender SPF authorized) identity=mailfrom;
 client-ip=66.187.233.73; helo=mx1.redhat.com; envelope-from=lersek@redhat.com;
 receiver=edk2-devel@lists.01.org 
Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ml01.01.org (Postfix) with ESMTPS id AD62B22352280
 for <edk2-devel@lists.01.org>; Wed, 28 Feb 2018 02:54:12 -0800 (PST)
Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com
 [10.11.54.4])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 3812D40FB646;
 Wed, 28 Feb 2018 11:00:18 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-11.rdu2.redhat.com
 [10.10.120.11])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 404FC2024CA6;
 Wed, 28 Feb 2018 11:00:17 +0000 (UTC)
To: =?UTF-8?Q?Marvin_H=c3=a4user?= <Marvin.Haeuser@outlook.com>,
 "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "michael.d.kinney@intel.com" <michael.d.kinney@intel.com>,
 "liming.gao@intel.com" <liming.gao@intel.com>
References: <AM4PR06MB14910F94889E6F6D5D9B648C80C00@AM4PR06MB1491.eurprd06.prod.outlook.com>
 <62c9363b-7f27-cfff-492a-560660727b86@redhat.com>
 <AM4PR06MB14912DF47968C748F6BEB7A280C00@AM4PR06MB1491.eurprd06.prod.outlook.com>
From: Laszlo Ersek <lersek@redhat.com>
Message-ID: <366ffc0c-b55f-a3c1-973e-b80d3dd07d26@redhat.com>
Date: Wed, 28 Feb 2018 12:00:16 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.6.0
MIME-Version: 1.0
In-Reply-To: <AM4PR06MB14912DF47968C748F6BEB7A280C00@AM4PR06MB1491.eurprd06.prod.outlook.com>
X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.11.55.7]); Wed, 28 Feb 2018 11:00:18 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]);
 Wed, 28 Feb 2018 11:00:18 +0000 (UTC) for IP:'10.11.54.4'
 DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com'
 HELO:'smtp.corp.redhat.com' FROM:'lersek@redhat.com' RCPT:''
Subject: Re: [PATCH 1/2] MdePkg/Base.h: Ensure safe bitwise operations.
X-BeenThere: edk2-devel@lists.01.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: EDK II Development  <edk2-devel.lists.01.org>
List-Unsubscribe: <https://lists.01.org/mailman/options/edk2-devel>,
 <mailto:edk2-devel-request@lists.01.org?subject=unsubscribe>
List-Archive: <http://lists.01.org/pipermail/edk2-devel/>
List-Post: <mailto:edk2-devel@lists.01.org>
List-Help: <mailto:edk2-devel-request@lists.01.org?subject=help>
List-Subscribe: <https://lists.01.org/mailman/listinfo/edk2-devel>,
 <mailto:edk2-devel-request@lists.01.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Feb 2018 10:54:14 -0000
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit

On 02/27/18 21:31, Marvin Häuser wrote:
>> -----Original Message-----
>> From: Laszlo Ersek <lersek@redhat.com>
>> Sent: Tuesday, February 27, 2018 8:54 PM
>> To: Marvin Häuser <Marvin.Haeuser@outlook.com>; edk2-
>> devel@lists.01.org
>> Cc: michael.d.kinney@intel.com; liming.gao@intel.com
>> Subject: Re: [edk2] [PATCH 1/2] MdePkg/Base.h: Ensure safe bitwise
>> operations.
>>
>> On 02/27/18 17:47, Marvin Häuser wrote:

[...]

>> I think this change is a good idea *in theory* (arguably the macros
>> should have been introduced like this in the first place). However,
>> at this point, quite a bit of code could silently see its meaning
>> changed.
>
> You are right that likely at least some vendor has code that relies on
> the defines being signed. However, you seem to agree that bit
> definitions are naturally assumed to be unsigned and the same as this
> change could break code, not making the change could lead to
> non-functioning code that looks valid at first sight.

This is exactly how I feel, yes. My concern is that making the change
now runs the risk of tricky regressions that we might not even prevent
with a detailed audit.

Anyway, I don't want to spread FUD about this -- I think the goal of
these changes is good. It's up to the MdePkg maintainers to evaluate the
risks, I just wanted us to be aware of them. Once we reach an end state
where all the BIT, SIZE and BASE macros are unsigned, and no regressions
are found (or none remain), that's for the best!

> Regarding your comment on the 'truncating patches', saying the code
> relies on the definitions being signed: I honestly doubt the code was
> intentionally written with that in mind, but rather intuitively and it
> happened to work. I might be wrong of course.

I think I agree with your assessment, considering the usual application
of these macros in source code. I distinguish three kinds of uses:

(1) The trusting / maybe "naive" client code: silently expects the BIT
    macros to be unsigned, resultant code happens to work; when it
    doesn't, it is fixed up with individual patches.

(2) The paranoid code: is aware that BIT macros are not (all) unsigned,
    believes that they should be unsigned, casts BIT macro applications
    to unsigned types locally, all further manipulation is done with
    unsigned types.

(3) The clever code: is aware that BIT macros are (mostly) signed, and
    exploits that fact for various ends.

I agree that most of the edk2 codebase should fall under (1).

As you may expect, I personally write (2), and code like (2) should not
worry about BIT / SIZE / BASE becoming unsigned.

My concern is (3). There have been examples in core edk2 modules that
explicitly relied on undefined behavior, such as left shifts of negative
integers and such. We've only slowly fixed them up after compilers /
analyzers started flagging them.

If we don't think (3) is a real risk, I'm fine with the approach of
these patches. (I don't think I'll be able to send a real R-b for them,
because that would require me to evaluate all uses, and that's a
Herculean task I just can't take on; apologies.)

>> Let me give you a pathologic example (it *does* occur in the wild,
>> although maybe not in edk2):
>>
>> {
>>   UINT64 MyMask;
>>
>>   MyMask = ~BIT0;
>> }
>>
>> Before your patch, MyMask is assigned the value
>> 0xFFFF_FFFF_FFFF_FFFE.
>>
>> This is because:
>>
>> - BIT0 has type "int" and value 1
>>
>> - Due to our (implementation-defined) representation of negative
>> integers, namely two's complement, ~BIT0 (also having type "int")
>> stands for value (- 2).
>
> There is pretty much where my issue is. I don't think triggering
> implementation-defined behavior is a good idea, not to speak about
> relying on it. EDK no longer targets solely x86, there is even a new
> architecture (Arch-V or something?) on the way, so I consider relying
> on compiler specifics very dangerous.

Oh I totally agree about that! As I said, the goal of the patches is
great, I absolutely agree about it; my concern is, will we find tricks
like the above by source code audit, or by a series of functional
regressions, drawn out over time?

[...]

>> My point is that such innocent-looking (and otherwise really good!)
>> changes for *central* macros require a large audit effort.
>
> Indeed that audit effort is required. How would that be handled for
> non-edk2 code anyway? Might there be a note in the UDK changelog, if
> accepted?

Very good question, and I don't have the slightest idea.

Considering the Linux kernel as an example, if your stuff is "out of
tree", you are on your own, when incompatible changes are committed. (If
you are "in-tree", then whoever does the incompat change will fix up
your code too.)

Obviously this doesn't (or may not) fly for edk2, which is a
"Development Kit". I don't know what the right procedure is here. I do
remember a problem from the past, when the IP4Config protocol was
replaced with the IP4Config2 protocol. It was well documented in UDK
release notes (as I recall), yet it broke a bunch of (proprietary)
downstreams of edk2, and some annoyed messages were posted to
edk2-devel. Again, IMO something for the MdePkg maintainers to judge.

[...]

>>> -#define ENCODE_ERROR(StatusCode)     ((RETURN_STATUS)(MAX_BIT |
>> (StatusCode)))
>>> +#define ENCODE_ERROR(StatusCode)     ((RETURN_STATUS)(MAX_BIT |
>> (StatusCode##ULL)))
>>>
>>
>> MAX_BIT always has type UINTN (UINT64 aka "unsigned long long" on
>> 64-bit platforms, and UINT32 aka "unsigned int" on 32-bit platforms).
>> This means that ENCODE_ERROR results in a UINTN, right now.
>
> Good point, I didn't keep that in mind. It probably should be solely
> 'U'.

In fact there's no need even for that: because MAX_BIT has type UINTN
("unsigned long long" or "unsigned int"), the StatusCode macro argument,
which is expected to be a small nonnegative "int", will be converted to
UINTN, for the bitwise OR operator.

Now, of course the suggested U suffix doesn't hurt, no question about
that. If it improves readability, then it's a good thing. In that case
though, how about updating the constants in the invocations of
ENCODE_ERROR(), instead? Because, if you use ##U within the replacement
text, then ENCODE_ERROR(4U) will become invalid (which I think is
unexpected).

>> After the change, it would always result in a UINT64. I think that
>> could be a problem for 32-bit platforms, especially if you push
>> open-coded RETURN_xxx status codes to the stack, for example for
>> printing with "%r". Pushing a UINT64 arg instead of a UINT32 arg
>> through the ellipsis (...) might throw off the rest of the argument
>> processing.
>
> Actually the cast to RETURN_STATUS is covering the entire operation,
> so in the end it should be UINTN again, no?

True; sorry, I lost track of that.

>> So, I wouldn't recommend this change. What is the exact motivation
>> for it?
>
> If I am not terribly mistaken, the operation results in
> implementation-defined behavior as-is, which the patch as a whole
> intends to avoid.

What is implementation-defined in the current definition of
ENCODE_ERROR()?

Given the current definition, if StatusCode is a signed integer, then
one way or another, it will be converted to (not reinterpreted as) an
unsigned integer type. This is due to the rules on the bitwise OR
operator and the outermost cast (thank you for reminding me of that).
Such conversions are fully defined in the C standard, they are not
implementation-defined.

For example, if we are on a 32-bit platform (i.e. UINTN means UINT32),
and StatusCode is (for some reason) an INT64 value, then MAX_BIT will
first be converted to INT64 (because the latter has higher conversion
rank, and can represent all values of UINT32), the bit-OR is performed
in INT64, and the result is converted back to UINT32 (RETURN_STATUS).

I don't see any way for the sign bit to be set (or otherwise toggled) in
a signed integer here.

Anyway, the edk2 coding style spec does advise against intricate C code
that requires deep knowledge of the standard / language rules; so if we
can improve readability without limiting use, I'm fine with that.

[...]

>>>  **/
>>> -#define SIGNATURE_16(A, B)        ((A) | (B << 8))
>>> +#define SIGNATURE_16(A, B)        ((UINT16)(A) | (UINT16)((UINT16)(B) <<
>> 8U))
>>
>> This might not have the intended effect. Due to "int" (INT32) being
>> capable of representing the entire domain of "unsigned short"
>> (UINT16), both operands of the bitwise OR operator would be promoted
>> to "int". The result would have type "int".
>
> To me, what type the result has was less important than making sure
> both operands are unsigned to ensure a well-defined operation result.

This is a good point -- and my comment was that this well-defined nature
was already ensured by simply following the documented interface
contract, namely using ASCII characters as macro arguments.

I don't insist though.

[...]

>> Anyway, if we want to be pedantic, the fix can be expressed more
>> simply, like this:
>>
>>   (SIGNATURE_16 (A, B) | ((UINT32)SIGNATURE_16 (C, D) << 16))
>>                           ^^^^^^^^
>>
>> Then the UINT32 type ("unsigned int") will "cascade" through the
>> entire expression -- (SIGNATURE_16 (A, B)) will be converted from
>> "int" to "unsigned int" for the bitwise OR, and the result will also
>> have type UINT32:
>>
>>   (INT32_VAL  | (UINT32)INT32_VAL << 16)
>>   (INT32_VAL  | UINT32_VAL)
>>   (UINT32_VAL | UINT32_VAL)
>>   (UINT32_VAL)
>
> This is only true if int <= INT32, isn't it?

That's correct. If UINT32 mapped to "unsigned short" and INT64 mapped to
"int", then ((UINT32)SIGNATURE_16 (C, D)) would be promoted to INT64
("int"), and the rest of the operations would be carried out in INT64
("int").

Is your point that we shouldn't even assume INT32=="int" and
UINT32=="unsigned int"?

I think that would be a good point for a highly portable C codebase that
was written from the ground up without type assumptions. But I don't
think it applies to edk2. If UINT32 mapped to "unsigned short" and INT64
mapped to "int", edk2 would break in a thousand places immediately.

[...]

>>> @@ -1260,7 +1260,7 @@ typedef UINTN RETURN_STATUS;
>>>
>>>  **/
>>>  #define SIGNATURE_64(A, B, C, D, E, F, G, H) \
>>> -    (SIGNATURE_32 (A, B, C, D) | ((UINT64) (SIGNATURE_32 (E, F, G, H)) <<
>> 32))
>>> +    ((UINT64)SIGNATURE_32 (A, B, C, D) | ((UINT64)
>>> + ((UINT64)SIGNATURE_32 (E, F, G, H)) << 32U))
>>>
>>>  #if defined(_MSC_EXTENSIONS) && !defined (__INTEL_COMPILER) &&
>> !defined (MDE_CPU_EBC)
>>>    void * _ReturnAddress(void);
>>>
>>
>> Assuming that the SIGNATURE_32 is *either* fixed (so that it produces
>> a UINT32 value) *or* used in accordance with the documentation (ie.,
>> only ASCII characters as arguments), the replacement text for
>> SIGNATURE_64 is already correct, and this hunk is unneeded.
>
> I think this is also only true if int <= INT64.

Sure. And I think if "int" corresponds to INT128, then this macro is the
smallest problem that the edk2 code base runs into. :)

I do get your point now. I believe you are trying to eliminate all (or
most) implementation-defined traits from these macros, because "that's
how portable C code should be written".

Personally I think that's a laudable goal, and I agree that all portable
C code should be written like that. I also think however that there's a
large disconnect between that goal / style, and how edk2 (and even the
UEFI spec) exist today. In my opinion / experience, edk2 was never meant
to be portable to unknown, future platforms. For one, it is tied to
little endian byte order on the spec level.

Anyway, if you submit a v2 of these series (please make them threaded!),
I'll try to review those with this in mind.

Thanks!
Laszlo