From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=66.187.233.73; helo=mx1.redhat.com; envelope-from=lersek@redhat.com; receiver=edk2-devel@lists.01.org Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 6D27122352291 for ; Wed, 28 Feb 2018 05:51:15 -0800 (PST) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4EFE9818B114; Wed, 28 Feb 2018 13:57:22 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-11.rdu2.redhat.com [10.10.120.11]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5A146217B400; Wed, 28 Feb 2018 13:57:21 +0000 (UTC) To: =?UTF-8?Q?Marvin_H=c3=a4user?= , "edk2-devel@lists.01.org" Cc: "michael.d.kinney@intel.com" , "liming.gao@intel.com" References: <62c9363b-7f27-cfff-492a-560660727b86@redhat.com> <366ffc0c-b55f-a3c1-973e-b80d3dd07d26@redhat.com> From: Laszlo Ersek Message-ID: <2b22bfbd-24ce-e26c-9f1c-e5ba2816b48f@redhat.com> Date: Wed, 28 Feb 2018 14:57:20 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 28 Feb 2018 13:57:22 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Wed, 28 Feb 2018 13:57:22 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'lersek@redhat.com' RCPT:'' Subject: Re: [PATCH 1/2] MdePkg/Base.h: Ensure safe bitwise operations. X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Feb 2018 13:51:16 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit On 02/28/18 12:43, Marvin Häuser wrote: > Actually, your explanations and the rest of the bit defines made me > wonder, whether marking all BIT defines and bit masks of any kind, > anywhere, as ULL, might be the best solution. For a new project just being started, that could be one of the safest / wisest choices. :) For edk2, about the only counter-argument I'd have right now is that many assignments and function arguments could suddenly require down-casts. > +1) Inversion can be used for any integer size uncasted, Right, you could just use ~BITn, and truncate it to whatever unsigned type is needed. > as edk2 does not support vendor extensions such as __int128 anyway. Not *yet*, I guess :) UEFI 2.7 does list UINT128 / INT128, in table 5, "Common UEFI Data Types". I believe those typedefs may have been added for RISC-V. > +2) Binary operations (AND, OR, ...) should not raise any problems at > all (except for our fellow using VS2005x86 :) ) Haha :) In earnest though, you are right. > +3) Assignments and passing-by-arguments will result in compiler-time > errors when not done properly in the sense of this patch, which > eliminates the need to audit all usages manually, but just compile the > entire codebase and fix the errors one-by-one (which is of course > still not a quick task, but if the package authors agree with the > proposal, I might attempt it). Good point; it didn't occur to me that a large number of the "client sites" would be flagged by the compiler at once. ... I've just remembered another possible problem: I think some EBC compiler cannot use 64-bit integers as case labels in switch statements. See commit ada385843b94 ("MdeModulePkg/PciBus: Change switch-case to if-else to fix EBC build", 2018-01-10). Some case labels currently built of BIT[0..31] macros could break in EBC builds. OTOH, the (EBC) compiler would catch those as well; no silent breakage. > -1) The 'truncating constant value' warning would probably need to be > disabled globally, however I don't understand how an explicit cast is > a problem anyway. > > Did I overlook anything contra regarding that? Hmmm... Do you think it could have a performance impact on 32-bit platforms? (I don't think so, at least not in optimized / RELEASE builds.) >> (3) The clever code: is aware that BIT macros are (mostly) signed, >> and exploits that fact for various ends. >> If we don't think (3) is a real risk, I'm fine with the approach of >> these patches. (I don't think I'll be able to send a real R-b for >> them, because that would require me to evaluate all uses, and that's >> a Herculean task I just can't take on; apologies.) > > I hope that the 'all ULL' proposal uncovers them all. Would there be > cases, where no error would be raised? Dunno :) I quite like the idea! >>>>> -#define ENCODE_ERROR(StatusCode) ((RETURN_STATUS)(MAX_BIT | >>>> (StatusCode))) >>>>> +#define ENCODE_ERROR(StatusCode) ((RETURN_STATUS)(MAX_BIT >> | >>>> (StatusCode##ULL))) >> Given the current definition, if StatusCode is a signed integer, then >> one way or another, it will be converted to (not reinterpreted as) an >> unsigned integer type. This is due to the rules on the bitwise OR >> operator and the outermost cast (thank you for reminding me of that). >> Such conversions are fully defined in the C standard, they are not >> implementation-defined. > > Thanks for noting. Would you still append the U-prefix for readability > / making the intention clear? I think I'd document the requirement that StatusCode have an unsigned integer type, and then I'd update the ENCODE_ERROR() macro invocations to conform. >> Is your point that we shouldn't even assume INT32=="int" and >> UINT32=="unsigned int"? > > Yes, pretty much. I don't think it does hurt either. While it is > certainly a very ugly definition, that's my preferred use for macros > anyway - hide ugly code behind a lovely name. > >> >> I think that would be a good point for a highly portable C codebase >> that was written from the ground up without type assumptions. But I >> don't think it applies to edk2. If UINT32 mapped to "unsigned short" >> and INT64 mapped to "int", edk2 would break in a thousand places >> immediately. > > Of course it would, however I imagined a header file that is included > in every file for every architecture to be as close to flawless as > possible. > While of course I would also fix such assumptions in executable code > if I found, and hope the maintainers care enough to review and merge, > I consider header files a greater danger as that 'flawed' (in an > idealistic sense) macro is expanded into external code, which might > cause trouble while debugging, as the consumer assumes the mistake is > within his code. > Headers >>> library code > module code, in my opinion, especially for > the most central one. I'll defer to the MdePkg maintainers on this. >> Personally I think that's a laudable goal, and I agree that all >> portable C code should be written like that. I also think however >> that there's a large disconnect between that goal / style, and how >> edk2 (and even the UEFI spec) exist today. In my opinion / >> experience, edk2 was never meant to be portable to unknown, future >> platforms. For one, it is tied to little endian byte order on the >> spec level. > > Personally, I am fine to assert anything that the specification > demands (that's what it's here for after all), however we are not > aware of the compiler-specifics at spec-level. It is interesting to contrast: - the use of the standard (optional!) uint32_t type in portable, hosted C projects - with the use of UINT32 in edk2. * From C99: 7.18.1.1 Exact-width integer types 1 [...] 2 The typedef name uintN_t designates an unsigned integer type with width N. Thus, uint24_t denotes an unsigned integer type with a width of exactly 24 bits. 3 These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two's complement representation, it shall define the corresponding typedef names. In my experience, portable, hosted C code uses the standard types by default (int, unsigned int, size_t, etc) and uses uint32_t and friends only in contexts where the representation matters -- mostly for implementing binary formats (files, network, ...). And going from one set of types to the other is usually strongly protected with range checks; there is usually some kind of (de)serialization that very consciously handles such transitions. This practice is very good because the integer promotions and the usual arithmetic conversions, as defined by the standard, i.e. in terms of standard integer types, are relatively easy to evalute for most of the code. Even bit-fiddling can be done in just "unsigned", "unsigned long" or "unsigned long long", because guarantees minimum magnitudes for those types. * In edk2 however (and in the UEFI spec), we have: - exact-width integer types that are hard to map to standard integer types -- hence the promotions and conversions are hard to evaluate in a portability mindset (they are not hard to evaluate if you just take UINT32=="unsigned int" for guaranteed, but that's what you'd like to avoid), - UINTN / INTN, which is defined as "native width". That might make some sense in assembly, but in C, "native width" makes no sense. The promotion and conversion rules can be interpreted for UINTN only very indirectly, by making some assumptions about UINT32 and UINT64. (And, about UINT128, theoretically :) ) Combine this with the fact that in edk2 source code, we're supposed to *avoid* using the standard C types! This is the exact opposite of the practice that I described for those portable, hosted C projects (i.e., stick with the standard types by default, go "exact width" only when unavoidable). When I first encountered edk2, I was utterly confused by this practice. I had seen it nowhere else. Reasoning about promotions and conversions looked plain impossible. I still think it was a bad choice (I don't know where it comes from, maybe Windows?), but I've gotten used to it by now -- I do equate UINT32 to "unsigned int", "UINT64" to "unsigned long long", and this way I can actually reason about the ISO C promotions and conversions. If you take that away, I'm not sure what we'll be left with :) I just think that the entire ISO C portability mindset, albeit *absolutely* desirable, is entirely foreign to the edk2 codebase (and the UEFI spec too). I agree we should avoid even implementation-defined behavior, as much as we can -- but not more than we can. If we start doubting UINT32 maps to "unsigned int", the entire code base falls apart in my eyes. Now, I do understand why UEFI (the spec) defines such exact-width types, and declares protocols in terms of those types. That's because UEFI is an ABI spec, not an API spec. That makes sense. However: - rather than introducing UINT32 etc, the spec should have used the standard C types uint32_t etc, - regardless of the UEFI interfaces being declared with exact-width integer types, edk2 code should be allowed (or even *prefer*) to use standard C types, such as "int", "unsigned int" etc, unless necessitated otherwise. Sorry about this long rant; I'm trying to explain why I felt that the goal of your patches didn't match edk2 very well. It doesn't mean that your patches are "wrong". Thanks Laszlo