From: "Jordan Justen"
Subject: Re: [edk2-devel] [PATCH 02/10] MdePkg/PiFirmwareFile: fix undefined behavior in SECTION_SIZE
To: Andrew Fish, Laszlo Ersek, devel@edk2.groups.io
Cc: devel@edk2.groups.io, Mike Kinney, Liming Gao
Date: Wed, 17 Apr 2019 12:35:07 -0700
Message-ID: <155552970738.28368.15192384950564316161@jljusten-skl>
In-Reply-To: <3a4d9e9b-ffc2-3034-dc20-29b665524b75@redhat.com>
References: <20190412233128.4756-1-lersek@redhat.com>
 <155522637661.21857.4743822681286277764@jljusten-skl>
 <3bbbb85e-5557-d99b-1c3b-50a844455d20@redhat.com>
 <155540548458.13612.11281694046292591090@jljusten-skl>
 <413ac018-bcf2-f510-00d0-33315974a3c2@redhat.com>
 <155544052538.15733.153410443320244157@jljusten-skl>
 <940201E3-0EDB-40B8-8680-CDE68DA0FD06@apple.com>
 <71e4f508-75f2-79a7-967e-d7a6a0e34341@redhat.com>
 <3a4d9e9b-ffc2-3034-dc20-29b665524b75@redhat.com>

On 2019-04-17 07:59:41, Laszlo Ersek wrote:
> On 04/17/19 13:44, Andrew Fish wrote:
>
> > Sorry I digressed into the C specification discussion, and did not
> > deal with the patch in general. My point is the original code is legal
> > C code.
> > If you look up CWE-119 it is written as a restriction on what
> > the C language allows.
> >
> > As I mentioned casting to specific alignment is legal, as is defining
> > a structure that is pragma pack(1) that can make a UINT32 not be 4
> > byte aligned. Thus the cast created a legal UINT32 value. A cast to
> > *(UINT32 *) is different than a cast to (UINT32 *). The rules you
> > quote are triggered by the = and not the cast.
> >
> > Thus this is undefined behavior in C:
> >   UINT32 *Ub = (UINT32 *)gSection.Sec.Size;
> >   Size = *Ub & 0x00ffffff;
> >
> > And this is not undefined behavior:
> >   UINT32 NotUb = *(UINT32 *)gSection.Sec.Size & 0x00ffffff;
>
> I agree the 2nd snippet may not be UB due to alignment violations
> *alone*.
>
> It is still UB due to violating the effective type rules.
>
> > I also had a hard time interpreting what the C spec was saying, but
> > talking to the people who write the compiler and ubsan cleared it up
> > for me. It also makes sense when you think about it. If you tell the
> > compiler *(UINT32 *) it can know to generate byte reads if the
> > hardware requires aligned access. If you do a (UINT32 *) that new
> > pointer no longer carries the information about the alignment
> > requirement. Thus the *(UINT32 *) cast is like making a packed
> > structure.
>
> Yes, I think I'm clear on how the alignment information is carried
> around, and when it is lost. In your first example above, due to us
> forming a separate (standalone) pointer, we lose the alignment
> information, and then the assignment is undefined due to an alignment
> violation. While in the second example, the alignment information is not
> lost, and the assignment is not undefined on an alignment basis *alone*.
>
> However: the second assignment is *still* undefined, because it violates
> the effective type rules.
> Here's a more direct example for the same:
>
>   STATIC UINT64 mUint64;
>
>   int main(void)
>   {
>     UINT16 *Uint16Ptr;
>
>     Uint16Ptr = (UINT16 *)&mUint64;
>     *Uint16Ptr = 1;
>     return 0;
>   }
>
> The assignment to (*Uint16Ptr) is fine from the alignment perspective,
> but it is nevertheless undefined, because it breaks the effective type
> rules. Namely, UINT16 (the type used for the access) is not compatible
> with UINT64 (the effective type of mUint64).
>
> Normally, we don't care about this situation in edk2 -- in fact we write
> loads of code like the above --, but we get away with that only because
> we force the toolchains to ignore the effective type rules. For GCC in
> particular, the option is "-fno-strict-aliasing".
>
> The only reason I've posted this patch is that "cppcheck" (invoked as a
> part of "RH covscan") doesn't care about "-fno-strict-aliasing". And
> while "cppcheck" happens to report the overrun, and not the type
> punning, the way to remove the warning is to adhere to all the C rules
> in the expression, even though we have "-fno-strict-aliasing" in place.
>
> > I agree the union is a good solution to CWE-119 and it better matches
> > the alignment requirement in the PI spec.
>
> Thank you.
>
> I'll wait a bit longer to see if Jordan accepts this 02/10 patch based
> on the most recent comments, and whether Liming or Mike accepts 04/10.
>
> If Jordan remains unconvinced on SECTION_SIZE (in this 02/10 patch), and
> Liming or Mike are fine with 04/10, I can rework 02/10 to follow 04/10.
>
> If Jordan remains unconvinced but Mike or Liming prefers the union, then
> we have a stalemate and I'll abandon the patch set.

I did have a (slight) concern about adding a typedef to a public
header that wasn't in the spec. It seemed like something that someone
somewhere might not like in case it could interfere with future
versions of the spec. According to Liming, we don't have to worry
about that.

Regarding the UINT32* discussion, I didn't think the union really
would make a difference vs skipping the union and casting the original
struct pointer directly to a UINT32*. I can see Andrew's point that
the union may add some alignment assumptions to the dereference, so I
can see how that does potentially change something subtle. Maybe on
some machines it will allow for more efficient reading of the data
with the (valid) alignment assumption.

I was also not seeing why you were saying it produced *undefined*
results. I don't think it does in our case, but when you point out
that we are aliasing data access, I can see how that quickly gets into
*undefined* territory from a compiler's perspective.

Anyway, given Liming's feedback that it is ok to add the union, I'm ok
with this patch.

-Jordan
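
For reference, a minimal sketch of one more way to read the 24-bit size
that sidesteps both the alignment question and the effective type
question discussed above: assemble the value from the individual bytes.
This is not the union from the patch. DEMO_SECTION_HEADER and
DemoSectionSize are hypothetical names, plain C99 types stand in for the
edk2 ones, and the sketch assumes the three Size bytes are stored
little-endian.

```c
#include <stdint.h>

/* Hypothetical stand-in for a section header: a 3-byte size field
   followed by a type byte. Not the actual edk2 definition. */
typedef struct {
  uint8_t Size[3];
  uint8_t Type;
} DEMO_SECTION_HEADER;

/* Build the 24-bit size one byte at a time. Every access is a uint8_t
   read of a uint8_t object, so no UINT32 lvalue is ever formed: nothing
   is read past Size[2], no 4-byte alignment is assumed, and the
   effective type rules are not involved. */
static uint32_t DemoSectionSize(const DEMO_SECTION_HEADER *Hdr)
{
  return (uint32_t)Hdr->Size[0] |
         ((uint32_t)Hdr->Size[1] << 8) |
         ((uint32_t)Hdr->Size[2] << 16);
}
```

With Size bytes 0x34, 0x12, 0x00 this returns 0x001234, whatever the
Type byte holds. The trade-off is that the compiler has to recognize
the pattern to merge the three byte loads, whereas the union keeps the
alignment information visible in the type.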