public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Liming Gao" <liming.gao@intel.com>
To: Laszlo Ersek <lersek@redhat.com>,
	"devel@edk2.groups.io" <devel@edk2.groups.io>,
	"Justen, Jordan L" <jordan.l.justen@intel.com>,
	Andrew Fish <afish@apple.com>
Cc: "Kinney, Michael D" <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] [PATCH 02/10] MdePkg/PiFirmwareFile: fix undefined behavior in SECTION_SIZE
Date: Thu, 18 Apr 2019 15:18:42 +0000	[thread overview]
Message-ID: <4A89E2EF3DFEDB4C8BFDE51014F606A14E42B1BB@SHSMSX104.ccr.corp.intel.com> (raw)
In-Reply-To: <4d610ca7-d0ed-b009-6da1-92be09cb6d68@redhat.com>

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Thursday, April 18, 2019 5:39 PM
> To: devel@edk2.groups.io; Justen, Jordan L <jordan.l.justen@intel.com>; Andrew Fish <afish@apple.com>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com>; Gao, Liming <liming.gao@intel.com>
> Subject: Re: [edk2-devel] [PATCH 02/10] MdePkg/PiFirmwareFile: fix undefined behavior in SECTION_SIZE
> 
> On 04/17/19 21:35, Jordan Justen wrote:
> > On 2019-04-17 07:59:41, Laszlo Ersek wrote:
> >> On 04/17/19 13:44, Andrew Fish wrote:
> >>
> >>> Sorry I digressed into the C specification discussion, and did not
> >>> deal with the patch in general. My point is the original code is legal
> >>> C code. If you lookup CWE-119 it is written as a restriction on what
> >>> the C language allows.
> >>>
> >>> As I mentioned casting to specific alignment is legal, as is defining
> >>> a structure that is pragma pack(1) that can make a UINT32 not be 4
> >>> byte aligned. Thus the cast created a legal UINT32 value. A cast to
> >>> *(UINT32 *) is different that a cast to (UINT32 *). The rules you
> >>> quote a triggered by the = and not the cast.
> >>>
> >>> Thus this is undefined behavior in C:
> >>>   UINT32 *Ub = (UINT32 *)gSection.Sec.Size;
> >>>   Size = *Ub & 0x00ffffff;
> >>>
> >>> And this is not Undefined behavior:
> >>>   UINT32 NotUb = *(UINT32 *)gSection.Sec.Size & 0x00ffffff;
> >>
> >> I agree the 2nd snippet may not be UB due to alignment violations
> >> *alone*.
> >>
> >> It is still UB due to violating the effective type rules.
> >>
> >>> I also had a hard time interpreting what C spec was saying, but
> >>> talking to the people who write the compiler and ubsan cleared it up
> >>> for me. It also makes sense when you think about it. If you tell the
> >>> compiler *(UINT32 *) it can know to generate byte reads if the
> >>> hardware requires aligned access. If you do a (UINT32 *) that new
> >>> pointer no longer carries the information about the alignment
> >>> requirement. Thus the  *(UINT32 *) cast is like making a packed
> >>> structure.
> >>
> >> Yes, I think I'm clear on how the alignment information is carried
> >> around, and when it is lost. In your first example above, due to us
> >> forming a separate (standalone) pointer, we lose the alignment
> >> information, and then the assignment is undefined due to an alignment
> >> violation. While in the second example, the alignment information is not
> >> lost, and the assignment is not undefined on an alignment basis *alone*.
> >>
> >> However: the second assignment is *still* undefined, because it violates
> >> the effective type rules. Here's a more direct example for the same:
> >>
> >> STATIC UINT64 mUint64;
> >>
> >> int main(void)
> >> {
> >>   UINT16 *Uint16Ptr;
> >>
> >>   Uint16Ptr = (UINT16 *)&mUint64;
> >>   *Uint16Ptr = 1;
> >>   return 0;
> >> }
> >>
> >> The assignment to (*Uint16Ptr) is fine from the alignment perspective,
> >> but it is nevertheless undefined, because it breaks the effective type
> >> rules. Namely, UINT16 (the type used for the access) is not compatible
> >> with UINT64 (the effective type of mUint64).
> >>
> >> Normally, we don't care about this situation in edk2 -- in fact we write
> >> loads of code like the above --, but we get away with that only because
> >> we force the toolchains to ignore the effective type rules. For GCC in
> >> particular, the option is "-fno-strict-aliasing".
> >>
> >> The only reason I've posted this patch is that "cppcheck" (invoked as a
> >> part of "RH covscan") doesn't care about "-fno-strict-aliasing". And
> >> while "cppcheck" happens to report the overrun, and not the type
> >> punning, the way to remove the warning is to adhere to all the C rules
> >> in the expression, even though we have "-fno-strict-aliasing" in place.
> >>
> >>> I agree the union is a good solution to CWE-119 and it better matches
> >>> the alignment requirement in the PI spec.
> >>
> >> Thank you.
> >>
> >> I'll wait a bit longer to see if Jordan accepts this 02/10 patch based
> >> on the most recent comments, and whether Liming or Mike accepts 04/10.
> >>
> >> If Jordan remains unconvinced on SECTION_SIZE (in this 02/10 patch), and
> >> Liming or Mike are fine with 04/10, I can rework 02/10 to follow 04/10.
> >>
> >> If Jordan remains unconvinced but Mike or Liming prefers the union, then
> >> we have a stalemate and I'll abandon the patch set.
> >
> > I did have a (slight) concern about adding a typedef to a public
> > header that wasn't in the spec. It seemed like something that someone
> > somewhere might not like in case it could interfere with future
> > versions of the spec. According to Liming, we don't have to worry
> > about that.
> >
> > Regarding the UINT32* discussion, I didn't think the union really
> > would make a difference vs skipping the union and casting the original
> > struct pointer directly to a UINT32*.
> 
> The gcc docs on "-fstrict-aliasing" seem to suggest the same, using an
> example -- but to me that example looks wrong (it seems well-defined,
> based on the same rules I quoted from the standard); i.e. that looks
> like a bug to me in the gcc docs. I've contacted a colleague at RH in
> the toolchain team about it.
> 
> > I can see Andrew's point that the union may add some alignment
> > assumptions to the dereference, so I can see how that does potentially
> > change something subtle. Maybe on some machines it will allow for more
> > efficient reading of the data with the (valid) alignment assumption.
> >
> > I was also not seeing why you were saying it produced *undefined*
> > results. I don't think it does in our case, but when you point out
> > that we are aliasing data access, I can see how that quickly gets into
> > *undefined* territory from a compiler's perspective.
> 
> Right, I approached this purely from the standard's perspective. The
> standard says, in "4. Conformance":
> 
> 1 In this International Standard, ‘‘shall’’ is to be interpreted as a
>   requirement on an implementation or on a program; conversely, ‘‘shall
>   not’’ is to be interpreted as a prohibition.
> 
> 2 If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
>   constraint is violated, the behavior is undefined. Undefined behavior
>   is otherwise indicated in this International Standard by the words
>   ‘‘undefined behavior’’ or by the omission of any explicit definition
>   of behavior. There is no difference in emphasis among these three;
>   they all describe ‘‘behavior that is undefined’’.
> 
> In this particular case, 6.5p7 goes "An object *shall* have its stored
> value accessed *only* by [...]" (emphasis mine), and that paragraph
> doesn't fall in a Constraints section, so 4p2 clearly applies.
> 
> That's why I called it undefined -- it might work perfectly well in
> practice on all the edk2 platforms, possibly due to our carefully
> selected compiler switches, but that's not what static analyzers care
> about. AIUI they only care about the standard, hence my focus on the
> standard for the fix.
> 
> (
> 
> Side note: the classification "outside of a constraint" in 4p2 above is
> relevant because we also have:
> 
> 5.1.1.3 Diagnostics
> 
> 1 A conforming implementation shall produce at least one diagnostic
>   message (identified in an implementation-defined manner) if a
>   preprocessing translation unit or translation unit contains a
>   violation of any syntax rule or constraint, even if the behavior is
>   also explicitly specified as undefined or implementation-defined.
>   [...]
> 
> IOW, if we break a "shall" or "shall not" that is under a Constraint,
> then the compiler *must* yell at us; it cannot allow the UB to slip our
> attention.
> 
> )
> 
> > Anyway, given Liming's feedback that it is ok to add the union, I'm ok
> > with this patch.
> 
> Funnily enough, in parallel, Mike has expressed a preference for the
> byte-wise access plus the shifting (without helper macros) :)
> 
> I don't see why Liming would dislike that, and I gather you too would
> still prefer that to the union, so I'm going to incorporate it in v2.
> 
I am OK for both solution. I don't think union breaks something. I agree byte access way is the consistent solution in section header and ffs header. So, I am OK for them both.

> Thank you!
> Laszlo
> 
> > 
> >


  reply	other threads:[~2019-04-18 15:18 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-12 23:31 [PATCH 00/10] patches for some warnings raised by "RH covscan" Laszlo Ersek
2019-04-12 23:31 ` [PATCH 01/10] MdePkg/PiFirmwareFile: express IS_SECTION2 in terms of SECTION_SIZE Laszlo Ersek
2019-04-15 17:01   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-04-12 23:31 ` [PATCH 02/10] MdePkg/PiFirmwareFile: fix undefined behavior in SECTION_SIZE Laszlo Ersek
2019-04-14  7:19   ` [edk2-devel] " Jordan Justen
2019-04-15 16:15     ` Laszlo Ersek
2019-04-16  8:28       ` Liming Gao
2019-04-16  9:04       ` Jordan Justen
2019-04-16 10:59         ` Laszlo Ersek
2019-04-16 16:50           ` Philippe Mathieu-Daudé
2019-04-17 10:08             ` Laszlo Ersek
2019-04-16 18:48           ` Jordan Justen
2019-04-16 23:25             ` Andrew Fish
2019-04-17 10:29               ` Laszlo Ersek
2019-04-17 11:44                 ` Andrew Fish
2019-04-17 14:59                   ` Laszlo Ersek
2019-04-17 19:35                     ` Jordan Justen
2019-04-18  9:38                       ` Laszlo Ersek
2019-04-18 15:18                         ` Liming Gao [this message]
2019-04-17 10:01             ` Laszlo Ersek
2019-04-12 23:31 ` [PATCH 03/10] BaseTools/PiFirmwareFile: " Laszlo Ersek
2019-04-12 23:31 ` [PATCH 04/10] MdePkg/PiFirmwareFile: fix undefined behavior in FFS_FILE_SIZE Laszlo Ersek
2019-04-15 17:23   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-04-17 17:52   ` Michael D Kinney
2019-04-17 18:31     ` Michael D Kinney
2019-04-18  9:06       ` Laszlo Ersek
2019-04-17 18:31     ` Andrew Fish
2019-04-17 18:36       ` Michael D Kinney
2019-04-18  8:48         ` Laszlo Ersek
2019-04-18  8:45       ` Laszlo Ersek
2019-04-18 23:12         ` Laszlo Ersek
2019-04-18 17:20     ` Philippe Mathieu-Daudé
2019-04-18 17:59       ` Michael D Kinney
2019-04-18 18:12         ` Philippe Mathieu-Daudé
2019-04-12 23:31 ` [PATCH 05/10] OvmfPkg/Sec: fix out-of-bounds reads Laszlo Ersek
2019-04-15 17:24   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-04-12 23:31 ` [PATCH 06/10] OvmfPkg/QemuVideoDxe: avoid arithmetic on null pointer Laszlo Ersek
2019-04-12 23:31 ` [PATCH 07/10] OvmfPkg/AcpiPlatformDxe: suppress invalid "deref of undef pointer" warning Laszlo Ersek
2019-04-15 17:26   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-04-12 23:31 ` [PATCH 08/10] OvmfPkg: suppress "Value stored to ... is never read" analyzer warnings Laszlo Ersek
2019-04-14  8:03   ` [edk2-devel] " Jordan Justen
2019-04-15 16:25     ` Laszlo Ersek
2019-04-16  9:26       ` Jordan Justen
2019-04-16 11:44         ` Laszlo Ersek
2019-04-12 23:31 ` [PATCH 09/10] OvmfPkg/AcpiPlatformDxe: catch theoretical nullptr deref in Xen code Laszlo Ersek
2019-04-15 17:28   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-04-12 23:31 ` [PATCH 10/10] OvmfPkg/BasePciCapLib: suppress invalid "nullptr deref" warning Laszlo Ersek
2019-04-15 17:31   ` [edk2-devel] " Philippe Mathieu-Daudé
2019-04-16 11:01     ` Laszlo Ersek
2019-04-12 23:36 ` [PATCH 00/10] patches for some warnings raised by "RH covscan" Ard Biesheuvel
2019-04-15 16:16   ` Laszlo Ersek
2019-04-18 14:20 ` [edk2-devel] " Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A89E2EF3DFEDB4C8BFDE51014F606A14E42B1BB@SHSMSX104.ccr.corp.intel.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox