public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, rsingh@ventanamicro.com, "Kinney,
	Michael D" <michael.d.kinney@intel.com>
Cc: "Ni, Ray" <ray.ni@intel.com>,
	Veeresh Sangolli <veeresh.sangolli@dellteam.com>
Subject: Re: [edk2-devel] [PATCH v3 1/2] MdeModulePkg/Bus/Pci/PciHostBridgeDxe: Fix OVERRUN Coverity issues
Date: Mon, 13 Nov 2023 17:12:01 +0100	[thread overview]
Message-ID: <88f1fa80-8639-3abb-8030-ec4f02b80358@redhat.com> (raw)
In-Reply-To: <CAA9DWXD0Eq8T6zn56mcySGhh1iFR=eNs_0=fCeRMujpL5O65bQ@mail.gmail.com>

On 11/10/23 05:07, Ranbir Singh wrote:
> Options before us till now -
> 
> 1. Add array overrun check and Debug statement before CpuDeadLoop within

This would be useless.

Your current patch implements the following pattern:

  ASSERT (X);
  if (!X) {
    CpuDeadLoop ();
  }

Option#1 would mean

  ASSERT (X);
  if (!X) {
    DEBUG ((DEBUG_ERROR, "%a: condition X failed\n", __func__));
    CpuDeadLoop ();
  }

Now compare the behavior of both code snippets.

- In DEBUG and NOOPT builds, if X evaluated to FALSE, then the ASSERT()
would fire. A DEBUG_ERROR message would be logged, including "X", the
file name, and the line number. ASSERT() would then enter CpuDeadLoop()
internally, or invoke CpuBreakpoint() -- dependent on platform PCD
configuration. This applies to both snippets. In particular, the
explicit DEBUG and CpuDeadLoop() would not be reached, in the second
snippet. In other words, in DEBUG and NOOPT builds, the difference is
unobservable.

- In RELEASE builds, the ASSERT() is compiled out of both snippets.
Furthermore, the explicit DEBUG is compiled out of the second snippet.
That is, both code snippets would silently enter CpuDeadLoop(). In other
words, the behavior of both snippets is undistinguishable in RELEASE
builds too.

In my opinion, the current patch is right.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>


To elaborate on that more generally:

Core edk2 code has so consistently and so broadly *abused* and *misused*
ASSERT() for error checking, that now we fail to recognize an *actual
valid use* of ASSERT() for what it is. For illustration, compare the
following two code snippets:

(a)

  UINTN Val;

  Val = 1 + 2;
  ASSERT (Val == 3);

(b)

  VOID *Ptr;

  Ptr = AllocatePool (1024);
  ASSERT (Ptr != NULL);

Snippet (a) is a *valid* use of an assert. It encapsulates a logical
predicate about the program's state, based on algorithmic and
mathematical reasoning. If the predicate turns out not to hold in
practice, at runtime,that means the programmer has retroactively caught
an issue in their own thinking, the algorithm is incorrectly designed or
implemented, and the program's state is effectively unknown at this
point (the internal invariant in question is broken), and so we must
stop immediately, because we don't know what the program is going to do
next. Given that what we thought about the program *thus far* has now
turned out to be false.

Snippet (b) is an abuse of assert. AllocatePool()'s result depends on
the environment. Available memory, input data seen from the user or the
disk or the network thus far, and a bunch of other things not under our
control. Therefore, we must explicitly deal with AllocatePool() failing.
Claiming that AllocatePool() will succeed is wrong because we generally
cannot even *attempt* to prove it.

In case (a) however, we can actually make a plausible attempt at
*proving* the predicate. The assert is there just in case our proof is
wrong.

There's an immense difference between situations (a) and (b). I cannot
emphasize that enough. Case (a) is a provable statement about the
behavior of an algorithm. Case (b) is a *guess* at input data and
environment.

The problem with edk2 core is that it has so consistently abused
ASSERT() for case (b) that now, when we occasionally see a *valid
instance* of (a), we mistake it for (b).

That's the case here. The ASSERT() seen in this patch is case (a). It's
just that Coverity cannot prove it itself, because the predicate we
assert is much more complicated than (1 + 2 == 3). But that doesn't
change the fact that we have a proof for the invariant in question.

Therefore, the solution is not to try to handle an impossible error
gracefully. The solution is two-fold:

- Tell coverity that we have a proof, in terms that it understands, to
shut it up. The terminology for this is CpuDeadLoop(). We're willing to
hang at runtime even in RELEASE builds, if the condition fails.

- If, at runtime, our proof turns out to be wrong indeed, then we must
stop immediately (because if 1+2 stops equalling 3, then we can't trust
*anything else* that our program would do).

(The more generic corollary of my last point, by the way, is that
ASSERT()s should NEVER be compiled out of programs. And if we actually
did that, like many other projects do, then this Coverity warning
wouldn't even exist, in the first place, because Coverity would *always*
see -- with the ASSERT() being there in all builds -- a range check on
Index.

Put differently, having to add a CpuDeadLoop() explicitly is just
"damage control" for the self-inflicted damage of compiling out ASSERTs.)

Laszlo



> 2. Status Quo (not everything can be ideal :-))
> 
> Question before us
>      - Is 1 better than 2 ?
> 
> 
> On Fri, Nov 10, 2023 at 8:41 AM Ranbir Singh <rsingh@ventanamicro.com
> <mailto:rsingh@ventanamicro.com>> wrote:
> 
>     As far as I know, from a secure coding perspective, it would be
>     recommended that array overrun condition check is captured in the
>     code even if it is felt that it will never hit.
> 
>     Generally speaking, I won't be in favour of handling other ASSERT
>     conditions updates even if required if they are not related to array
>     overrun conditions i.e., the context of the patch.
> 
>     If someone / PCI maintainers can advise in this patch context what
>     should be done in the array overrun condition, I will be happy to
>     update, otherwise, sorry to say I won't be able to pursue this
>     particular one further and hence would be leaving the related code
>     with the status quo here.
> 
>     On Fri, Nov 10, 2023 at 2:10 AM Kinney, Michael D
>     <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
> 
>         Hi Ranbir,
> 
>         A deadloop without even a debug print is not good behavior.
> 
>         If this condition really represents a condition where it is not
>         possible
>         to complete the PCI resource allocation/assignment, then an
>         error status
>         code should be returned to the caller of NotifyPhase().  Perhaps
>         EFI_OUT_OF_RESOURCES.  The other ASSERT() conditions in this API
>         should
>         likely be updated to do the same.
> 
>         This may also require the caller of this service, the PCI Bus
>         Driver,
>         to be reviewed to make sure it handles error conditions from
>         NotifyPhase().
> 
>         I recommend you get help on the proposed code changes from the PCI
>         subsystem maintainers.
> 
>         Thanks,
> 
>         Mike
> 
> 
> 
>         > -----Original Message-----
>         > From: devel@edk2.groups.io <mailto:devel@edk2.groups.io>
>         <devel@edk2.groups.io <mailto:devel@edk2.groups.io>> On Behalf
>         Of Ranbir
>         > Singh
>         > Sent: Thursday, November 9, 2023 9:39 AM
>         > To: devel@edk2.groups.io <mailto:devel@edk2.groups.io>;
>         rsingh@ventanamicro.com <mailto:rsingh@ventanamicro.com>
>         > Cc: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>;
>         Veeresh Sangolli
>         > <veeresh.sangolli@dellteam.com
>         <mailto:veeresh.sangolli@dellteam.com>>
>         > Subject: [edk2-devel] [PATCH v3 1/2]
>         > MdeModulePkg/Bus/Pci/PciHostBridgeDxe: Fix OVERRUN Coverity issues
>         >
>         > From: Ranbir Singh <Ranbir.Singh3@Dell.com>
>         >
>         > The function NotifyPhase has a check
>         >
>         >     ASSERT (Index < TypeMax);
>         >
>         > but this comes into play only in DEBUG mode. In Release mode,
>         there is
>         > no handling if the Index value is within array limits or not.
>         If for
>         > whatever reasons, the Index does not get re-assigned to Index2
>         at line
>         > 937, then it remains at TypeMax as assigned earlier at line
>         929. This
>         > poses array overrun risk at lines 942 and 943. It is better to
>         deploy
>         > a safety check on Index limit before accessing array elements.
>         >
>         > REF: https://bugzilla.tianocore.org/show_bug.cgi?id=4212
>         <https://bugzilla.tianocore.org/show_bug.cgi?id=4212>
>         >
>         > Cc: Ray Ni <ray.ni@intel.com <mailto:ray.ni@intel.com>>
>         > Co-authored-by: Veeresh Sangolli
>         <veeresh.sangolli@dellteam.com
>         <mailto:veeresh.sangolli@dellteam.com>>
>         > Signed-off-by: Ranbir Singh <Ranbir.Singh3@Dell.com>
>         > Signed-off-by: Ranbir Singh <rsingh@ventanamicro.com
>         <mailto:rsingh@ventanamicro.com>>
>         > ---
>         >  MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c | 5 +++++
>         >  1 file changed, 5 insertions(+)
>         >
>         > diff --git a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>         > b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>         > index d573e532bac8..c2c143068cd2 100644
>         > --- a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>         > +++ b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
>         > @@ -939,6 +939,11 @@ NotifyPhase (
>         >              }
>         >
>         >
>         >
>         >              ASSERT (Index < TypeMax);
>         >
>         > +
>         >
>         > +            if (Index == TypeMax) {
>         >
>         > +              CpuDeadLoop ();
>         >
>         > +            }
>         >
>         > +
>         >
>         >              ResNodeHandled[Index] = TRUE;
>         >
>         >              Alignment             = RootBridge-
>         > >ResAllocNode[Index].Alignment;
>         >
>         >              BitsOfAlignment       = LowBitSet64 (Alignment + 1);
>         >
>         > --
>         > 2.34.1
>         >
>         >
>         >
>         > -=-=-=-=-=-=
>         > Groups.io Links: You receive all messages sent to this group.
>         > View/Reply Online (#110993):
>         > https://edk2.groups.io/g/devel/message/110993
>         <https://edk2.groups.io/g/devel/message/110993>
>         > Mute This Topic: https://groups.io/mt/102490513/1643496
>         <https://groups.io/mt/102490513/1643496>
>         > Group Owner: devel+owner@edk2.groups.io
>         <mailto:devel%2Bowner@edk2.groups.io>
>         > Unsubscribe: https://edk2.groups.io/g/devel/unsub
>         <https://edk2.groups.io/g/devel/unsub>
>         > [michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>]
>         > -=-=-=-=-=-=
>         >
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#111164): https://edk2.groups.io/g/devel/message/111164
Mute This Topic: https://groups.io/mt/102490513/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



  reply	other threads:[~2023-11-13 16:12 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-09 17:39 [edk2-devel] [PATCH v3 0/2] BZ 4212: Fix MdeModulePkg/Bus/Pci/PciHostBridgeDxe issues pointed by Coverity Ranbir Singh
2023-11-09 17:39 ` [edk2-devel] [PATCH v3 1/2] MdeModulePkg/Bus/Pci/PciHostBridgeDxe: Fix OVERRUN Coverity issues Ranbir Singh
2023-11-09 20:40   ` Michael D Kinney
2023-11-10  3:11     ` Ranbir Singh
2023-11-10  4:07       ` Ranbir Singh
2023-11-13 16:12         ` Laszlo Ersek [this message]
2023-11-14 16:34           ` Ranbir Singh
2023-11-15  8:58             ` Laszlo Ersek
2023-11-13 15:48     ` Laszlo Ersek
2023-11-09 17:39 ` [edk2-devel] [PATCH v3 2/2] MdeModulePkg/Bus/Pci/PciHostBridgeDxe: Fix MISSING_BREAK Coverity issue Ranbir Singh
2023-11-13 16:33   ` Laszlo Ersek
2023-11-14 16:11     ` Ranbir Singh
2023-11-15  8:55       ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=88f1fa80-8639-3abb-8030-ec4f02b80358@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox