On Mon, Nov 13, 2023 at 9:42 PM Laszlo Ersek <lersek@redhat.com> wrote:

> On 11/10/23 05:07, Ranbir Singh wrote:
> > Options before us till now -
> >
> > 1. Add array overrun check and Debug statement before CpuDeadLoop within
>
> This would be useless.
>
> Your current patch implements the following pattern:
>
>   ASSERT (X);
>   if (!X) {
>     CpuDeadLoop ();
>   }
>
> Option#1 would mean
>
>   ASSERT (X);
>   if (!X) {
>     DEBUG ((DEBUG_ERROR, "%a: condition X failed\n", __func__));
>     CpuDeadLoop ();
>   }
>
> Now compare the behavior of both code snippets.
>
> - In DEBUG and NOOPT builds, if X evaluated to FALSE, then the ASSERT()
> would fire. A DEBUG_ERROR message would be logged, including "X", the
> file name, and the line number. ASSERT() would then enter CpuDeadLoop()
> internally, or invoke CpuBreakpoint() -- dependent on platform PCD
> configuration. This applies to both snippets. In particular, the
> explicit DEBUG and CpuDeadLoop() would not be reached, in the second
> snippet. In other words, in DEBUG and NOOPT builds, the difference is
> unobservable.
>
> - In RELEASE builds, the ASSERT() is compiled out of both snippets.
> Furthermore, the explicit DEBUG is compiled out of the second snippet.
> That is, both code snippets would silently enter CpuDeadLoop(). In other
> words, the behavior of both snippets is undistinguishable in RELEASE
> builds too.
>
> In my opinion, the current patch is right.
>
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
>
>
> To elaborate on that more generally:
>
> Core edk2 code has so consistently and so broadly *abused* and *misused*
> ASSERT() for error checking, that now we fail to recognize an *actual
> valid use* of ASSERT() for what it is. For illustration, compare the
> following two code snippets:
>
> (a)
>
>   UINTN Val;
>
>   Val = 1 + 2;
>   ASSERT (Val == 3);
>
> (b)
>
>   VOID *Ptr;
>
>   Ptr = AllocatePool (1024);
>   ASSERT (Ptr != NULL);
>
> Snippet (a) is a *valid* use of an assert. It encapsulates a logical
> predicate about the program's state, based on algorithmic and
> mathematical reasoning. If the predicate turns out not to hold in
> practice, at runtime,that means the programmer has retroactively caught
> an issue in their own thinking, the algorithm is incorrectly designed or
> implemented, and the program's state is effectively unknown at this
> point (the internal invariant in question is broken), and so we must
> stop immediately, because we don't know what the program is going to do
> next. Given that what we thought about the program *thus far* has now
> turned out to be false.
>
> Snippet (b) is an abuse of assert. AllocatePool()'s result depends on
> the environment. Available memory, input data seen from the user or the
> disk or the network thus far, and a bunch of other things not under our
> control. Therefore, we must explicitly deal with AllocatePool() failing.
> Claiming that AllocatePool() will succeed is wrong because we generally
> cannot even *attempt* to prove it.
>
> In case (a) however, we can actually make a plausible attempt at
> *proving* the predicate. The assert is there just in case our proof is
> wrong.
>
> There's an immense difference between situations (a) and (b). I cannot
> emphasize that enough. Case (a) is a provable statement about the
> behavior of an algorithm. Case (b) is a *guess* at input data and
> environment.
>
> The problem with edk2 core is that it has so consistently abused
> ASSERT() for case (b) that now, when we occasionally see a *valid
> instance* of (a), we mistake it for (b).
>
> That's the case here. The ASSERT() seen in this patch is case (a). It's
> just that Coverity cannot prove it itself, because the predicate we
> assert is much more complicated than (1 + 2 == 3). But that doesn't
> change the fact that we have a proof for the invariant in question.
>
> Therefore, the solution is not to try to handle an impossible error
> gracefully. The solution is two-fold:
>
> - Tell coverity that we have a proof, in terms that it understands, to
> shut it up. The terminology for this is CpuDeadLoop(). We're willing to
> hang at runtime even in RELEASE builds, if the condition fails.
>
> - If, at runtime, our proof turns out to be wrong indeed, then we must
> stop immediately (because if 1+2 stops equalling 3, then we can't trust
> *anything else* that our program would do).
>
> (The more generic corollary of my last point, by the way, is that
> ASSERT()s should NEVER be compiled out of programs. And if we actually
> did that, like many other projects do, then this Coverity warning
> wouldn't even exist, in the first place, because Coverity would *always*
> see -- with the ASSERT() being there in all builds -- a range check on
> Index.
>
> Put differently, having to add a CpuDeadLoop() explicitly is just
> "damage control" for the self-inflicted damage of compiling out ASSERTs.)
>
> Laszlo
>
>
Thanks Laszlo for being immensely generous in writing in that much detail.
This must be beneficial to many.

Though you already gave R-b, the return statement needs to be added to
explicitly and completely rule out ARRAY_OVERRUN issue by static analysis
tools.

             ASSERT (Index < TypeMax);
+
+            if (Index == TypeMax) {
+              CpuDeadLoop ();
+              return EFI_OUT_OF_RESOURCES;
+            }
+


>
> > 2. Status Quo (not everything can be ideal :-))
> >
> > Question before us
> >      - Is 1 better than 2 ?
> >
> >
> > On Fri, Nov 10, 2023 at 8:41 AM Ranbir Singh <rsingh@ventanamicro.com
> > <mailto:rsingh@ventanamicro.com>> wrote:
> >
> >     As far as I know, from a secure coding perspective, it would be
> >     recommended that array overrun condition check is captured in the
> >     code even if it is felt that it will never hit.
> >
> >     Generally speaking, I won't be in favour of handling other ASSERT
> >     conditions updates even if required if they are not related to array
> >     overrun conditions i.e., the context of the patch.
> >
> >     If someone / PCI maintainers can advise in this patch context what
> >     should be done in the array overrun condition, I will be happy to
> >     update, otherwise, sorry to say I won't be able to pursue this
> >     particular one further and hence would be leaving the related code
> >     with the status quo here.
> >
> >     On Fri, Nov 10, 2023 at 2:10 AM Kinney, Michael D
> >     <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
> wrote:
> >
> >         Hi Ranbir,
> >
> >         A deadloop without even a debug print is not good behavior.
> >
> >         If this condition really represents a condition where it is not
> >         possible
> >         to complete the PCI resource allocation/assignment, then an
> >         error status
> >         code should be returned to the caller of NotifyPhase().  Perhaps
> >         EFI_OUT_OF_RESOURCES.  The other ASSERT() conditions in this API
> >         should
> >         likely be updated to do the same.
> >
> >         This may also require the caller of this service, the PCI Bus
> >         Driver,
> >         to be reviewed to make sure it handles error conditions from
> >         NotifyPhase().
> >
> >         I recommend you get help on the proposed code changes from the
> PCI
> >         subsystem maintainers.
> >
> >         Thanks,
> >
> >         Mike
> >
> >
> >
> >         > -----Original Message-----
> >         > From: devel@edk2.groups.io <mailto:devel@edk2.groups.io>
> >         <devel@edk2.groups.io <mailto:devel@edk2.groups.io>> On Behalf
> >         Of Ranbir
> >         > Singh
> >         > Sent: Thursday, November 9, 2023 9:39 AM
> >         > To: devel@edk2.groups.io <mailto:devel@edk2.groups.io>;
> >         rsingh@ventanamicro.com <mailto:rsingh@ventanamicro.com>
> >         > Cc: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>;
> >         Veeresh Sangolli
> >         > <veeresh.sangolli@dellteam.com
> >         <mailto:veeresh.sangolli@dellteam.com>>
> >         > Subject: [edk2-devel] [PATCH v3 1/2]
> >         > MdeModulePkg/Bus/Pci/PciHostBridgeDxe: Fix OVERRUN Coverity
> issues
> >         >
> >         > From: Ranbir Singh <Ranbir.Singh3@Dell.com>
> >         >
> >         > The function NotifyPhase has a check
> >         >
> >         >     ASSERT (Index < TypeMax);
> >         >
> >         > but this comes into play only in DEBUG mode. In Release mode,
> >         there is
> >         > no handling if the Index value is within array limits or not.
> >         If for
> >         > whatever reasons, the Index does not get re-assigned to Index2
> >         at line
> >         > 937, then it remains at TypeMax as assigned earlier at line
> >         929. This
> >         > poses array overrun risk at lines 942 and 943. It is better to
> >         deploy
> >         > a safety check on Index limit before accessing array elements.
> >         >
> >         > REF: https://bugzilla.tianocore.org/show_bug.cgi?id=4212
> >         <https://bugzilla.tianocore.org/show_bug.cgi?id=4212>
> >         >
> >         > Cc: Ray Ni <ray.ni@intel.com <mailto:ray.ni@intel.com>>
> >         > Co-authored-by: Veeresh Sangolli
> >         <veeresh.sangolli@dellteam.com
> >         <mailto:veeresh.sangolli@dellteam.com>>
> >         > Signed-off-by: Ranbir Singh <Ranbir.Singh3@Dell.com>
> >         > Signed-off-by: Ranbir Singh <rsingh@ventanamicro.com
> >         <mailto:rsingh@ventanamicro.com>>
> >         > ---
> >         >  MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c | 5
> +++++
> >         >  1 file changed, 5 insertions(+)
> >         >
> >         > diff --git
> a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
> >         > b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
> >         > index d573e532bac8..c2c143068cd2 100644
> >         > --- a/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
> >         > +++ b/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciHostBridge.c
> >         > @@ -939,6 +939,11 @@ NotifyPhase (
> >         >              }
> >         >
> >         >
> >         >
> >         >              ASSERT (Index < TypeMax);
> >         >
> >         > +
> >         >
> >         > +            if (Index == TypeMax) {
> >         >
> >         > +              CpuDeadLoop ();
> >         >
> >         > +            }
> >         >
> >         > +
> >         >
> >         >              ResNodeHandled[Index] = TRUE;
> >         >
> >         >              Alignment             = RootBridge-
> >         > >ResAllocNode[Index].Alignment;
> >         >
> >         >              BitsOfAlignment       = LowBitSet64 (Alignment +
> 1);
> >         >
> >         > --
> >         > 2.34.1
> >         >
> >         >
> >         >
> >         > -=-=-=-=-=-=
> >         > Groups.io Links: You receive all messages sent to this group.
> >         > View/Reply Online (#110993):
> >         > https://edk2.groups.io/g/devel/message/110993
> >         <https://edk2.groups.io/g/devel/message/110993>
> >         > Mute This Topic: https://groups.io/mt/102490513/1643496
> >         <https://groups.io/mt/102490513/1643496>
> >         > Group Owner: devel+owner@edk2.groups.io
> >         <mailto:devel%2Bowner@edk2.groups.io>
> >         > Unsubscribe: https://edk2.groups.io/g/devel/unsub
> >         <https://edk2.groups.io/g/devel/unsub>
> >         > [michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com
> >]
> >         > -=-=-=-=-=-=
> >         >
> >
> > 
>
>


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#111208): https://edk2.groups.io/g/devel/message/111208
Mute This Topic: https://groups.io/mt/102490513/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-