public inbox for devel@edk2.groups.io
From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, mcb30@ipxe.org
Cc: Ray Ni <ray.ni@intel.com>, Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [edk2-devel] [PATCH v3 4/5] MdeModulePkg: Add self-tests for NestedInterruptTplLib
Date: Tue, 23 Jan 2024 17:55:35 +0100
Message-ID: <eed3031c-485b-3d17-7442-0ad640868da2@redhat.com>
In-Reply-To: <0102018d36f28154-cb2f6e0a-93ea-490d-96bd-5c804c984e2a-000000@eu-west-1.amazonses.com>

only superficial comments:

On 1/23/24 16:31, Michael Brown wrote:
> Add the ability to perform self-tests on the first few invocations of
> NestedInterruptRestoreTPL(), to verify that:
> 
> - the timer interrupt handler correctly rearms the timer interrupt
>   before calling NestedInterruptRestoreTPL(), and
> 
> - the architecture-specific DisableInterruptsOnIret() implementation
>   correctly causes interrupts to be disabled after the IRET or
>   equivalent instruction.
> 
> Any test failure will be treated as fatal and will halt the system
> with an appropriate diagnostic message.
> 
> Each test invocation adds approximately one timer tick of delay to the
> overall system startup time.
> 
> Only one test is performed by default (to avoid unnecessary system
> startup delay).  The number of tests performed can be controlled via
> PcdNestedInterruptNumberOfSelfTests at build time.
> 
> Signed-off-by: Michael Brown <mcb30@ipxe.org>
> ---
>  MdeModulePkg/MdeModulePkg.dec                 |   4 +
>  .../NestedInterruptTplLib.inf                 |   3 +
>  .../Include/Library/NestedInterruptTplLib.h   |   4 +
>  .../Library/NestedInterruptTplLib/Tpl.c       | 129 ++++++++++++++++++
>  4 files changed, 140 insertions(+)

(This is not even a comment, just a hint :) consider passing
"--stat=1000 --stat-graph-width=20" to git, when formatting the patches.
Those options deal well with the extremely long filenames / pathnames in
edk2.)

> 
> diff --git a/MdeModulePkg/MdeModulePkg.dec b/MdeModulePkg/MdeModulePkg.dec
> index d6fb729af5a7..efd32c197b18 100644
> --- a/MdeModulePkg/MdeModulePkg.dec
> +++ b/MdeModulePkg/MdeModulePkg.dec
> @@ -1142,6 +1142,10 @@ [PcdsFixedAtBuild]
>    # @Prompt Enable large address image loading.
>    gEfiMdeModulePkgTokenSpaceGuid.PcdImageLargeAddressLoad|TRUE|BOOLEAN|0x30001059
>  
> +  ## Number of NestedInterruptTplLib self-tests to perform at startup.
> +  # @Prompt Number of NestedInterruptTplLib self-tests.
> +  gEfiMdeModulePkgTokenSpaceGuid.PcdNestedInterruptNumberOfSelfTests|1|UINT32|0x30001060
> +
>  [PcdsFixedAtBuild, PcdsPatchableInModule]
>    ## Dynamic type PCD can be registered callback function for Pcd setting action.
>    #  PcdMaxPeiPcdCallBackNumberPerPcdEntry indicates the maximum number of callback function
> diff --git a/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib.inf b/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib.inf
> index f130d6dcd213..e67d899b9446 100644
> --- a/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib.inf
> +++ b/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib.inf
> @@ -34,3 +34,6 @@ [LibraryClasses]
>  
>  [Depex.common.DXE_DRIVER]
>    TRUE
> +
> +[Pcd]
> +  gEfiMdeModulePkgTokenSpaceGuid.PcdNestedInterruptNumberOfSelfTests
> diff --git a/MdeModulePkg/Include/Library/NestedInterruptTplLib.h b/MdeModulePkg/Include/Library/NestedInterruptTplLib.h
> index 0ead6e4b346a..7dd934577e99 100644
> --- a/MdeModulePkg/Include/Library/NestedInterruptTplLib.h
> +++ b/MdeModulePkg/Include/Library/NestedInterruptTplLib.h
> @@ -32,6 +32,10 @@ typedef struct {
>    /// interrupt handler.
>    ///
>    BOOLEAN    DeferredRestoreTPL;
> +  ///
> +  /// Number of self-tests performed.
> +  ///
> +  UINTN      SelfTestCount;
>  } NESTED_INTERRUPT_STATE;
>  
>  /**

I suggest that the new field be UINT32. The (exclusive) limit is a
32-bit PCD. Making the counter (potentially) wider than the limit is not
useful, but it's also a bit of a complication for the debug messages
(see below).

> diff --git a/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c b/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c
> index 99af553ab189..dfe22331204f 100644
> --- a/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c
> +++ b/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c
> @@ -17,6 +17,18 @@
>  
>  #include "Iret.h"
>  
> +//
> +// Number of self-tests to perform.
> +//
> +#define NUMBER_OF_SELF_TESTS \
> +  (FixedPcdGet32 (PcdNestedInterruptNumberOfSelfTests))
> +
> +STATIC
> +VOID
> +NestedInterruptSelfTest (
> +  IN NESTED_INTERRUPT_STATE  *IsrState
> +  );
> +
>  /**
>    Raise the task priority level to TPL_HIGH_LEVEL.
>  
> @@ -211,6 +223,16 @@ NestedInterruptRestoreTPL (
>      //
>      DisableInterrupts ();
>  
> +    ///
> +    /// Perform a limited number of self-tests on the first few
> +    /// invocations.
> +    ///
> +    if ((IsrState->DeferredRestoreTPL == FALSE) &&

This comment applies to several locations in the patch:

BOOLEANs should not be checked using explicit "== TRUE" and "== FALSE"
operators / comparisons; they should only be evaluated in their logical
contexts:

  (Foo)
  (!Bar)

etc.
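
A minimal sketch of the condition rewritten in that style. It is a
standalone approximation, not the patch itself: the typedefs are stand-ins
for the edk2 base types, and SKETCH_ISR_STATE and ShouldRunSelfTest() are
hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the edk2 base types, so the sketch builds outside edk2. */
typedef unsigned char BOOLEAN;
typedef uint32_t      UINT32;
#define TRUE   ((BOOLEAN)1)
#define FALSE  ((BOOLEAN)0)

/* Hypothetical, trimmed-down stand-in for NESTED_INTERRUPT_STATE. */
typedef struct {
  BOOLEAN  DeferredRestoreTPL;
  UINT32   SelfTestCount;
} SKETCH_ISR_STATE;

/* Returns nonzero if another self-test should be performed. */
static int
ShouldRunSelfTest (const SKETCH_ISR_STATE *IsrState, UINT32 NumberOfSelfTests)
{
  assert (IsrState != NULL);

  /* Instead of "(IsrState->DeferredRestoreTPL == FALSE) && ...": */
  return !IsrState->DeferredRestoreTPL &&
         (IsrState->SelfTestCount < NumberOfSelfTests);
}
```

The same pattern would apply to the ASSERT()s further down, e.g.
"ASSERT (!GetInterruptState ())" and "ASSERT (IsrState->DeferredRestoreTPL)".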

> +	(IsrState->SelfTestCount < NUMBER_OF_SELF_TESTS)) {
> +      IsrState->SelfTestCount++;
> +      NestedInterruptSelfTest (IsrState);
> +    }
> +
>      //
>      // DEFERRAL RETURN POINT
>      //
> @@ -248,3 +270,110 @@ NestedInterruptRestoreTPL (
>      }
>    }
>  }
> +
> +/**
> +  Perform internal self-test.
> +
> +  Induce a delay to force a nested timer interrupt to take place, and
> +  verify that the nested interrupt behaves as required.
> +
> +  @param IsrState              A pointer to the state shared between all
> +                               invocations of the nested interrupt handler.
> +**/
> +VOID
> +NestedInterruptSelfTest (
> +  IN NESTED_INTERRUPT_STATE  *IsrState
> +  )
> +{
> +  UINTN SelfTestCount;
> +  UINTN TimeOut;

Did this pass the uncrustify check? I think uncrustify would insist on
inserting two spaces here.

(

For running uncrustify locally:

- clone
<https://projectmu@dev.azure.com/projectmu/Uncrustify/_git/Uncrustify>

- check it out at tag 73.0.8 (the tag that edk2 CI uses on github is in
".pytool/Plugin/UncrustifyCheck/uncrustify_ext_dep.yaml")

- build it (IIRC it uses cmake)

- with nothing dirty in the working tree (i.e., everything committed, or
at least staged in the index), run

  uncrustify \
    -c .pytool/Plugin/UncrustifyCheck/uncrustify.cfg \
    --replace \
    --no-backup \
    --if-changed \
    -F file-list.txt

)

> +
> +  //
> +  // Record number of this self-test for debug messages.
> +  //
> +  SelfTestCount = IsrState->SelfTestCount;
> +
> +  //
> +  // Re-enable interrupts and stall for up to one second to induce at
> +  // least one more timer interrupt.
> +  //
> +  // This mimics the effect of an interrupt having occurred precisely
> +  // at the end of our call to RestoreTPL(), with interrupts having
> +  // been re-enabled by RestoreTPL() and with the interrupt happening
> +  // to occur after the TPL has already been lowered back down to
> +  // InterruptedTPL.  (This is the scenario that can lead to stack
> +  // exhaustion, as described above.)
> +  //
> +  ASSERT (GetInterruptState () == FALSE);
> +  ASSERT (IsrState->DeferredRestoreTPL == FALSE);
> +  EnableInterrupts();

I think uncrustify would want to insert a space here too.

> +  for (TimeOut = 0; TimeOut < 1000; TimeOut++) {
> +    //
> +    // Stall for 1ms
> +    //
> +    gBS->Stall (1000);
> +
> +    //
> +    // If we observe that interrupts have been spontaneously disabled,
> +    // then this must be because the induced interrupt handler's call
> +    // to NestedInterruptRestoreTPL() correctly chose to defer the
> +    // RestoreTPL() call to the outer handler (i.e. to us).
> +    //
> +    if (GetInterruptState() == FALSE) {
> +      ASSERT (IsrState->DeferredRestoreTPL == TRUE);
> +      DEBUG ((
> +        DEBUG_INFO,
> +        "Nested interrupt self-test %u/%u passed\n",
> +        SelfTestCount,
> +        NUMBER_OF_SELF_TESTS
> +	));

So SelfTestCount may be a UINT64. The central PrintLib instance does not
support a conversion specifier for UINTN, only for UINT32 and UINT64
explicitly. Therefore the best way to print a UINTN is to cast the value
to fixed UINT64, and then print that with %Lu. However, in this case, we
can work in the opposite direction: change the type of SelfTestCount to
UINT32 (and then %u will be fine). Which is what I propose at the top.

%u is already fine for NUMBER_OF_SELF_TESTS (which is a FixedPcdGet32).
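
The two options can be sketched in standalone C as follows. This is only an
analogy under stated assumptions: it uses snprintf() and the <inttypes.h>
macros in place of the edk2 PrintLib specifiers (%Lu and %u), size_t as a
stand-in for UINTN, and hypothetical helper names FormatWide() and
FormatNarrow().

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the edk2 base types, so the sketch builds outside edk2. */
typedef size_t    UINTN;   /* native-width unsigned integer */
typedef uint32_t  UINT32;
typedef uint64_t  UINT64;

/* Option 1: the counter stays UINTN and is widened to a fixed UINT64
   at the print site (in edk2 this would pair with %Lu). */
static int
FormatWide (char *Buf, size_t Size, UINTN Count, UINT32 Limit)
{
  assert (Buf != NULL);
  return snprintf (Buf, Size, "self-test %" PRIu64 "/%" PRIu32 " passed",
                   (UINT64)Count, Limit);
}

/* Option 2: the counter is UINT32 to begin with, so no cast is needed
   (in edk2 this would pair with %u). */
static int
FormatNarrow (char *Buf, size_t Size, UINT32 Count, UINT32 Limit)
{
  assert (Buf != NULL);
  return snprintf (Buf, Size, "self-test %" PRIu32 "/%" PRIu32 " passed",
                   Count, Limit);
}
```

Option 2 is the one proposed here, since the limit is a 32-bit PCD anyway.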


> +      return;
> +    }
> +  }
> +
> +  //
> +  // The test has failed and we will halt the system.  Disable
> +  // interrupts now so that any test-induced interrupt storm does not
> +  // prevent the fatal error messages from being displayed correctly.
> +  //
> +  DisableInterrupts();
> +
> +  //
> +  // If we observe that DeferredRestoreTPL is TRUE then this indicates
> +  // that an interrupt did occur and NestedInterruptRestoreTPL() chose
> +  // to defer the RestoreTPL() call to the outer handler, but that
> +  // DisableInterruptsOnIret() failed to cause interrupts to be
> +  // disabled after the IRET or equivalent instruction.
> +  //
> +  // This error represents a bug in the architecture-specific
> +  // implementation of DisableInterruptsOnIret().
> +  //
> +  if (IsrState->DeferredRestoreTPL == TRUE) {
> +    DEBUG ((
> +      DEBUG_ERROR,
> +      "Nested interrupt self-test %u/%u failed: interrupts still enabled\n",
> +      SelfTestCount,
> +      NUMBER_OF_SELF_TESTS
> +      ));
> +    ASSERT (FALSE);
> +  }
> +
> +  //
> +  // If no timer interrupt occurred then this indicates that the timer
> +  // interrupt handler failed to rearm the timer before calling
> +  // NestedInterruptRestoreTPL().  This would prevent nested
> +  // interrupts from occurring at all, which would break
> +  // e.g. callbacks at TPL_CALLBACK that themselves wait on further
> +  // timer events.
> +  //
> +  // This error represents a bug in the platform-specific timer
> +  // interrupt handler.
> +  //
> +  DEBUG ((
> +    DEBUG_ERROR,
> +    "Nested interrupt self-test %u/%u failed: no nested interrupt\n",
> +    SelfTestCount,
> +    NUMBER_OF_SELF_TESTS
> +    ));
> +  ASSERT (FALSE);
> +}

I'd prefer something stronger than just ASSERT (FALSE) here, but -- per
previous discussion -- we don't have a generally accepted "panic" API
yet, and CpuDeadLoop() is not suitable for all platforms, so this should do.

With my trivial comments addressed:

Acked-by: Laszlo Ersek <lersek@redhat.com>

Comment on the general idea: I very much like that the self-test is active
on every boot (without high costs).

Side idea: technically we could merge the first two patches
separately (pending MdeModulePkg maintainer approval), and then consider
the last three patches as new improvements (possibly needing longer
review). This kind of splitting has both advantages and disadvantages;
the advantage is that the code movement / upstreaming to MdeModulePkg is
not blocked by (somewhat) unrelated discussion. The disadvantages are
that more admin work is needed (more posting, and more PRs), and that
patches in the series that one might consider to belong together will
fly apart in the git history. So I just figured I'd raise the option.

Laszlo





