From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail02.groups.io (mail02.groups.io [66.175.222.108]) by spool.mail.gandi.net (Postfix) with ESMTPS id AE7B6D811AC for ; Tue, 23 Jan 2024 16:55:43 +0000 (UTC) DKIM-Signature: a=rsa-sha256; bh=7/tjefVtWIZnO3fWbM3bxKtoteO0w87TZWflU18VJZQ=; c=relaxed/simple; d=groups.io; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From:In-Reply-To:Precedence:List-Subscribe:List-Help:Sender:List-Id:Mailing-List:Delivered-To:Reply-To:List-Unsubscribe-Post:List-Unsubscribe:Content-Language:Content-Type:Content-Transfer-Encoding; s=20140610; t=1706028942; v=1; b=kvAJs2kNoB+AaanOefC0l2kfBZxUfCmFTB4dYidre8Lg5qIRv6oQQv1AcWcyYK6DCIBcNUYy zVGT7WFBTOzAqktsVDgSSmDW6xlrZhIY12pD9LksB1wIcbv/n6DrnJ4CtunPy3oP5PtDbjS3UIE xcRZtDgCD58zKL5wnijbjOfI= X-Received: by 127.0.0.2 with SMTP id UTApYY7687511xxYhL0lL1mg; Tue, 23 Jan 2024 08:55:42 -0800 X-Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mx.groups.io with SMTP id smtpd.web11.18283.1706028941573417274 for ; Tue, 23 Jan 2024 08:55:41 -0800 X-Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-589-sLn4L0nBPVaIs9QRBLKmdg-1; Tue, 23 Jan 2024 11:55:37 -0500 X-MC-Unique: sLn4L0nBPVaIs9QRBLKmdg-1 X-Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3E7A185A589; Tue, 23 Jan 2024 16:55:37 +0000 (UTC) X-Received: from [10.39.194.212] (unknown [10.39.194.212]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 5BE5D3C2E; Tue, 23 Jan 2024 16:55:36 +0000 (UTC) Message-ID: Date: Tue, 23 Jan 2024 17:55:35 +0100 MIME-Version: 1.0 Subject: Re: [edk2-devel] [PATCH v3 4/5] MdeModulePkg: Add self-tests for NestedInterruptTplLib To: devel@edk2.groups.io, mcb30@ipxe.org Cc: Ray Ni , Gerd Hoffmann References: <17ACFF3FDD20CD9A.13754@groups.io> <20240123153104.2451759-1-mcb30@ipxe.org> <0102018d36f28154-cb2f6e0a-93ea-490d-96bd-5c804c984e2a-000000@eu-west-1.amazonses.com> From: "Laszlo Ersek" In-Reply-To: <0102018d36f28154-cb2f6e0a-93ea-490d-96bd-5c804c984e2a-000000@eu-west-1.amazonses.com> X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Precedence: Bulk List-Subscribe: List-Help: Sender: devel@edk2.groups.io List-Id: Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io Reply-To: devel@edk2.groups.io,lersek@redhat.com List-Unsubscribe-Post: List-Unsubscribe=One-Click List-Unsubscribe: X-Gm-Message-State: IQ6Gw9AFUhcMtkGCSMTRNnOvx7686176AA= Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-GND-Status: LEGIT Authentication-Results: spool.mail.gandi.net; dkim=pass header.d=groups.io header.s=20140610 header.b=kvAJs2kN; spf=pass (spool.mail.gandi.net: domain of bounce@groups.io designates 66.175.222.108 as permitted sender) smtp.mailfrom=bounce@groups.io; dmarc=fail reason="SPF not aligned (relaxed), DKIM not aligned (relaxed)" header.from=redhat.com (policy=none) only superficial comments: On 1/23/24 16:31, Michael Brown wrote: > Add the ability to perform self-tests on the first few invocations of > NestedInterruptRestoreTPL(), to verify that: >=20 > - the timer interrupt handler correctly rearms the timer interrupt > before calling NestedInterruptRestoreTPL(), and >=20 > - the architecture-specific DisableInterruptsOnIret() implementation > correctly causes interrupts to be disabled after the IRET or > equivalent instruction. >=20 > Any test failure will be treated as fatal and will halt the system > with an appropriate diagnostic message. >=20 > Each test invocation adds approximately one timer tick of delay to the > overall system startup time. >=20 > Only one test is performed by default (to avoid unnecessary system > startup delay). The number of tests performed can be controlled via > PcdNestedInterruptNumberOfSelfTests at build time. >=20 > Signed-off-by: Michael Brown > --- > MdeModulePkg/MdeModulePkg.dec | 4 + > .../NestedInterruptTplLib.inf | 3 + > .../Include/Library/NestedInterruptTplLib.h | 4 + > .../Library/NestedInterruptTplLib/Tpl.c | 129 ++++++++++++++++++ > 4 files changed, 140 insertions(+) (This is not even a comment, just a hint :) consider passing "--stat=3D1000 --stat-graph-width=3D20" to git, when formatting the patches= . Those options deal well with the extremely long filenames / pathnames in edk2.) >=20 > diff --git a/MdeModulePkg/MdeModulePkg.dec b/MdeModulePkg/MdeModulePkg.de= c > index d6fb729af5a7..efd32c197b18 100644 > --- a/MdeModulePkg/MdeModulePkg.dec > +++ b/MdeModulePkg/MdeModulePkg.dec > @@ -1142,6 +1142,10 @@ [PcdsFixedAtBuild] > # @Prompt Enable large address image loading. > gEfiMdeModulePkgTokenSpaceGuid.PcdImageLargeAddressLoad|TRUE|BOOLEAN|0= x30001059 > =20 > + ## Number of NestedInterruptTplLib self-tests to perform at startup. > + # @Prompt Number of NestedInterruptTplLib self-tests. > + gEfiMdeModulePkgTokenSpaceGuid.PcdNestedInterruptNumberOfSelfTests|1|U= INT32|0x30001060 > + > [PcdsFixedAtBuild, PcdsPatchableInModule] > ## Dynamic type PCD can be registered callback function for Pcd settin= g action. > # PcdMaxPeiPcdCallBackNumberPerPcdEntry indicates the maximum number = of callback function > diff --git a/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTp= lLib.inf b/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib= .inf > index f130d6dcd213..e67d899b9446 100644 > --- a/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib.in= f > +++ b/MdeModulePkg/Library/NestedInterruptTplLib/NestedInterruptTplLib.in= f > @@ -34,3 +34,6 @@ [LibraryClasses] > =20 > [Depex.common.DXE_DRIVER] > TRUE > + > +[Pcd] > + gEfiMdeModulePkgTokenSpaceGuid.PcdNestedInterruptNumberOfSelfTests > diff --git a/MdeModulePkg/Include/Library/NestedInterruptTplLib.h b/MdeMo= dulePkg/Include/Library/NestedInterruptTplLib.h > index 0ead6e4b346a..7dd934577e99 100644 > --- a/MdeModulePkg/Include/Library/NestedInterruptTplLib.h > +++ b/MdeModulePkg/Include/Library/NestedInterruptTplLib.h > @@ -32,6 +32,10 @@ typedef struct { > /// interrupt handler. > /// > BOOLEAN DeferredRestoreTPL; > + /// > + /// Number of self-tests performed. > + /// > + UINTN SelfTestCount; > } NESTED_INTERRUPT_STATE; > =20 > /** I suggest that the new field be UINT32. The (exclusive) limit is a 32-bit PCD. Making the counter (potentially) wider than the limit is not useful, but it's also a bit of a complication for the debug messages (see below). > diff --git a/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c b/MdeModule= Pkg/Library/NestedInterruptTplLib/Tpl.c > index 99af553ab189..dfe22331204f 100644 > --- a/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c > +++ b/MdeModulePkg/Library/NestedInterruptTplLib/Tpl.c > @@ -17,6 +17,18 @@ > =20 > #include "Iret.h" > =20 > +// > +// Number of self-tests to perform. > +// > +#define NUMBER_OF_SELF_TESTS \ > + (FixedPcdGet32 (PcdNestedInterruptNumberOfSelfTests)) > + > +STATIC > +VOID > +NestedInterruptSelfTest ( > + IN NESTED_INTERRUPT_STATE *IsrState > + ); > + > /** > Raise the task priority level to TPL_HIGH_LEVEL. > =20 > @@ -211,6 +223,16 @@ NestedInterruptRestoreTPL ( > // > DisableInterrupts (); > =20 > + /// > + /// Perform a limited number of self-tests on the first few > + /// invocations. > + /// > + if ((IsrState->DeferredRestoreTPL =3D=3D FALSE) && This comment applies to several locations in the patch: BOOLEANs should not be checked using explicit "=3D=3D TRUE" and "=3D=3D FAL= SE" operators / comparisons; they should only be evaluated in their logical contexts: (Foo) (!Bar) etc > + (IsrState->SelfTestCount < NUMBER_OF_SELF_TESTS)) { > + IsrState->SelfTestCount++; > + NestedInterruptSelfTest (IsrState); > + } > + > // > // DEFERRAL RETURN POINT > // > @@ -248,3 +270,110 @@ NestedInterruptRestoreTPL ( > } > } > } > + > +/** > + Perform internal self-test. > + > + Induce a delay to force a nested timer interrupt to take place, and > + verify that the nested interrupt behaves as required. > + > + @param IsrState A pointer to the state shared between all > + invocations of the nested interrupt handl= er. > +**/ > +VOID > +NestedInterruptSelfTest ( > + IN NESTED_INTERRUPT_STATE *IsrState > + ) > +{ > + UINTN SelfTestCount; > + UINTN TimeOut; Did this pass the uncrustify check? I think uncrustify would insist on inserting two spaces here. ( For running uncrustify locally: - clone - check it out at tag 73.0.8 (the tag that edk2 CI uses on github is in ".pytool/Plugin/UncrustifyCheck/uncrustify_ext_dep.yaml") - build it (IIRC it uses cmake) - with nothing dirty in the working tree (i.e., everything committed, or at least stashed to the index), run uncrustify \ -c .pytool/Plugin/UncrustifyCheck/uncrustify.cfg \ --replace \ --no-backup \ --if-changed \ -F file-list.txt ) > + > + // > + // Record number of this self-test for debug messages. > + // > + SelfTestCount =3D IsrState->SelfTestCount; > + > + // > + // Re-enable interrupts and stall for up to one second to induce at > + // least one more timer interrupt. > + // > + // This mimics the effect of an interrupt having occurred precisely > + // at the end of our call to RestoreTPL(), with interrupts having > + // been re-enabled by RestoreTPL() and with the interrupt happening > + // to occur after the TPL has already been lowered back down to > + // InterruptedTPL. (This is the scenario that can lead to stack > + // exhaustion, as described above.) > + // > + ASSERT (GetInterruptState () =3D=3D FALSE); > + ASSERT (IsrState->DeferredRestoreTPL =3D=3D FALSE); > + EnableInterrupts(); I think uncrustify would want to insert a space here too > + for (TimeOut =3D 0; TimeOut < 1000; TimeOut++) { > + // > + // Stall for 1ms > + // > + gBS->Stall (1000); > + > + // > + // If we observe that interrupts have been spontaneously disabled, > + // then this must be because the induced interrupt handler's call > + // to NestedInterruptRestoreTPL() correctly chose to defer the > + // RestoreTPL() call to the outer handler (i.e. to us). > + // > + if (GetInterruptState() =3D=3D FALSE) { > + ASSERT (IsrState->DeferredRestoreTPL =3D=3D TRUE); > + DEBUG (( > + DEBUG_INFO, > + "Nested interrupt self-test %u/%u passed\n", > + SelfTestCount, > + NUMBER_OF_SELF_TESTS > + )); So SelfTestCount may be a UINT64. The central PrintLib instance does not support a conversion specifier for UINTN, only for UINT32 and UINT64 explicitly. Therefore the best way to print a UINTN is to cast the value to fixed UINT64, and then print that with %Lu. However, in this case, we can work in the opposite direction: change the type of SelfTestCount to UINT32 (and then %u will be fine). Which is what I propose at the top. %u is already fine for NUMBER_OF_SELF_TESTS (which is a FixedPcdGet32). > + return; > + } > + } > + > + // > + // The test has failed and we will halt the system. Disable > + // interrupts now so that any test-induced interrupt storm does not > + // prevent the fatal error messages from being displayed correctly. > + // > + DisableInterrupts(); > + > + // > + // If we observe that DeferredRestoreTPL is TRUE then this indicates > + // that an interrupt did occur and NestedInterruptRestoreTPL() chose > + // to defer the RestoreTPL() call to the outer handler, but that > + // DisableInterruptsOnIret() failed to cause interrupts to be > + // disabled after the IRET or equivalent instruction. > + // > + // This error represents a bug in the architecture-specific > + // implementation of DisableInterruptsOnIret(). > + // > + if (IsrState->DeferredRestoreTPL =3D=3D TRUE) { > + DEBUG (( > + DEBUG_ERROR, > + "Nested interrupt self-test %u/%u failed: interrupts still enabled= \n", > + SelfTestCount, > + NUMBER_OF_SELF_TESTS > + )); > + ASSERT (FALSE); > + } > + > + // > + // If no timer interrupt occurred then this indicates that the timer > + // interrupt handler failed to rearm the timer before calling > + // NestedInterruptRestoreTPL(). This would prevent nested > + // interrupts from occurring at all, which would break > + // e.g. callbacks at TPL_CALLBACK that themselves wait on further > + // timer events. > + // > + // This error represents a bug in the platform-specific timer > + // interrupt handler. > + // > + DEBUG (( > + DEBUG_ERROR, > + "Nested interrupt self-test %u/%u failed: no nested interrupt\n", > + SelfTestCount, > + NUMBER_OF_SELF_TESTS > + )); > + ASSERT (FALSE); > +} I'd prefer something stronger than just ASSERT (FALSE) here, but -- per previous discussion -- we don't have a generally accepted "panic" API yet, and CpuDeadLoop() is not suitable for all platforms, so this should do= . With my trivial comments addressed: Acked-by: Laszlo Ersek Comment on the general idea: I much like that the self-test is active on every boot (without high costs). Side idea: technically we could merge the first two patches in separation (pending MdeModulePkg maintainer approval), and then consider the last three patches as new improvements (possibly needing longer review). This kind of splitting has both advantages and disadvantages; the advantage is that the code movement / upstreaming to MdeModulePkg is not blocked by (somewhat) unrelated discussion. The disadvantages are that more admin work is needed (more posting, and more PRs), and that patches in the series that one might consider to belong together will fly apart in the git history. So I just figured I'd raise the option. Laszlo -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#114219): https://edk2.groups.io/g/devel/message/114219 Mute This Topic: https://groups.io/mt/103911608/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/19134562= 12/xyzzy [rebecca@openfw.io] -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-