Date: Fri, 5 May 2023 20:56:37 +0200
Subject: Re: [edk2-devel] [PATCH v2 1/1] OvmfPkg/NestedInterruptTplLib: replace ASSERT() with a warning logged.
To: Michael Brown, devel@edk2.groups.io, kraxel@redhat.com
Cc: Oliver Steffen, Pawel Polawski, Jiewen Yao, Ard Biesheuvel, Jordan Justen
References: <20230503071954.266637-1-kraxel@redhat.com> <01020187ec402266-6d4dee99-5a0d-4105-abaf-419c2a5607cc-000000@eu-west-1.amazonses.com>
From: "Laszlo Ersek"
In-Reply-To: <01020187ec402266-6d4dee99-5a0d-4105-abaf-419c2a5607cc-000000@eu-west-1.amazonses.com>

On 5/5/23 16:10, Michael Brown wrote:
> On 03/05/2023 08:19, Gerd Hoffmann wrote:
>> OVMF can't guarantee that the ASSERT() doesn't happen.  Misbehaving
>> EFI applications can trigger this.  So log a warning instead and try
>> to continue.
>>
>> Reproducer: Fetch windows 11 22H2 iso image, boot it in qemu with OVMF.
>>
>> Traced to BootServices->Stall() being called with IPL=TPL_HIGH_LEVEL
>> and Interrupts /enabled/ while windows is booting.
>>
>> Cc: Michael Brown
>> Cc: Laszlo Ersek
>> Signed-off-by: Gerd Hoffmann
>> ---
>>   OvmfPkg/Library/NestedInterruptTplLib/Tpl.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> index e19d98878eb7..fdd7d15c4ba8 100644
>> --- a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> +++ b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> @@ -39,7 +39,9 @@ NestedInterruptRaiseTPL (
>>     //
>>     ASSERT (GetInterruptState () == FALSE);
>>     InterruptedTPL = gBS->RaiseTPL (TPL_HIGH_LEVEL);
>> -  ASSERT (InterruptedTPL < TPL_HIGH_LEVEL);
>> +  if (InterruptedTPL >= TPL_HIGH_LEVEL) {
>> +    DEBUG ((DEBUG_WARN, "%a: Called at IPL %d\n", __func__,
>> InterruptedTPL));
>> +  }
>>       return InterruptedTPL;
>>   }
>
> While https://bugzilla.redhat.com/show_bug.cgi?id=2189136 continues to
> track the underlying Windows bug that leads to this assertion being
> triggered: I suspect that this patch will allow people to boot these
> buggy versions of Windows in OVMF, and I don't think it will make things
> any worse.
>
> I would probably suggest changing DEBUG_WARN to DEBUG_ERROR since this
> represents a serious invariant violation being detected.  With that change:
>
>   Reviewed-by: Michael Brown

I don't like the patch. For two reasons:

(1) It papers over the actual issue. The problem should be fixed where
it is, if possible.

(2) With the patch applied, NestedInterruptRaiseTPL() can return
TPL_HIGH_LEVEL (as "InterruptedTPL").

Consequently, TimerInterruptHandler()
[OvmfPkg/LocalApicTimerDxe/LocalApicTimerDxe.c] may pass TPL_HIGH_LEVEL
back to NestedInterruptRestoreTPL(), as "InterruptedTPL".

I believe that this in turn may invalidate at least one comment in
NestedInterruptRestoreTPL():

  //
  // Call RestoreTPL() to allow event notifications to be
  // dispatched. This will implicitly re-enable interrupts.
  //
  gBS->RestoreTPL (InterruptedTPL);

Restoring TPL_HIGH_LEVEL does not re-enable interrupts -- nominally
anyways.

I wouldn't like OVMF to stick with yet another workaround / yet more
internal inconsistency. We should just wait until fixed Windows
installer media gets released.

Here's an alternative:

(a) Make LocalApicTimerDxe Xen-specific again. It's only the OVMF Xen
platform that really *needs* NestedInterruptTplLib. (Don't get me wrong:
NestedInterruptTplLib is technically correct in all circumstances, but
in practice it happens to be too strict.)

(b) For the non-Xen OVMF platforms, re-create a LocalApicTimerDxe
variant that effectively has commits a086f4a63bc0 and a24fbd606125
reverted. (We should keep 9bf473da4c1d.)

This returns us to pre-239b50a86370 status -- that is, a timer interrupt
handler that

(a) does not try to be smart about nested interrupts, therefore one that
is much simpler,

(b) is more tolerant of the Windows / cdboot.efi spec violation, and

(c) is vulnerable to the timer interrupt storm seen on Xen, but will
never run on Xen. (Only the OVMF Xen platform is supposed to be launched
on Xen.)

Laszlo
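
To illustrate point (2) above, here is a minimal stand-alone C model. It
is not edk2 code: MOCK_TPL, MockRaiseTPL(), MockRestoreTPL() and
ModelHandler() are hypothetical names, and the rule "interrupts are only
re-enabled when the restored TPL is below TPL_HIGH_LEVEL" is a
simplifying assumption about the nominal RestoreTPL() behavior. Only the
numeric TPL values are taken from the UEFI specification.

```c
#include <assert.h>
#include <stdbool.h>

/* TPL values as defined by the UEFI specification. */
#define TPL_APPLICATION  4
#define TPL_HIGH_LEVEL   31

typedef unsigned int MOCK_TPL;  /* hypothetical stand-in for EFI_TPL */

static MOCK_TPL mCurrentTpl        = TPL_APPLICATION;
static bool     mInterruptsEnabled = true;

/* Simplified model of gBS->RaiseTPL(): returns the previous TPL and
   disables interrupts once TPL_HIGH_LEVEL is reached. */
static MOCK_TPL
MockRaiseTPL (MOCK_TPL NewTpl)
{
  MOCK_TPL OldTpl = mCurrentTpl;

  assert (NewTpl >= OldTpl);
  mCurrentTpl = NewTpl;
  if (NewTpl >= TPL_HIGH_LEVEL) {
    mInterruptsEnabled = false;
  }
  return OldTpl;
}

/* Simplified model of gBS->RestoreTPL(): interrupts come back on only
   if the restored TPL is below TPL_HIGH_LEVEL (assumption). */
static void
MockRestoreTPL (MOCK_TPL OldTpl)
{
  assert (OldTpl <= mCurrentTpl);
  mCurrentTpl = OldTpl;
  if (OldTpl < TPL_HIGH_LEVEL) {
    mInterruptsEnabled = true;
  }
}

/* Model of a timer interrupt handler entered at TplAtInterrupt with
   interrupts disabled by the CPU. Returns whether interrupts are
   enabled after the TPL has been restored. */
static bool
ModelHandler (MOCK_TPL TplAtInterrupt)
{
  MOCK_TPL InterruptedTPL;

  mCurrentTpl        = TplAtInterrupt;
  mInterruptsEnabled = false;

  InterruptedTPL = MockRaiseTPL (TPL_HIGH_LEVEL);
  /* With the ASSERT() removed, InterruptedTPL may be TPL_HIGH_LEVEL. */
  MockRestoreTPL (InterruptedTPL);
  return mInterruptsEnabled;
}
```

In this model, ModelHandler (TPL_APPLICATION) leaves interrupts enabled
after the restore, while ModelHandler (TPL_HIGH_LEVEL) leaves them
disabled, which is the case where the "implicitly re-enable interrupts"
comment would no longer hold.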