public inbox for devel@edk2.groups.io
From: "Laszlo Ersek" <lersek@redhat.com>
To: Ard Biesheuvel <ardb@kernel.org>, Michael Brown <mcb30@ipxe.org>
Cc: devel@edk2.groups.io, kraxel@redhat.com,
	Oliver Steffen <osteffen@redhat.com>,
	Pawel Polawski <ppolawsk@redhat.com>,
	Jiewen Yao <jiewen.yao@intel.com>,
	Ard Biesheuvel <ardb+tianocore@kernel.org>,
	Jordan Justen <jordan.l.justen@intel.com>
Subject: Re: [edk2-devel] [PATCH v2 1/1] OvmfPkg/NestedInterruptTplLib: replace ASSERT() with a warning logged.
Date: Mon, 8 May 2023 08:45:52 +0200
Message-ID: <0b5118c3-35b6-28f6-87e1-bcba6d445c82@redhat.com>
In-Reply-To: <CAMj1kXEw1cSApkTzFZXFdCdO0T8Pg3ujQWi6skWfaLuUGiBB+Q@mail.gmail.com>

On 5/6/23 01:57, Ard Biesheuvel wrote:
> On Sat, 6 May 2023 at 01:27, Michael Brown <mcb30@ipxe.org> wrote:
>>
>> On 05/05/2023 19:56, Laszlo Ersek wrote:
>>> I don't like the patch. For two reasons:
>>>
>>> (1) It papers over the actual issue. The problem should be fixed where
>>> it is, if possible.
>>
>> Agreed, but (as you have shown in
>> https://bugzilla.redhat.com/show_bug.cgi?id=2189136) the bug lies in
>> Windows code rather than in EDK2 code.  If the goal is to allow these
>> buggy Windows builds to still be used with OVMF, then the only option is
>> to paper over the issue.  We should do this only if it can be proven
>> safe to do so, of course.
>>
>>> (2) With the patch applied, NestedInterruptRaiseTPL() can return
>>> TPL_HIGH_LEVEL (as "InterruptedTPL"). Consequently,
>>> TimerInterruptHandler() [OvmfPkg/LocalApicTimerDxe/LocalApicTimerDxe.c]
>>> may pass TPL_HIGH_LEVEL back to NestedInterruptRestoreTPL(), as
>>> "InterruptedTPL".
>>>
>>> I believe that this in turn may invalidate at least one comment in
>>> NestedInterruptRestoreTPL():
>>>
>>>      //
>>>      // Call RestoreTPL() to allow event notifications to be
>>>      // dispatched.  This will implicitly re-enable interrupts.
>>>      //
>>>      gBS->RestoreTPL (InterruptedTPL);
>>>
>>> Restoring TPL_HIGH_LEVEL does not re-enable interrupts -- nominally anyways.
>>
>> I agree that the comment is invalidated, but as far as I can tell the
>> logic remains safe.
>>
>> I will put together a patch to update the comments in
>> NestedInterruptTplLib to address the possibility of an interrupt
>> occurring (illegally) at TPL_HIGH_LEVEL.
>>
>>> (a) Make LocalApicTimerDxe Xen-specific again. It's only the OVMF Xen
>>> platform that really *needs* NestedInterruptTplLib. (Don't get me wrong:
>>> NestedInterruptTplLib is technically correct in all circumstances, but
>>> in practice it happens to be too strict.)
>>>
>>> (b) For the non-Xen OVMF platforms, re-create a LocalApicTimerDxe
>>> variant that effectively has commits a086f4a63bc0 and a24fbd606125
>>> reverted. (We should keep 9bf473da4c1d.) This returns us to
>>> pre-239b50a86370 status -- that is, a timer interrupt handler that (a)
>>> does not try to be smart about nested interrupts, therefore one that is
>>> much simpler, and (b) is more tolerant of the Windows / cdboot.efi spec
>>> violation, (c) is vulnerable to the timer interrupt storm seen on Xen,
>>> but will never run on Xen. (Only the OVMF Xen platform is supposed to be
>>> launched on Xen.)
>>
>> I'm less keen on this because it reduces the runtime exposure of a very
>> complex piece of code, and will effectively cause that code to become
>> unmaintained.
>>
>> It's also satisfying (to me) that NestedInterruptTplLib provides a
>> provable upper bound on stack consumption due to interrupts, which can't
>> be guaranteed by the simpler pre-239b50a86370 scheme.
>>
>> Could we defer judgement until after I've fully reasoned through (and
>> documented) how NestedInterruptTplLib will work in the presence of
>> interrupts occurring at TPL_HIGH_LEVEL?
>>
> 
> Would it be feasible for our firmware implementation to disable the
> timer interrupt at the timer end as well?
> 
> E.g.,
> 
> RaiseTPL(HIGH)::
> 
> CLI
> disarm timer
> 
> 
> RestoreTPL::
> 
> <complain if HIGH and interrupts enabled at CPU side>
> re-arm timer
> STI
> 
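
For concreteness, the suggestion quoted above could be modeled roughly as
follows. This is only a self-contained simulation, not EDK2 code: SIM_STATE
and every Sim*() function are hypothetical names made up for illustration,
and the choice to re-arm the timer only when dropping below TPL_HIGH_LEVEL
is my assumption, not something the pseudocode spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* TPL values as defined by the UEFI specification. */
#define TPL_APPLICATION  4
#define TPL_CALLBACK     8
#define TPL_NOTIFY       16
#define TPL_HIGH_LEVEL   31

/* Hypothetical simulated machine state; nothing here exists in EDK2. */
typedef struct {
  int   CurrentTpl;
  bool  CpuInterruptsEnabled;  /* models the CPU interrupt flag (CLI/STI) */
  bool  TimerArmed;            /* models an enable bit at the timer end   */
  int   Warnings;              /* number of complaints logged             */
} SIM_STATE;

/* RaiseTPL(HIGH): CLI, then disarm the timer. Returns the previous TPL. */
int SimRaiseTpl (SIM_STATE *Sim, int NewTpl)
{
  int OldTpl = Sim->CurrentTpl;

  assert (NewTpl >= OldTpl);
  if (NewTpl == TPL_HIGH_LEVEL) {
    Sim->CpuInterruptsEnabled = false;  /* CLI */
    Sim->TimerArmed           = false;  /* disarm timer */
  }
  Sim->CurrentTpl = NewTpl;
  return OldTpl;
}

/* RestoreTPL: complain if at HIGH with interrupts enabled, re-arm, STI. */
void SimRestoreTpl (SIM_STATE *Sim, int OldTpl)
{
  if (Sim->CurrentTpl == TPL_HIGH_LEVEL && Sim->CpuInterruptsEnabled) {
    Sim->Warnings++;
    fprintf (stderr, "warning: interrupts enabled at TPL_HIGH_LEVEL\n");
  }
  Sim->CurrentTpl = OldTpl;
  if (OldTpl < TPL_HIGH_LEVEL) {        /* assumption: only below HIGH */
    Sim->TimerArmed           = true;   /* re-arm timer */
    Sim->CpuInterruptsEnabled = true;   /* STI */
  }
}

/* A timer interrupt reaches the CPU only if both ends allow it. */
bool SimTimerInterruptDeliverable (const SIM_STATE *Sim)
{
  return Sim->TimerArmed && Sim->CpuInterruptsEnabled;
}

/* Models a spec-violating caller that executes STI at TPL_HIGH_LEVEL. */
void SimBuggyCallerSti (SIM_STATE *Sim)
{
  Sim->CpuInterruptsEnabled = true;
}
```

In this model, a caller that wrongly enables interrupts at TPL_HIGH_LEVEL
still sees no timer interrupt, because the timer itself is disarmed; the
violation is only reported on the next SimRestoreTpl().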

I can be entirely wrong here, but:

- we looked for a solution (or workaround) to the original problem that
stays within the boundaries of OvmfPkg, so sinking tweaks into the core
TPL manipulation functions isn't ideal

- regarding the TimerInterruptHandler() function(s) that do live in
OvmfPkg, there had been tweaks to where end-of-interrupt is signaled
(which I understand as roughly equivalent to your suggestion: until you
signal EOI, no more interrupts will be *generated*), but those had not
helped. Either the EOI came too early, and we got unbounded nesting, or
it came too late, and no interrupts were generated while (for example)
TPL_CALLBACK code depended on timers via CheckEvent(). See bug 4162 --
that was what prompted Michael to revert the EOI placement tweak and to
implement NestedInterruptTplLib.
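
To illustrate the EOI placement trade-off described in the second point,
here is a toy model. It is a sketch under strong assumptions and not the
actual OvmfPkg handler: EOI_SIM, SimHandler(), SimCallback() and SimRun()
are hypothetical names, and the "timer" is just a counter of ticks that
fire whenever no interrupt is in service.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { EOI_EARLY, EOI_LATE } EOI_PLACEMENT;

/* Hypothetical simulation state; nothing here exists in EDK2. */
typedef struct {
  EOI_PLACEMENT  Placement;
  bool           InService;            /* set from delivery until EOI      */
  int            PendingTicks;         /* ticks the timer still wants      */
  int            Depth;                /* current handler nesting depth    */
  int            MaxDepth;             /* deepest nesting observed         */
  int            TicksSeenByCallback;  /* ticks that fired during callback */
} EOI_SIM;

static void SimHandler (EOI_SIM *Sim);

/*
 * Stand-in for TPL_CALLBACK code dispatched from RestoreTPL() with CPU
 * interrupts enabled. A new tick is generated only if no interrupt is
 * awaiting EOI.
 */
static void SimCallback (EOI_SIM *Sim)
{
  if (Sim->PendingTicks > 0 && !Sim->InService) {
    Sim->TicksSeenByCallback++;
    SimHandler (Sim);  /* nested timer interrupt */
  }
}

static void SimHandler (EOI_SIM *Sim)
{
  assert (Sim->PendingTicks > 0);
  Sim->PendingTicks--;
  Sim->InService = true;
  Sim->Depth++;
  if (Sim->Depth > Sim->MaxDepth) {
    Sim->MaxDepth = Sim->Depth;
  }

  if (Sim->Placement == EOI_EARLY) {
    Sim->InService = false;  /* EOI before dispatching callbacks */
  }
  SimCallback (Sim);
  if (Sim->Placement == EOI_LATE) {
    Sim->InService = false;  /* EOI only after callbacks return */
  }
  Sim->Depth--;
}

/* Deliver the first of Ticks (> 0) timer interrupts and report the result. */
EOI_SIM SimRun (EOI_PLACEMENT Placement, int Ticks)
{
  EOI_SIM Sim = { Placement, false, Ticks, 0, 0, 0 };

  SimHandler (&Sim);
  return Sim;
}
```

With EOI_EARLY the nesting depth grows with the number of pending ticks
(unbounded in the limit of an interrupt storm); with EOI_LATE the depth
stays at one, but the callback never observes a tick.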

Apologies if there are further interpretations of disarming the timer
that I'm missing!

Laszlo


