From: "Michael Brown" <mcb30@ipxe.org>
To: devel@edk2.groups.io, michael.d.kinney@intel.com,
	"Ni, Ray" <ray.ni@intel.com>
Cc: Liming Gao <gaoliming@byosoft.com.cn>,
	Laszlo Ersek <lersek@redhat.com>,
	 Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [edk2-devel] [PATCH 2/2] MdeModulePkg/DxeCore: Fix stack overflow issue due to nested interrupts
Date: Thu, 29 Feb 2024 19:41:10 +0000
Message-ID: <0102018df6628fb4-14e85f28-9225-4d90-a6fd-f6c33530d484-000000@eu-west-1.amazonses.com>
In-Reply-To: <CO1PR11MB49295527FF063E11A75E8F10D25F2@CO1PR11MB4929.namprd11.prod.outlook.com>

On 29/02/2024 19:09, Michael D Kinney wrote:
>> "When the DXE Foundation is notified that the EFI_CPU_ARCH_PROTOCOL has
>> been installed, then the full version of the Boot Service RestoreTPL()
>> can be made available.  When an attempt is made to restore the TPL
>> level to level below EFI_TPL_HIGH_LEVEL, then the DXE Foundation
>> should use
>> the services of the EFI_CPU_ARCH_PROTOCOL to enable interrupts."
> 
> I would claim that this spec is perhaps incomplete in this area and
> that that incomplete description is what allows the window for interrupt
> nesting to occur.  This language is correct for UEFI code that calls
> Raise/Restore TPL once the CPU Arch Protocol is available.  It does
> not cover the required behavior to prevent nesting when processing
> a timer interrupt.  This could be considered a gap in the UEFI/PI
> spec content.

I think it's important that we don't phrase it as preventing interrupt 
nesting.  The UEFI design *requires* that nested interrupts be allowed 
to happen, since callbacks at TPL_CALLBACK are allowed to wait for 
events at TPL_NOTIFY, and this can't happen without the existence of 
nested interrupts.

The problem is not nested interrupts per se: the problem is the 
potential for unlimited stack consumption.

>> - How does the proposed patch react to an interrupt occurring
>> (illegally) at TPL_HIGH_LEVEL (as happens with some versions of
>> Windows)?  As far as I can tell, it will result in mInterruptedTplMask
>> having bit 31 set but never cleared.  What impact will this have?
> 
> This behavior could potentially break any UEFI code that sets TPL to
> TPL_HIGH_LEVEL as a lock, which can then cause any number of
> undefined behaviors.  I am curious if you have a way to reproduce
> this failure for testing purposes.
> 
> I would agree that any proposed change needs to comprehend this
> scenario if it can be reproduced with shipping OS images.

https://bugzilla.redhat.com/show_bug.cgi?id=2189136 was the original bug 
report in which it was discovered that Windows 11 would call 
RaiseTPL(TPL_HIGH_LEVEL) and then enable interrupts using the STI 
instruction.

It would be interesting to hear from anyone at Microsoft as to why this 
happens!

>> - How does the proposed patch react to potentially mismatched
>> RaiseTPL()/RestoreTPL() calls (e.g. oldTpl = RaiseTPL(TPL_CALLBACK)
>> followed by RaiseTPL(TPL_NOTIFY) followed by a single
>> RestoreTPL(oldTpl))?
> 
> The proposed patch only changes behavior when processing a timer
> interrupt.  I do not think there would be any changes in behavior
> for UEFI code that makes that sequence of calls.

The patch affects all callers of RaiseTPL() and RestoreTPL().  Given 
that it creates a new piece of shared state (mInterruptedTplMask), I'd 
like to see some kind of proof that it can correctly handle an arbitrary 
sequence of calls from unknown third-party code.

For example: consider an interrupt at TPL_APPLICATION with a third-party 
timer interrupt handler that does something like:

   OldTpl = RaiseTPL (TPL_HIGH_LEVEL);

   /* ... send EOI, call timer tick function, etc ... */

   if (OldTpl < TPL_NOTIFY) {
     RestoreTPL (TPL_NOTIFY);
     /* ... do some weird OEM-specific thing ... */
   }

   RestoreTPL (OldTpl);

This is arguably a valid sequence of calls to RaiseTPL()/RestoreTPL(). 
With the patch as-is, mInterruptedTplMask will have flagged the 
TPL_APPLICATION bit but not the TPL_NOTIFY bit, and so the call to 
RestoreTPL(TPL_NOTIFY) *will* re-enable interrupts, which is against the 
intention of the patch.

Thanks,

Michael





