public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Ni, Ray" <ray.ni@intel.com>
To: devel@edk2.groups.io
Cc: Michael D Kinney <michael.d.kinney@intel.com>,
	Liming Gao <gaoliming@byosoft.com.cn>,
	Laszlo Ersek <lersek@redhat.com>, Michael Brown <mcb30@ipxe.org>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: [edk2-devel] [PATCH 2/2] MdeModulePkg/DxeCore: Fix stack overflow issue due to nested interrupts
Date: Thu, 29 Feb 2024 21:02:46 +0800	[thread overview]
Message-ID: <20240229130246.3-3-ray.ni@intel.com> (raw)
In-Reply-To: <20240229130246.3-1-ray.ni@intel.com>

Assuming the current TPL is TPL_APPLICATION,
RaiseTPL (TPL_APPLICATION -> HIGH) and
RestoreTPL (TPL_HIGH -> TPL_APPLICATION) are called in the timer interrupt
context.
RestoreTPL() will lower the TPL from TPL_HIGH to TPL_NOTIFY first, enable
the interrupts and dispatch pending events associated with TPL_NOTIFY.
Then it will lower TPL to TPL_CALLBACK, enable the interrupts and dispatch
pending events associated with TPL_CALLBACK.
In the end, it will lower the TPL to TPL_APPLICATION with interrupt enabled.

However, it's possible that another timer interrupt happens just in the end
of RestoreTPL() function when TPL is TPL_APPLICATION.

CPU runs into the interrupt context but the contents pushed to the stack
in the outer interrupt context haven't been fully popped.

In the nested interrupt context, if 3rd interrupt happens just in the end
of RestoreTPL() function, CPU runs to interrupt context again but the
contents pushed to the stack in the 2nd interrupt context haven't been
fully popped.

The situation can repeat infinitely that could result in stack overflow.

A ideal solution is to not keep the interrupt disabled when
RestoreTPL(TPL_HIGH -> not TPL_HIGH) is executed in the timer interrupt
context because the interrupt handler will re-enable the interrupt with
arch specific instructions (e.g.: IRET for x86).

The patch introduces mInterruptedTplMask which tells RestoreTPL() if
it's called in the interrupt context and whether it should defer enabling
the interrupt.

Signed-off-by: Michael D Kinney <michael.d.kinney@intel.com>
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc:  Liming Gao <gaoliming@byosoft.com.cn>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Michael Brown <mcb30@ipxe.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---
 MdeModulePkg/Core/Dxe/Event/Tpl.c | 151 +++++++++++++++++++++++++++++-
 1 file changed, 147 insertions(+), 4 deletions(-)

diff --git a/MdeModulePkg/Core/Dxe/Event/Tpl.c b/MdeModulePkg/Core/Dxe/Event/Tpl.c
index 8ce456f968..f2206bc788 100644
--- a/MdeModulePkg/Core/Dxe/Event/Tpl.c
+++ b/MdeModulePkg/Core/Dxe/Event/Tpl.c
@@ -9,6 +9,100 @@ SPDX-License-Identifier: BSD-2-Clause-Patent
 #include "DxeMain.h"
 #include "Event.h"
 
+//
+// It's used to support nested interrupts.
+// The bit position is the TPL that was interrupted.
+// The bit is set when the TPL is interrupted.
+//
+// Example 1):
+//   Assume system runs at TPL_APPLICATION (4) and there is no pending event.
+//
+//   Timer interrupt happens. CPU runs to the interrupt context.
+//   1. Interrupt context (Interrupted TPL = TPL_APPLICATION):
+//     CoreRaiseTpl(TPL_APPLICATION -> TPL_HIGH) is called where
+//     mInterruptedTplMask is changed from 0 to 0x10.
+//
+//     Note: CoreRaiseTpl(TPL_HIGH) could be called from Timer driver, or CoreTimerTick().
+//       If it's called from both, the 2nd call in CoreTimerTick() will not change mInterruptedTplMask
+//       as it's a TPL raise from TPL_HIGH to TPL_HIGH.
+//
+//     When CoreRestoreTpl(TPL_HIGH -> TPL_APPLICATION) is called, because there is no pending event it will
+//     lower TPL to TPL_APPLICATION, but with interrupts disabled as the TPL_APPLICATION bit is set in
+//     mInterruptedTplMask.
+//     mInterruptedTplMask is changed from 0x10 to 0.
+//
+// Example 2):
+//   Assume system runs at TPL_APPLICATION (4) and there is only one event at TPL_CALLBACK (8) which will
+//   register another event at TPL_NOTIFY (16).
+//
+//   Timer interrupt happens. CPU runs to the (outer) interrupt context.
+//   1. Outer interrupt context (Interrupted TPL = TPL_APPLICATION):
+//     CoreRaiseTpl(TPL_APPLICATION -> TPL_HIGH) is called where
+//     mInterruptedTplMask is changed from 0 to 0x10.
+//
+//     When CoreRestoreTpl(TPL_HIGH -> TPL_APPLICATION) is called, because there is one event at
+//     TPL_CALLBACK (8) it will lower TPL to TPL_CALLBACK, enable the interrupts and dispatch it.
+//
+//     The event registers another event associating with TPL_NOTIFY.
+//
+//     2nd timer interrupt happens. CPU runs to the inner-1/nested-1 interrupt context.
+//
+//     2. Inner-1/nested-1 interrupt context (Interrupted TPL = TPL_CALLBACK):
+//       CoreRaiseTpl(TPL_CALLBACK -> TPL_HIGH) is called where
+//       mInterruptedTplMask is changed from 0x10 to 0x110.
+//
+//       When CoreRestoreTpl(TPL_HIGH -> TPL_CALLBACK) is called, because there is one event at
+//       TPL_NOTIFY (16) it will lower TPL to TPL_NOTIFY, enable the interrupts and dispatch it.
+//
+//       3rd timer interrupt happens. CPU runs to the inner-2/nested-2 interrupt context.
+//
+//       3. Inner-2/nested-2 interrupt context (Interrupt TPL = TPL_NOTIFY):
+//         CoreRaiseTpl(TPL_NOTIFY -> TPL_HIGH) is called where
+//         mInterruptedTplMask is changed from 0x110 to 0x10110.
+//
+//         CoreTimerTick() signals mEfiCheckTimerEvent which queues a event at TPL_HIGH - 1.
+//
+//         When CoreRestoreTpl(TPL_HIGH -> TPL_NOTIFY) is called, because there is one event at
+//         TPL_HIGH - 1 (30) it will lower TPL to TPL_HIGH - 1, enable the interrupts and dispatch it.
+//
+//         4th timer interrupt happens. CPU runs to the inner-3/nested-3 interrupt context.
+//
+//           4. Inner-3/nested-3 interrupt context (Interrupt TPL = TPL_HIGH - 1):
+//           CoreRaiseTpl(TPL_HIGH - 1 -> TPL_HIGH) is called where
+//           mInterruptedTplMask is changed from 0x10110 to 0x40010110.
+//
+//           When CoreRestoreTpl(TPL_HIGH -> TPL_HIGH - 1) is called, because there is no pending event it will
+//           lower TPL to TPL_HIGH - 1, but with interrupts disabled as the (TPL_HIGH - 1) bit is set in
+//           mInterruptedTplMask.
+//           mInterruptedTplMask is changed from 0x40010110 to 0x10110.
+//
+//           Arch specific instruction (e.g.: IRET for X86) in the interrupt handler will re-enable the interrupts
+//           and return to the inner-2/nested-2 interrupt context.
+//
+//       3. Inner-2/nested-2 interrupt context continues (Interrupt TPL = TPL_NOTIFY):
+//         CoreRestoreTpl (TPL_HIGH -> TPL_NOTIFY) lowers TPL to TPL_NOTIFY with interrupts disabled
+//         as the (TPL_NOTIFY) bit is set in mInterruptedTplMask.
+//         mInterruptedTplMask is changed from 0x10110 to 0x110.
+//
+//         Arch specific instruction (e.g.: IRET for X86) in the interrupt handler will re-enable the interrupts
+//         and return to the inner-1/nested-1 interrupt context.
+//
+//     2. Inner-1/nested-1 interrupt context continues (Interrupt TPL = TPL_CALLBACK):
+//       CoreRestoreTpl (TPL_HIGH -> TPL_CALLBACK) lowers TPL to TPL_CALLBACK with interrupts disabled
+//       as the (TPL_CALLBACK) bit is set in mInterruptedTplMask.
+//       mInterruptedTplMask is changed from 0x110 to 0x10.
+//       Arch specific instruction (e.g.: IRET for X86) in the interrupt handler will re-enable the interrupts
+//       and return to the outer interrupt context.
+//
+//   1. Outer interrupt context continues (Interrupted TPL = TPL_APPLICATION):
+//     CoreRestoreTpl (TPL_HIGH -> TPL_APPLICATION) lowers TPL to TPL_APPLICATION with interrupts disabled
+//     as the (TPL_APPLICATION) bit is set in mInterruptedTplMask.
+//     mInterruptedTplMask is changed from 0x10 to 0.
+//     Arch specific instruction (e.g.: IRET for X86) in the interrupt handler will re-enable the interrupts
+//     and return to main flow.
+//
+volatile static UINTN  mInterruptedTplMask = 0;
+
 /**
   Set Interrupt State.
 
@@ -59,6 +153,7 @@ CoreRaiseTpl (
   )
 {
   EFI_TPL  OldTpl;
+  BOOLEAN  InterruptState;
 
   OldTpl = gEfiCurrentTpl;
   if (OldTpl > NewTpl) {
@@ -72,7 +167,40 @@ CoreRaiseTpl (
   // If raising to high level, disable interrupts
   //
   if ((NewTpl >= TPL_HIGH_LEVEL) &&  (OldTpl < TPL_HIGH_LEVEL)) {
-    CoreSetInterruptState (FALSE);
+    //
+    // When gCpu is NULL, State is TRUE.
+    // Calling CoreSetInterruptState() with TRUE is safe as CoreSetInterruptState() will directly return
+    // when gCpu is NULL.
+    //
+    InterruptState = TRUE;
+    if (gCpu != NULL) {
+      gCpu->GetInterruptState (gCpu, &InterruptState);
+    }
+
+    if (InterruptState) {
+      //
+      // Interrupts are currently enabled.
+      // Disable them for going to HIGH level.
+      //
+      CoreSetInterruptState (FALSE);
+    } else {
+      //
+      // Interrupts are already disabled.
+      // gEfiCurrentTpl is the "Interrupted TPL".
+      // It's a non-nested interrupt if mInterruptedTplMask is 0.
+      // It's a nested interrupt otherwise.
+      // Nested interrupts are ONLY allowed at TPL > "Interrupted TPL", otherwise
+      // stack overflow might occur.
+      // If it's a nested interrupt, the "Interrupted TPL" should be higher than
+      // the outer's "Interrupted TPL". ASSERT() is to check that.
+      //
+      ASSERT ((INTN)gEfiCurrentTpl > HighBitSet64 (mInterruptedTplMask));
+
+      //
+      // Save the "Interrupted TPL" (TPL that was interrupted).
+      //
+      mInterruptedTplMask |= (UINTN)(1 << gEfiCurrentTpl);
+    }
   }
 
   //
@@ -134,9 +262,9 @@ CoreRestoreTpl (
   }
 
   //
-  // Set the new value
+  // Set the new TPL with interrupt disabled.
   //
-
+  CoreSetInterruptState (FALSE);
   gEfiCurrentTpl = NewTpl;
 
   //
@@ -144,7 +272,22 @@ CoreRestoreTpl (
   // interrupts are enabled
   //
   if (gEfiCurrentTpl < TPL_HIGH_LEVEL) {
-    CoreSetInterruptState (TRUE);
+    if ((INTN)gEfiCurrentTpl > HighBitSet64 (mInterruptedTplMask)) {
+      //
+      // Only enable interrupts if restoring to a level above the highest
+      // interrupted TPL level.  This allows interrupt nesting, but only for
+      // events at higher TPL level than the current TPL level.
+      //
+      CoreSetInterruptState (TRUE);
+    } else {
+      //
+      // Clear interrupted TPL level mask, but do not re-enable interrupts here
+      // This will return to CoreTimerTick() and interrupts will be re-enabled
+      // when the timer interrupt handlers returns from interrupt context.
+      //
+      ASSERT ((INTN)gEfiCurrentTpl == HighBitSet64 (mInterruptedTplMask));
+      mInterruptedTplMask &= ~(UINTN)(1 << gEfiCurrentTpl);
+    }
   }
 
   DEBUG_CODE (
-- 
2.39.1.windows.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#116167): https://edk2.groups.io/g/devel/message/116167
Mute This Topic: https://groups.io/mt/104642317/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



  parent reply	other threads:[~2024-02-29 13:03 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-29 13:02 [edk2-devel] [PATCH 0/2] Fix stack overflow issue due to nested interrupts Ni, Ray
2024-02-29 13:02 ` [edk2-devel] [PATCH 1/2] UefiCpuPkg/CpuDxe: Return correct interrupt state Ni, Ray
2024-02-29 13:02 ` Ni, Ray [this message]
2024-02-29 13:23   ` [edk2-devel] [PATCH 2/2] MdeModulePkg/DxeCore: Fix stack overflow issue due to nested interrupts Michael Brown
2024-02-29 16:43     ` Michael D Kinney
2024-02-29 17:39       ` Michael Brown
2024-02-29 19:09         ` Michael D Kinney
2024-02-29 19:41           ` Michael Brown
2024-02-29 17:39       ` Paolo Bonzini
2024-02-29 19:09         ` Michael D Kinney
2024-02-29 19:04   ` Paolo Bonzini
2024-02-29 19:16     ` Michael D Kinney
2024-02-29 20:08       ` Paolo Bonzini
2024-02-29 19:22     ` Michael Brown
2024-02-29 19:26       ` Michael D Kinney
2024-02-29 19:44         ` Michael Brown
2024-02-29 20:11       ` Paolo Bonzini
2024-03-01  0:14   ` Paolo Bonzini
2024-03-01  3:07     ` Ni, Ray
2024-03-01  8:37       ` Paolo Bonzini
2024-03-01  9:27         ` Michael Brown
2024-03-01  9:33           ` Paolo Bonzini
2024-03-01 11:10             ` Michael Brown
2024-03-01 12:09               ` Paolo Bonzini
2024-03-05  4:19               ` Ni, Ray
     [not found]               ` <17B9C3692B44139F.30946@groups.io>
2024-06-18  5:54                 ` Ni, Ray
2024-03-01  8:44   ` Paolo Bonzini
2024-03-01  9:20     ` Ni, Ray

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240229130246.3-3-ray.ni@intel.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox