From: "Laszlo Ersek" <lersek@redhat.com>
To: Igor Druzhinin <igor.druzhinin@citrix.com>, devel@edk2.groups.io
Cc: jordan.l.justen@intel.com, ard.biesheuvel@arm.com,
anthony.perard@citrix.com, julien@xen.org,
Ray Ni <ray.ni@intel.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] OvmfPkg: End timer interrupt later to avoid stack overflow under load
Date: Tue, 16 Jun 2020 20:42:42 +0200 [thread overview]
Message-ID: <ee7d61de-ed38-acc4-1666-cd886d76cc14@redhat.com> (raw)
In-Reply-To: <1592275782-9369-1-git-send-email-igor.druzhinin@citrix.com>
(+Paolo, +Ray)
On 06/16/20 04:49, Igor Druzhinin wrote:
> RestoreTPL called while at TPL_HIGH_LEVEL unconditionally enables interrupts
> even if called in interrupt handler. That opens a window while interrupt
> is not completely handled but another interrupt could be accepted.
>
> If a VM starts on a heavily loaded host hundreds of periodic timer interrupts
> might be queued while vCPU is descheduled (the behavior is typical for
> a Xen host). The next time vCPU is scheduled again all of them get
> delivered back to back causing OVMF to accept each one without finishing
> a previous one and cleaning up the stack. That quickly results in stack
> overflow and a triple fault.
>
> Fix it by postponing sending EOI until we finished processing the current
> tick giving interrupt handler opportunity to clean up the stack before
> accepting the next tick.
>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> ---
>
> Laszlo, Anthony,
>
> Do you think it's the right way to address it?
>
> Alternatively, we might avoid calling RaiseTPL in interrupt handler at all
> like it's done in HpetTimer implementation for instance.
>
> Or we might try to address it in Raise/RestoreTPL calls by saving/restoring
> interrupt state along with TPL.
>
> ---
> OvmfPkg/8254TimerDxe/Timer.c | 5 +++--
> OvmfPkg/XenTimerDxe/XenTimerDxe.c | 5 +++--
> 2 files changed, 6 insertions(+), 4 deletions(-)
If I understand correctly, TimerInterruptHandler()
[OvmfPkg/8254TimerDxe/Timer.c] currently does the following:
- RaiseTPL (TPL_HIGH_LEVEL) --> mask interrupts from being delivered
- mLegacy8259->EndOfInterrupt() --> permit the PIC to generate further
interrupts (= make them pending)
- RestoreTPL() --> unmask interrupts (allow delivery)
RestoreTPL() is always expected to invoke handlers (on its own stack)
that have just been unmasked, so that behavior is not unexpected, in my
opinion.
What seems unexpected is the queueing of a huge number of timer
interrupts. I would think a timer interrupt is either pending or not
pending (i.e. if it's already pending, then the next generated interrupt
is coalesced, not queued). While there would still be a window between
the EOI and the unmasking, I don't think it would normally allow for a
*huge* number of queued interrupts (and consequently a stack overflow).
So I basically see the root of the problem in the interrupts being
queued rather than coalesced. I'm pretty unfamiliar with this x86 area
(= the 8259 PIC in general), but the following wiki article seems to
agree with my suspicion:
https://wiki.osdev.org/8259_PIC#How_does_the_8259_PIC_chip_work.3F
[...] and whether there's an interrupt already pending. If the
channel is unmasked and there's no interrupt pending, the PIC will
raise the interrupt line [...]
Can we say that the interrupt queueing (as opposed to coalescing) is a
Xen issue?
(Hmmm... maybe the hypervisor *has* to queue the timer interrupts,
otherwise some of them would simply be lost, and the guest would lose
track of time.)
Either way, I'm not sure what the best approach is. This driver was
moved under OvmfPkg from PcAtChipsetPkg in commit 1a3ffdff82e6
("OvmfPkg: Copy 8254TimerDxe driver from PcAtChipsetPkg", 2019-04-11).
HpetTimerDxe also lives under PcAtChipsetPkg.
So I think I'll have to rely on the expertise of Ray here (CC'd).
Also, I recall a recent-ish QEMU commit that seems vaguely related
(i.e., to timer interrupt coalescing -- see 7a3e29b12f5a, "mc146818rtc:
fix timer interrupt reinjection again", 2019-11-19), so I'm CC'ing Paolo
too.
Some more comments / questions below:
>
> diff --git a/OvmfPkg/8254TimerDxe/Timer.c b/OvmfPkg/8254TimerDxe/Timer.c
> index 67e22f5..fd1691b 100644
> --- a/OvmfPkg/8254TimerDxe/Timer.c
> +++ b/OvmfPkg/8254TimerDxe/Timer.c
> @@ -79,8 +79,6 @@ TimerInterruptHandler (
>
> OriginalTPL = gBS->RaiseTPL (TPL_HIGH_LEVEL);
>
> - mLegacy8259->EndOfInterrupt (mLegacy8259, Efi8259Irq0);
> -
> if (mTimerNotifyFunction != NULL) {
> //
> // @bug : This does not handle missed timer interrupts
> @@ -89,6 +87,9 @@ TimerInterruptHandler (
> }
>
> gBS->RestoreTPL (OriginalTPL);
> +
> + DisableInterrupts ();
> + mLegacy8259->EndOfInterrupt (mLegacy8259, Efi8259Irq0);
> }
So this briefly (temporarily) unmasks interrupt delivery (between
RestoreTPL() and DisableInterrupts()) while the PIC is still blocked
from generating more, and then unblocks the PIC.
It looks plausible for preventing the unbounded recursion per se, but
why is it safe to leave the function with interrupts disabled? Before
the patch, that didn't use to be the case.
I'm generally concerned about such an "indiscriminate" 8254TimerDxe
patch, because I haven't seen a similar report on QEMU/KVM yet. And
again I'm quite out of my depth with the 8259 / 8254. Even if we end up
merging a patch like this, I feel tempted to restrict it to Xen.
... Regarding XenTimerDxe, I don't have the slightest idea, unfortunately.
Thanks
Laszlo
>
> /**
> diff --git a/OvmfPkg/XenTimerDxe/XenTimerDxe.c b/OvmfPkg/XenTimerDxe/XenTimerDxe.c
> index 9f9e047..0bec593 100644
> --- a/OvmfPkg/XenTimerDxe/XenTimerDxe.c
> +++ b/OvmfPkg/XenTimerDxe/XenTimerDxe.c
> @@ -61,8 +61,6 @@ TimerInterruptHandler (
>
> OriginalTPL = gBS->RaiseTPL (TPL_HIGH_LEVEL);
>
> - SendApicEoi();
> -
> if (mTimerNotifyFunction != NULL) {
> //
> // @bug : This does not handle missed timer interrupts
> @@ -71,6 +69,9 @@ TimerInterruptHandler (
> }
>
> gBS->RestoreTPL (OriginalTPL);
> +
> + DisableInterrupts ();
> + SendApicEoi ();
> }
>
> /**
>
next prev parent reply other threads:[~2020-06-16 18:42 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-16 2:49 [PATCH] OvmfPkg: End timer interrupt later to avoid stack overflow under load Igor Druzhinin
2020-06-16 18:42 ` Laszlo Ersek [this message]
2020-06-17 3:16 ` Igor Druzhinin
2020-06-17 12:44 ` Laszlo Ersek
2020-06-17 13:51 ` Paolo Bonzini
2020-06-17 15:46 ` Laszlo Ersek
2020-06-17 16:59 ` Paolo Bonzini
2020-06-17 17:23 ` Igor Druzhinin
2020-06-18 8:36 ` Laszlo Ersek
2020-06-18 8:44 ` Laszlo Ersek
2020-06-18 12:10 ` [edk2-devel] " Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ee7d61de-ed38-acc4-1666-cd886d76cc14@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox