public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Laszlo Ersek" <lersek@redhat.com>
To: Michael Brown <mcb30@ipxe.org>, devel@edk2.groups.io
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Erdem Aktas <erdemaktas@google.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	James Bottomley <jejb@linux.ibm.com>,
	Jiewen Yao <jiewen.yao@intel.com>,
	Jordan Justen <jordan.l.justen@intel.com>,
	Min Xu <min.m.xu@intel.com>, Oliver Steffen <osteffen@redhat.com>,
	Sebastien Boeuf <sebastien.boeuf@intel.com>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [edk2-devel] [PATCH v2] OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression
Date: Thu, 12 Jan 2023 18:58:24 +0100	[thread overview]
Message-ID: <49e4e8bb-3bbd-0ca8-ee59-e75560deffa7@redhat.com> (raw)
In-Reply-To: <01020185a6bda78a-05d82180-4d1a-4af4-9a9b-ac78088d11ed-000000@eu-west-1.amazonses.com>

On 1/12/23 17:08, Michael Brown wrote:
> On 12/01/2023 13:22, Laszlo Ersek wrote:
>>>> Detect the issue in PlatformMaxCpuCountInitialization(), and
>>>> print an error message and *hang* if the issue is present.
>>>
>>> Would this mean that OVMF would refuse to start with all current
>>> distro versions of qemu (when not using KVM), or am I
>>> misunderstanding?
>>
>> Your understanding is correct.
>
> I apologise in advance if this is a stupid question, but: given that
> we can detect the issue (as per this patch), and given also:
>
>>> On 12/01/2023 08:28, Laszlo Ersek wrote:
>>>> In QEMU v5.1.0, the CPU hotplug register block misbehaves: the
>>>> negotiation protocol is (effectively) broken such that it
>>>> suggests that switching from the legacy interface to the modern
>>>> interface works, but in reality the switch never happens.
> ...would it work to detect the issue and treat it as "modern interface
> is not supported: continue to use the legacy interface"?  IOW, to
> treat "Present=0" as indicating that the modern interface is not
> supported.

I considered that. It's not a bad idea at all, but it's more complicated
than that.

The code in PlatformPei (or more precisely, nowadays in PlatformInitLib)
could be rearranged. It would be a really ugly patch, because now, after
CmdData2 reads as 0, we're simply not prepared to renege on that
determination. So the branches would have to be rearranged in a quite
ugly manner. *But* it would be doable of course.

However, in the CpuHotplugSmm driver, we also read the same register
(QEMU_CPUHP_R_CMD_DATA2), for determining feature availability. See the
QemuCpuhpReadCommandData2() call in
"OvmfPkg/CpuHotplugSmm/CpuHotplug.c", function CpuHotplugEntry(). And
the outcome of *that* check is final. No counting is done there anymore,
because it's unnecessary in the first place. If the check succeeds, we
proceed, if the check fails, we hang hard. That sanity check is also
misled by the QEMU bug, and tweaking *that* check is out of question.

And I didn't want to post a patch that would make PlatformPei deal with
the QEMU bug one way, and CpuHotplugSmm another way.

Technically, the following should be possible: in PlatformInitLib,
rearrange the branches such that we can still do "something" if the QEMU
bug is detected. Then, if PcdSmmSmramRequire is TRUE, hang hard, and if
PcdSmmSmramRequire is FALSE, then just pretend the modern hotplug
register block is not available.

Unfortunately, I really dislike this. It will cause the same hang on
distros that build OVMF -- possibly their *only* OVMF binary -- with -D
SMM_REQUIRE. Then the request will morph into "can we refine this even
further". And sure, you can refine it even further:

- the CpuHotplugSmm driver exits gracefully if QEMU does not negotiate
the "SMRAM at default SMBASE" feature (see edk2 commit d74d56fcfa31
("OvmfPkg: introduce PcdQ35SmramAtDefaultSmbase", 2020-02-05), and QEMU
commit f404220e279c ("q35: implement 128K SMRAM at default SMBASE
address", 2020-01-22)). This can be disabled through the
"mch.smbase-smram" property on the QEMU command line.

- QEMU can be configured with other compat properties on the command
line so that "CPU hotplug with SMI" and "CPU hot-unplug with SMI" *not*
be offered to the firmware. Then QEMU will reject hotplug attempts, and
the SMM hotplug code in edk2 will not be triggered by the (virtual)
hardware.

The case is that both QEMU and edk2 check for each other's supported
features. It's a complex interwoven feature set with security impact,
which is exactly why we added feature negotiation at every step --
effectively mutual negotiation wherever necessary. I cannot claim I
remember every part of it, and playing tricks around feature negotiation
with SMM impact makes me *extremely uncomfortable*. I absolutely don't
want to author an OVMF patch, briefly before I disappear again (for
good!), that "looks good" now, and then becomes a horrible SMM CVE in a
year or two. I want to go for "obviously no bug", rather than "no
obvious bug".

In brief, my counter-argument is: once we start refining the hang, it
will never be nuanced and polished enough for users, and we'll just heap
on more complexity, until we introduce an obscure but nasty bug. I don't
want to start down that path.

I can live with this patch being rejected altogether (that's consistent
with me having exited edk2 -- I can pretend I don't know about it; it's
not my job anymore). I will not take responsibility for relaxing the
proposed hang however, not even as a reviewer ACKing it. If someone can
rearrange the code, drawing a *practical* but also secure boundary for
the fallback, that's great, the patch can go in without me "being in the
know". I want none of that responsibility.

I don't want to force my hard-liner view on QEMU/OVMF users here; I need
not participate in a relaxed approach either, however. This reads like a
re-run of the "old grub in guest versus non-executable UEFI heap"
discussion, and it's giving me the shivers.

Your proposal is entirely justified, from a practical / user
perspective, but I'm not the right person for it.

Thanks
Laszlo


  reply	other threads:[~2023-01-12 17:58 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-12  8:28 [PATCH v2] OvmfPkg/PlatformInitLib: catch QEMU's CPU hotplug reg block regression Laszlo Ersek
2023-01-12  9:55 ` [edk2-devel] " Michael Brown
2023-01-12 10:09   ` Ard Biesheuvel
2023-01-12 13:31     ` Laszlo Ersek
2023-01-12 13:22   ` Laszlo Ersek
2023-01-12 16:08     ` Michael Brown
2023-01-12 17:58       ` Laszlo Ersek [this message]
2023-01-12 18:22         ` Laszlo Ersek
2023-01-12 22:49           ` Michael Brown
2023-01-13  6:03         ` Gerd Hoffmann
2023-01-13  9:32           ` Gerd Hoffmann
2023-01-13 10:10             ` Laszlo Ersek
2023-01-13 12:22               ` Gerd Hoffmann
2023-01-16 14:42                 ` Ard Biesheuvel
2023-01-16 14:48                 ` Laszlo Ersek
2023-01-17 12:37                   ` Gerd Hoffmann
2023-01-17 16:43                     ` Ard Biesheuvel
2023-01-18  7:25                       ` Gerd Hoffmann
2023-01-18 11:50                         ` Laszlo Ersek
2023-01-18 13:10                           ` Gerd Hoffmann
2023-01-18 13:25                             ` Laszlo Ersek
2023-01-18 13:10                           ` Ard Biesheuvel
2023-01-18 13:21                             ` Laszlo Ersek
2023-01-12 18:34 ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49e4e8bb-3bbd-0ca8-ee59-e75560deffa7@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox