Subject: Re: [edk2-devel] [PATCH 12/16] OvmfPkg/CpuHotplugSmm: introduce First SMI Handler for hot-added CPUs
From: "Laszlo Ersek" <lersek@redhat.com>
To: edk2-devel-groups-io <devel@edk2.groups.io>
Cc: Ard Biesheuvel, Igor Mammedov, Jiewen Yao, Jordan Justen,
 Michael Kinney, Philippe Mathieu-Daudé
Reply-To: devel@edk2.groups.io, lersek@redhat.com
References: <20200223172537.28464-1-lersek@redhat.com>
 <20200223172537.28464-13-lersek@redhat.com>
 <111145fc-be3d-2a9a-a126-c14345a8a8a4@redhat.com>
In-Reply-To: <111145fc-be3d-2a9a-a126-c14345a8a8a4@redhat.com>
Message-ID: <2fefc402-2331-ebc9-4024-f99aa36082f0@redhat.com>
Date: Wed, 26 Feb 2020 22:22:57 +0100

On 02/24/20 10:10, Laszlo Ersek wrote:
> Overnight I managed to think up an attack, from the OS, against the
> "SmmVacated" byte (the last byte of the reserved page, i.e. the last
> byte of the Post-SMM Pen).
>
> Here's how:
>
> There are three CPUs being hotplugged in one SMI, CPU#1..CPU#3. The
> OS boots them all before raising the SMI (i.e. before it allows the
> ACPI GPE handler to take effect). After the first CPU (let's say
> CPU#1) returns to the OS via the RSM, the OS uses it (=CPU#1) to
> attack "SmmVacated", writing 1 to it in a loop.
>
> Meanwhile CPU#2 and CPU#3 are still in SMM; let's say CPU#2 is
> relocating SMBASE, while CPU#3 is spinning on the APIC ID gate. And
> the SMM Monarch (CPU#0) is waiting for CPU#2 to report back in
> through "SmmVacated", from the Post-SMM Pen.
>
> Now, the OS writes 1 to "SmmVacated" early, pretending to be CPU#2.
> This causes the SMM Monarch (CPU#0) to open the APIC ID gate for
> CPU#3 before CPU#2 actually reaches the RSM. This may cause CPU#2
> and CPU#3 to both reach RSM with the same SMBASE value.
>
> So why did I think to put SmmVacated in normal RAM (in the Post-SMM
> Pen reserved page)? Because in the following message:
>
>   http://mid.mail-archive.com/20191004160948.72097f6c@redhat.com
>
> Igor showed that putting "SmmVacated" in SMRAM is racy, even without
> a malicious OS. The problem is that there is no way to flip a byte
> in SMRAM *atomically* with RSM. So the CPU that has just completed
> SMBASE relocation can only flip the byte before RSM (in SMRAM) or
> after RSM (in normal RAM). In the latter case, the OS can attack
> SmmVacated -- but in the former case, we get the same race *without*
> any OS involvement (because the CPU just about to leave SMM via RSM
> actively allows the SMM Monarch to ungate the relocation code for a
> different CPU).
>
> So I don't think there's a 100% safe & secure way to do this. One
> thing we could try -- I could update the code -- is to *combine*
> both approaches; use two "SmmVacated" flags: one in SMRAM, set to 1
> just before the RSM instruction, and the other one in normal RAM
> (reserved page) that this patch set already introduces. The normal
> RAM flag would avoid the race completely for benign OSes (like the
> present patch set already does), and the SMRAM flag would keep the
> racy window to a single instruction when the OS is malicious and the
> normal RAM flag cannot be trusted.

I've implemented and tested this "combined" approach.
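In outline, the gist is the following (a minimal C sketch; the names
mSmramVacated, mPenVacated and the helper functions are made up for
illustration, they are not the actual CpuHotplugSmm code):

  #include <stdint.h>

  /* In the real code this flag lives in SMRAM: the OS cannot forge
     it, but it is necessarily set *before* RSM, which leaves a
     one-instruction window. */
  volatile uint8_t mSmramVacated = 0;

  /* And this one is the last byte of the reserved, normal-RAM
     Post-SMM Pen page: it is set *after* RSM, hence race-free for a
     benign OS, but a malicious OS can forge it. */
  volatile uint8_t mPenVacated = 0;

  /* Hot-added CPU, at the end of SMBASE relocation. */
  void HotAddedCpuVacateSmm (void)
  {
    mSmramVacated = 1;  /* last write before RSM */
    /* RSM executes here; the first action back in the pen, in
       normal RAM, is the store "mPenVacated = 1". */
  }

  /* SMM Monarch, before opening the APIC ID gate for the next CPU. */
  void SmmMonarchWaitForVacated (void)
  {
    /* Demand both flags. A benign OS sets mPenVacated only after
       RSM, so there is no race at all; a malicious OS that forges
       mPenVacated is still held up by mSmramVacated, which shrinks
       the exploitable window to the single RSM instruction. */
    while (!(mSmramVacated && mPenVacated)) {
      /* spin (CpuPause() in edk2) */
    }
  }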

Thanks
Laszlo

>
> I'd appreciate feedback on this; I don't know how, in physical
> firmware, the initial SMI handler for hot-added CPUs deals with the
> same problem.
>
> I guess on physical platforms, the platform manual could simply say,
> "only hot-plug CPUs one by one". That eliminates the whole problem.
> In such a case, we could safely stick with the current patch set,
> too.
>
> --*--
>
> BTW, I did try hot-plugging multiple CPUs at once, with libvirt:
>
>> virsh setvcpu ovmf.fedora.q35 1,3 --enable --live
>>
>> error: Operation not supported: only one hotpluggable entity can be
>> selected
>
> I think this is because it would require multiple "device_add"
> commands to be sent at the same time over the QMP monitor -- and
> that's something that QEMU doesn't support. (Put alternatively,
> "device_add" does not take a list of objects to create.) In that
> regard, the problem described above is academic, because QEMU
> already seems like a platform that can only hotplug one CPU at a
> time. In that sense, using APIC ID *arrays* and *loops* per hotplug
> SMI is not really needed; I did that because we had discussed this
> feature in terms of multiple CPUs from the beginning, and because
> QEMU's ACPI GPE handler (the CSCN AML method) also loops over
> multiple processors.
>
> Again, comments would be most welcome... I wouldn't like to
> complicate the SMI handler if it's not absolutely necessary.
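
P.S. Regarding the "device_add" limitation quoted above: each QMP
command plugs exactly one CPU. For illustration, a single hot-plug
through the monitor looks roughly like this (the driver type and the
topology IDs are made up for the example; "query-hotpluggable-cpus"
reports the real ones):

  virsh qemu-monitor-command ovmf.fedora.q35 --pretty \
    '{ "execute": "device_add",
       "arguments": { "driver": "IvyBridge-IBRS-x86_64-cpu",
                      "id": "vcpu1", "socket-id": 1, "core-id": 0,
                      "thread-id": 0 } }'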