Subject: Re: [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
From: "Laszlo Ersek"
To: Igor Mammedov, "Yao, Jiewen"
Cc: "Kinney, Michael D", Paolo Bonzini, "rfc@edk2.groups.io",
 Alex Williamson, "devel@edk2.groups.io", qemu devel list,
 "Chen, Yingwen", "Nakajima, Jun", Boris Ostrovsky,
 Joao Marcal Lemos Martins, Phillip Goerl
Date: Thu, 29 Aug 2019 19:01:35 +0200
Message-ID: <033ced1a-1399-968e-cce6-6b15a20b0baf@redhat.com>
In-Reply-To: <20190827203102.56d0d048@redhat.com>
References: <8091f6e8-b1ec-f017-1430-00b0255729f4@redhat.com>
 <35396800-32d2-c25f-b0d0-2d7cd8438687@redhat.com>
 <2b4ba607-f0e3-efee-6712-6dcef129b310@redhat.com>
 <7f2d2f1e-2dd8-6914-c55e-61067e06b142@redhat.com>
 <3661c0c5-3da4-1453-a66a-3e4d4022e876@redhat.com>
 <74D8A39837DF1E4DA445A8C0B3885C503F76FDAF@shsmsx102.ccr.corp.intel.com>
 <74D8A39837DF1E4DA445A8C0B3885C503F7728AB@shsmsx102.ccr.corp.intel.com>
 <20190827203102.56d0d048@redhat.com>

On 08/27/19 20:31, Igor Mammedov wrote:
> On Sat, 24 Aug 2019 01:48:09 +0000
> "Yao, Jiewen" wrote:

>> (05) Host CPU: (OS) Port 0xB2 write, all CPUs enter SMM (NOTE: New CPU
>> will not enter CPU because SMI is disabled)

> I think only CPU that does the write will enter SMM

That used to be the case (and it is still the default QEMU behavior, if
broadcast SMI is not negotiated). However, OVMF does negotiate broadcast
SMI whenever QEMU offers the feature. Broadcast SMI is important for the
stability of the edk2 SMM infrastructure on QEMU/KVM, we've found.

https://bugzilla.redhat.com/show_bug.cgi?id=1412313
https://bugzilla.redhat.com/show_bug.cgi?id=1412327

> and we might not need to pull in all already initialized CPUs into SMM.

That, on the other hand, could be a valid idea. But then the CPU should
use a different method for raising a synchronous SMI for itself (not a
write to IO port 0xB2). Is a "directed SMI for self" possible?

> [...]

I've tried to read through the procedure with your suggested changes,
but I'm failing at composing a coherent mental image, in this email
response format.

If you have the time, can you write up the suggested list of steps in a
"flat" format? (I believe you are suggesting to eliminate some steps
completely.)

... jumping to another point:

>> 2) Let trusted software (SMM and init code) guarantee SMREBASE one by one (include any code runs before SMREBASE)
> that would mean pulling all present CPUs into SMM mode so no attack
> code could be executing before doing hotplug. With a lot of present CPUs
> it could be quite expensive and unlike physical hardware, guest's CPUs
> could be preempted arbitrarily long causing long delays.

I agree with your analysis, but I slightly disagree about the impact:

- CPU hotplug is not a frequent administrative action, so the CPU load
  should be temporary (it should be a spike).
  I don't worry that it would trip up OS kernel code. (SMI handling is
  known to take long on physical platforms too.) In practice, all
  "normal" SMIs are broadcast already (for example when calling the
  runtime UEFI variable services from the OS kernel).

- The fact that QEMU/KVM introduces some jitter into the execution of
  multi-core code (including SMM code) has proved useful in the past,
  for catching edk2 regressions.

Again, this is not a strong disagreement from my side. I'm open to
better ways for synching CPUs during multi-CPU-hotplug.

(Digression:

I expect someone could be curious why (a) I find it acceptable (even
beneficial) that "some jitter" injected by the QEMU/KVM scheduling
exposes multi-core regressions in edk2, but at the same time (b) I found
it really important to add broadcast SMI to QEMU and OVMF. After all,
both "jitter" and "unicast SMIs" are QEMU/KVM platform specifics, so why
the different treatment?

The reason is that the "jitter" does not interfere with normal
operation, and it has been good for catching *regressions*. IOW, there
is a working edk2 state, someone posts a patch, it works on physical
hardware, but breaks on QEMU/KVM --> then we can still reject or rework
or revert the patch. And we're back to a working state again (in the
best case, with a fixed feature patch).

With the unicast SMIs however, it was impossible to enable the SMM stack
reliably in the first place. There was no functional state to return to.

Digression ends.)

> lets first see if if we can ignore race

Makes me uncomfortable, but if this is the consensus, I'll go along.

> and if it's not then
> we probably end up with implementing some form of #1

OK.

Thanks!
Laszlo
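
To make the two SMI delivery modes discussed in this message concrete,
here is a minimal toy model in Python. It is only a sketch, not QEMU,
KVM or edk2 code: the Cpu class, the write_io_port() helper and the
smi_enabled flag are hypothetical names invented for illustration, and
the rule that a hot-added CPU does not enter SMM is taken from the
quoted step (05) rather than verified here. It shows which CPUs would
enter SMM on a write to IO port 0xB2, with and without broadcast SMI
negotiated.

```python
from dataclasses import dataclass

APM_CNT_PORT = 0xB2  # IO port whose write raises the SMI (per the thread)


@dataclass
class Cpu:
    """Hypothetical stand-in for a guest CPU; not a QEMU or edk2 type."""
    index: int
    smi_enabled: bool = True  # assumption: a hot-added CPU starts with SMI disabled
    in_smm: bool = False


def write_io_port(cpus, writer, port, broadcast_negotiated):
    """Return the indices of the CPUs that enter SMM after the port write."""
    if port != APM_CNT_PORT:
        return []  # any other port raises no SMI in this model
    if broadcast_negotiated:
        targets = cpus  # broadcast SMI: every CPU is signaled
    else:
        # default behavior described above: only the writing CPU is signaled
        targets = [cpu for cpu in cpus if cpu.index == writer]
    entered = []
    for cpu in targets:
        if cpu.smi_enabled:  # a CPU with SMI disabled stays out of SMM
            cpu.in_smm = True
            entered.append(cpu.index)
    return entered
```

With four CPUs, where CPU 3 is freshly hot-added (smi_enabled=False), a
write from CPU 1 pulls CPUs 0, 1 and 2 into SMM when broadcast SMI is
negotiated, and only CPU 1 otherwise.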