From mboxrd@z Thu Jan 1 00:00:00 1970
Authentication-Results: mx.groups.io;
 dkim=missing;
 spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: lersek@redhat.com)
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by groups.io with SMTP; Tue, 13 Aug 2019 09:18:27 -0700
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 27C8B81F10;
 Tue, 13 Aug 2019 16:18:27 +0000 (UTC)
Received: from lacos-laptop-7.usersys.redhat.com (ovpn-117-193.ams2.redhat.com [10.36.117.193])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 5A57182E27;
 Tue, 13 Aug 2019 16:18:24 +0000 (UTC)
Subject: Re: CPU hotplug using SMM with QEMU+OVMF
From: "Laszlo Ersek"
To: edk2-devel-groups-io
Cc: edk2-rfc-groups-io, qemu devel list, Igor Mammedov, Paolo Bonzini,
 Jiewen Yao, Yingwen Chen, Jun Nakajima, Boris Ostrovsky,
 Joao Marcal Lemos Martins, Phillip Goerl
References: <8091f6e8-b1ec-f017-1430-00b0255729f4@redhat.com>
Message-ID: <55cc7921-2b91-5060-d5de-d7513067f76e@redhat.com>
Date: Tue, 13 Aug 2019 18:18:23 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To:
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.25]); Tue, 13 Aug 2019 16:18:27 +0000 (UTC)
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 08/13/19 18:09, Laszlo Ersek wrote:
> On 08/13/19 16:16, Laszlo Ersek wrote:

>> (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
>> rebase code.
>>
>> (07) Host CPU: (SMM) Send message to New CPU to Enable SMI.
>
> Aha, so this is the SMM-only register you mention in step (03). Is the
> register specified in the Intel SDM?
>
>
>> (08) New CPU: (Flash) Get message - Enable SMI.
>>
>> (09) Host CPU: (SMM) Send SMI to the new CPU only.
>>
>> (10) New CPU: (SMM) Response first SMI at 38000, and rebase SMBASE to
>> TSEG.
>
> What code does the new CPU execute after it completes step (10)? Does it
> halt?
>
>
>> (11) Host CPU: (SMM) Restore 38000.
>
> These steps (i.e., (06) through (11)) don't appear RAS-specific. The
> only platform-specific feature seems to be the SMI masking register,
> which could be extracted into a new SmmCpuFeaturesLib API.
>
> Thus, would you please consider open sourcing firmware code for steps
> (06) through (11)?
>
>
> Alternatively -- and in particular because the stack for step (01)
> concerns me --, we could approach this from a high-level, functional
> perspective. The states that really matter are the relocated SMBASE for
> the new CPU, and the state of the full system, right at the end of step
> (11).
>
> When the SMM setup quiesces during normal firmware boot, OVMF could use
> existent (finalized) SMBASE information to *pre-program* some virtual
> QEMU hardware, with such state that would be expected, as "final" state,
> of any new hotplugged CPU. Afterwards, if / when the hotplug actually
> happens, QEMU could blanket-apply this state to the new CPU, and
> broadcast a hardware SMI to all CPUs except the new one.
>
> The hardware SMI should tell the firmware that the rest of the process
> -- step (12) below, and onward -- is being requested.
>
> If I understand right, this approach would produce a firmware & system
> state that's identical to what's expected right after step (11):
>
> - all SMBASEs relocated
> - all preexistent CPUs in SMM
> - new CPU halted / blocked from launch
> - DRAM at 0x30000 / 0x38000 contains OS-owned data
>
> Is my understanding correct that this is the expected state after step
> (11)?

Revisiting some of my notes from earlier, such as -- apologies, private
BZ... --, we discussed some of this stuff with Mike on the phone in April.
And, it looked like generating a hardware SMI in QEMU, in association
with the hotplug action that was being requested through the QEMU
monitor, would be the right approach.

By now I have forgotten about that discussion -- hence "revisiting my
notes" --, but luckily, it seems consistent with what I've proposed
above, under "alternatively".

Thanks,
Laszlo
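
To make the "alternatively" flow a bit more concrete, here is a rough toy
model in C. It is only a sketch under stated assumptions: every name in it
(toy_machine, toy_preprogram, toy_hotplug, toy_state_after_step_11) is
hypothetical and exists in neither QEMU nor edk2, and the SMBASE values a
caller passes in are arbitrary. The only hard fact it relies on is the
default SMBASE of 0x30000 (SMI entry point at 0x38000). It models the first
three bullets of the expected state; DRAM contents are not modelled, since
nothing in this flow touches 0x30000 / 0x38000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_CPUS       4
#define TOY_DEFAULT_SMBASE 0x30000u /* reset default; SMI entry at +0x8000 */

/* Hypothetical per-CPU state; not a QEMU or edk2 structure. */
struct toy_cpu {
    bool     present;
    bool     in_smm;
    bool     halted;   /* blocked from launch */
    uint32_t smbase;
};

/* Hypothetical machine, including the pre-programmed "virtual hardware". */
struct toy_machine {
    struct toy_cpu cpu[TOY_MAX_CPUS];
    bool           preset_valid[TOY_MAX_CPUS];
    uint32_t       preset_smbase[TOY_MAX_CPUS];
};

void toy_init(struct toy_machine *m)
{
    memset(m, 0, sizeof *m);
}

/* A CPU present at boot, already relocated by the normal SMM setup. */
bool toy_boot_cpu(struct toy_machine *m, int idx, uint32_t smbase)
{
    if (idx < 0 || idx >= TOY_MAX_CPUS)
        return false;
    m->cpu[idx] = (struct toy_cpu){ .present = true, .smbase = smbase };
    return true;
}

/* Firmware side: record the final SMBASE for a still-empty CPU slot. */
bool toy_preprogram(struct toy_machine *m, int idx, uint32_t smbase)
{
    if (idx < 0 || idx >= TOY_MAX_CPUS)
        return false;
    m->preset_smbase[idx] = smbase;
    m->preset_valid[idx]  = true;
    return true;
}

/*
 * "QEMU" side: blanket-apply the pre-programmed state to the new CPU, then
 * raise an SMI on every CPU except the new one.
 */
bool toy_hotplug(struct toy_machine *m, int idx)
{
    if (idx < 0 || idx >= TOY_MAX_CPUS)
        return false;
    if (m->cpu[idx].present || !m->preset_valid[idx])
        return false;

    m->cpu[idx] = (struct toy_cpu){
        .present = true,
        .halted  = true,
        .smbase  = m->preset_smbase[idx],
    };
    for (int i = 0; i < TOY_MAX_CPUS; i++) {
        if (i != idx && m->cpu[i].present)
            m->cpu[i].in_smm = true;
    }
    return true;
}

/* Check the state expected right after step (11). */
bool toy_state_after_step_11(const struct toy_machine *m, int new_idx)
{
    if (new_idx < 0 || new_idx >= TOY_MAX_CPUS || !m->cpu[new_idx].present)
        return false;

    for (int i = 0; i < TOY_MAX_CPUS; i++) {
        const struct toy_cpu *c = &m->cpu[i];

        if (!c->present)
            continue;
        if (c->smbase == TOY_DEFAULT_SMBASE)
            return false;              /* an SMBASE was left unrelocated */
        if (i == new_idx) {
            if (!c->halted || c->in_smm)
                return false;          /* new CPU must stay blocked */
        } else if (!c->in_smm) {
            return false;              /* preexistent CPUs must be in SMM */
        }
    }
    return true;
}
```

In this model, a hotplug into a slot that the firmware never pre-programmed
is simply refused; what real hardware or QEMU should do in that case is an
open question that the sketch does not try to answer.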