From: "Yao, Jiewen" <jiewen.yao@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>,
Laszlo Ersek <lersek@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
"devel@edk2.groups.io" <devel@edk2.groups.io>,
edk2-rfc-groups-io <rfc@edk2.groups.io>,
"qemu devel list" <qemu-devel@nongnu.org>,
Igor Mammedov <imammedo@redhat.com>,
"Chen, Yingwen" <yingwen.chen@intel.com>,
"Nakajima, Jun" <jun.nakajima@intel.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Joao Marcal Lemos Martins <joao.m.martins@oracle.com>,
Phillip Goerl <phillip.goerl@oracle.com>
Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Date: Sat, 17 Aug 2019 00:20:25 +0000 [thread overview]
Message-ID: <74D8A39837DF1E4DA445A8C0B3885C503F761B96@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <20190816161933.7d30a881@x1.home>
> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Saturday, August 17, 2019 6:20 AM
> To: Laszlo Ersek <lersek@redhat.com>
> Cc: Yao, Jiewen <jiewen.yao@intel.com>; Paolo Bonzini
> <pbonzini@redhat.com>; devel@edk2.groups.io; edk2-rfc-groups-io
> <rfc@edk2.groups.io>; qemu devel list <qemu-devel@nongnu.org>; Igor
> Mammedov <imammedo@redhat.com>; Chen, Yingwen
> <yingwen.chen@intel.com>; Nakajima, Jun <jun.nakajima@intel.com>; Boris
> Ostrovsky <boris.ostrovsky@oracle.com>; Joao Marcal Lemos Martins
> <joao.m.martins@oracle.com>; Phillip Goerl <phillip.goerl@oracle.com>
> Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
>
> On Fri, 16 Aug 2019 22:15:15 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
>
> > +Alex (direct question at the bottom)
> >
> > On 08/16/19 09:49, Yao, Jiewen wrote:
> > > below
> > >
> > >> -----Original Message-----
> > >> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> > >> Sent: Friday, August 16, 2019 3:20 PM
> > >> To: Yao, Jiewen <jiewen.yao@intel.com>; Laszlo Ersek
> > >> <lersek@redhat.com>; devel@edk2.groups.io
> > >> Cc: edk2-rfc-groups-io <rfc@edk2.groups.io>; qemu devel list
> > >> <qemu-devel@nongnu.org>; Igor Mammedov
> <imammedo@redhat.com>;
> > >> Chen, Yingwen <yingwen.chen@intel.com>; Nakajima, Jun
> > >> <jun.nakajima@intel.com>; Boris Ostrovsky
> <boris.ostrovsky@oracle.com>;
> > >> Joao Marcal Lemos Martins <joao.m.martins@oracle.com>; Phillip
> Goerl
> > >> <phillip.goerl@oracle.com>
> > >> Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
> > >>
> > >> On 16/08/19 04:46, Yao, Jiewen wrote:
> > >>> Comment below:
> > >>>
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> > >>>> Sent: Friday, August 16, 2019 12:21 AM
> > >>>> To: Laszlo Ersek <lersek@redhat.com>; devel@edk2.groups.io; Yao,
> > >> Jiewen
> > >>>> <jiewen.yao@intel.com>
> > >>>> Cc: edk2-rfc-groups-io <rfc@edk2.groups.io>; qemu devel list
> > >>>> <qemu-devel@nongnu.org>; Igor Mammedov
> > >> <imammedo@redhat.com>;
> > >>>> Chen, Yingwen <yingwen.chen@intel.com>; Nakajima, Jun
> > >>>> <jun.nakajima@intel.com>; Boris Ostrovsky
> > >> <boris.ostrovsky@oracle.com>;
> > >>>> Joao Marcal Lemos Martins <joao.m.martins@oracle.com>; Phillip
> Goerl
> > >>>> <phillip.goerl@oracle.com>
> > >>>> Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
> > >>>>
> > >>>> On 15/08/19 17:00, Laszlo Ersek wrote:
> > >>>>> On 08/14/19 16:04, Paolo Bonzini wrote:
> > >>>>>> On 14/08/19 15:20, Yao, Jiewen wrote:
> > >>>>>>>> - Does this part require a new branch somewhere in the OVMF
> SEC
> > >>>> code?
> > >>>>>>>> How do we determine whether the CPU executing SEC is BSP
> or
> > >>>>>>>> hot-plugged AP?
> > >>>>>>> [Jiewen] I think this is blocked from hardware perspective, since
> the
> > >> first
> > >>>> instruction.
> > >>>>>>> There are some hardware specific registers can be used to
> determine
> > >> if
> > >>>> the CPU is new added.
> > >>>>>>> I don’t think this must be same as the real hardware.
> > >>>>>>> You are free to invent some registers in device model to be used
> in
> > >>>> OVMF hot plug driver.
> > >>>>>>
> > >>>>>> Yes, this would be a new operation mode for QEMU, that only
> applies
> > >> to
> > >>>>>> hot-plugged CPUs. In this mode the AP doesn't reply to INIT or
> SMI,
> > >> in
> > >>>>>> fact it doesn't reply to anything at all.
> > >>>>>>
> > >>>>>>>> - How do we tell the hot-plugged AP where to start execution?
> (I.e.
> > >>>> that
> > >>>>>>>> it should execute code at a particular pflash location.)
> > >>>>>>> [Jiewen] Same real mode reset vector at FFFF:FFF0.
> > >>>>>>
> > >>>>>> You do not need a reset vector or INIT/SIPI/SIPI sequence at all in
> > >>>>>> QEMU. The AP does not start execution at all when it is
> unplugged,
> > >> so
> > >>>>>> no cache-as-RAM etc.
> > >>>>>>
> > >>>>>> We only need to modify QEMU so that hot-plugged APIs do not
> reply
> > >> to
> > >>>>>> INIT/SIPI/SMI.
> > >>>>>>
> > >>>>>>> I don’t think there is problem for real hardware, who always has
> CAR.
> > >>>>>>> Can QEMU provide some CPU specific space, such as MMIO
> region?
> > >>>>>>
> > >>>>>> Why is a CPU-specific region needed if every other processor is in
> SMM
> > >>>>>> and thus trusted.
> > >>>>>
> > >>>>> I was going through the steps Jiewen and Yingwen recommended.
> > >>>>>
> > >>>>> In step (02), the new CPU is expected to set up RAM access. In step
> > >>>>> (03), the new CPU, executing code from flash, is expected to "send
> > >> board
> > >>>>> message to tell host CPU (GPIO->SCI) -- I am waiting for hot-add
> > >>>>> message." For that action, the new CPU may need a stack
> (minimally if
> > >> we
> > >>>>> want to use C function calls).
> > >>>>>
> > >>>>> Until step (03), there had been no word about any other (=
> pre-plugged)
> > >>>>> CPUs (more precisely, Jiewen even confirmed "No impact to other
> > >>>>> processors"), so I didn't assume that other CPUs had entered SMM.
> > >>>>>
> > >>>>> Paolo, I've attempted to read Jiewen's response, and yours, as
> carefully
> > >>>>> as I can. I'm still very confused. If you have a better understanding,
> > >>>>> could you please write up the 15-step process from the thread
> starter
> > >>>>> again, with all QEMU customizations applied? Such as, unnecessary
> > >> steps
> > >>>>> removed, and platform specifics filled in.
> > >>>>
> > >>>> Sure.
> > >>>>
> > >>>> (01a) QEMU: create new CPU. The CPU already exists, but it does
> not
> > >>>> start running code until unparked by the CPU hotplug
> controller.
> > >>>>
> > >>>> (01b) QEMU: trigger SCI
> > >>>>
> > >>>> (02-03) no equivalent
> > >>>>
> > >>>> (04) Host CPU: (OS) execute GPE handler from DSDT
> > >>>>
> > >>>> (05) Host CPU: (OS) Port 0xB2 write, all CPUs enter SMM (NOTE: New
> CPU
> > >>>> will not enter CPU because SMI is disabled)
> > >>>>
> > >>>> (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
> > >>>> rebase code.
> > >>>>
> > >>>> (07a) Host CPU: (SMM) Write to CPU hotplug controller to enable
> > >>>> new CPU
> > >>>>
> > >>>> (07b) Host CPU: (SMM) Send INIT/SIPI/SIPI to new CPU.
> > >>> [Jiewen] NOTE: INIT/SIPI/SIPI can be sent by a malicious CPU. There is
> no
> > >>> restriction that INIT/SIPI/SIPI can only be sent in SMM.
> > >>
> > >> All of the CPUs are now in SMM, and INIT/SIPI/SIPI will be discarded
> > >> before 07a, so this is okay.
> > > [Jiewen] May I know why INIT/SIPI/SIPI is discarded before 07a but is
> delivered at 07a?
> > > I don’t see any extra step between 06 and 07a.
> > > What is the magic here?
> >
> > The magic is 07a itself, IIUC. The CPU hotplug controller would be
> > accessible only in SMM. And until 07a happens, the new CPU ignores
> > INIT/SIPI/SIPI even if another CPU sends it those, simply because QEMU
> > would implement the new CPU's behavior like that.
[Jiewen] Got it. Looks fine to me.
> > >> However I do see a problem, because a PCI device's DMA could
> overwrite
> > >> 0x38000 between (06) and (10) and hijack the code that is executed in
> > >> SMM. How is this avoided on real hardware? By the time the new
> CPU
> > >> enters SMM, it doesn't run off cache-as-RAM anymore.
> > > [Jiewen] Interesting question.
> > > I don’t think the DMA attack is considered in threat model for the virtual
> environment. We only list adversary below:
> > > -- Adversary: System Software Attacker, who can control any OS memory
> or silicon register from OS level, or read write BIOS data.
> > > -- Adversary: Simple hardware attacker, who can hot add or hot remove
> a CPU.
> >
> > We do have physical PCI(e) device assignment; sorry for not highlighting
> > that earlier.
[Jiewen] That is OK. Then we MUST add the third adversary.
-- Adversary: Simple hardware attacker, who can use device to perform DMA attack in the virtual world.
NOTE: The DMA attack in the real world is out of scope. That is be handled by IOMMU in the real world, such as VTd. -- Please do clarify if this is TRUE.
In the real world:
#1: the SMM MUST be non-DMA capable region.
#2: the MMIO MUST be non-DMA capable region.
#3: the stolen memory MIGHT be DMA capable region or non-DMA capable region. It depends upon the silicon design.
#4: the normal OS accessible memory - including ACPI reclaim, ACPI NVS, and reserved memory not included by #3 - MUST be DMA capable region.
As such, IOMMU protection is NOT required for #1 and #2. IOMMU protection MIGHT be required for #3 and MUST be required for #4.
I assume the virtual environment is designed in the same way. Please correct me if I am wrong.
>> That feature (VFIO) does rely on the (physical) IOMMU, and
> > it makes sure that the assigned device can only access physical frames
> > that belong to the virtual machine that the device is assigned to.
[Jiewen] Thank you! Good to know.
I found https://www.kernel.org/doc/Documentation/vfio.txt
Is that what you scribed above?
Anyway, I believe the problem is clear and the solution in real world is clear.
I will leave the virtual world discussion to Alex, Paolo, Laszlo.
If you need any of my input, please let me know.
> > However, as far as I know, VFIO doesn't try to restrict PCI DMA to
> > subsets of guest RAM... I could be wrong about that, I vaguely recall
> > RMRR support, which seems somewhat related.
> >
> > > I agree it is a threat from real hardware perspective. SMM may check
> VTd to make sure the 38000 is blocked.
> > > I doubt if it is a threat in virtual environment. Do we have a way to block
> DMA in virtual environment?
> >
> > I think that would be a VFIO feature.
> >
> > Alex: if we wanted to block PCI(e) DMA to a specific part of guest RAM
> > (expressed with guest-physical RAM addresses), perhaps permanently,
> > perhaps just for a while -- not sure about coordination though --, could
> > VFIO accommodate that (I guess by "punching holes" in the IOMMU page
> > tables)?
>
> It depends. For starters, the vfio mapping API does not allow
> unmapping arbitrary sub-ranges of previous mappings. So the hole you
> want to punch would need to be independently mapped. From there you
> get into the issue of whether this range is a potential DMA target. If
> it is, then this is the path to data corruption. We cannot interfere
> with the operation of the device and we have little to no visibility of
> active DMA targets.
>
> If we're talking about RAM that is never a DMA target, perhaps e820
> reserved memory, then we can make sure certainly MemoryRegions are
> skipped when mapped by QEMU and would expect the guest to never map
> them through a vIOMMU as well. Maybe then it's a question of where
> we're trying to provide security (it might be more difficult if QEMU
> needs to sanitize vIOMMU mappings to actively prevent mapping
> reserved areas).
>
> Is there anything unique about the VM case here? Bare metal SMM needs
> to be concerned about protecting itself from I/O devices that operate
> outside of the realm of SMM mode as well, right? Is something "simple"
> like an AddressSpace switch necessary here, such that an I/O device
> always has a mapping to a safe guest RAM page while the vCPU
> AddressSpace can switch to some protected page? The IOMMU and vCPU
> mappings don't need to be the same. The vCPU is more under our control
> than the assigned device.
>
> FWIW, RMRRs are a VT-d specific mechanism to define an address range as
> persistently, identity mapped for one or more devices. IOW, the device
> would always map that range. I don't think that's what you're after
> here. RMRRs are also an abomination that I hope we never find a
> requirement for in a VM. Thanks,
>
> Alex
next prev parent reply other threads:[~2019-08-17 0:20 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-13 14:16 CPU hotplug using SMM with QEMU+OVMF Laszlo Ersek
2019-08-13 16:09 ` Laszlo Ersek
2019-08-13 16:18 ` Laszlo Ersek
2019-08-14 13:20 ` Yao, Jiewen
2019-08-14 14:04 ` Paolo Bonzini
2019-08-15 9:55 ` Yao, Jiewen
2019-08-15 16:04 ` Paolo Bonzini
2019-08-15 15:00 ` [edk2-devel] " Laszlo Ersek
2019-08-15 16:16 ` Igor Mammedov
2019-08-15 16:21 ` Paolo Bonzini
2019-08-16 2:46 ` Yao, Jiewen
2019-08-16 7:20 ` Paolo Bonzini
2019-08-16 7:49 ` Yao, Jiewen
2019-08-16 20:15 ` Laszlo Ersek
2019-08-16 22:19 ` Alex Williamson
2019-08-17 0:20 ` Yao, Jiewen [this message]
2019-08-18 19:50 ` Paolo Bonzini
2019-08-18 23:00 ` Yao, Jiewen
2019-08-19 14:10 ` Paolo Bonzini
2019-08-21 12:07 ` Laszlo Ersek
2019-08-21 15:48 ` [edk2-rfc] " Michael D Kinney
2019-08-21 17:05 ` Paolo Bonzini
2019-08-21 17:25 ` Michael D Kinney
2019-08-21 17:39 ` Paolo Bonzini
2019-08-21 20:17 ` Michael D Kinney
2019-08-22 6:18 ` Paolo Bonzini
2019-08-22 18:29 ` Laszlo Ersek
2019-08-22 18:51 ` Paolo Bonzini
2019-08-23 14:53 ` Laszlo Ersek
2019-08-22 20:13 ` Michael D Kinney
2019-08-22 17:59 ` Laszlo Ersek
2019-08-22 18:43 ` Paolo Bonzini
2019-08-22 20:06 ` Michael D Kinney
2019-08-22 22:18 ` Paolo Bonzini
2019-08-22 22:32 ` Michael D Kinney
2019-08-22 23:11 ` Paolo Bonzini
2019-08-23 1:02 ` Michael D Kinney
2019-08-23 5:00 ` Yao, Jiewen
2019-08-23 15:25 ` Michael D Kinney
2019-08-24 1:48 ` Yao, Jiewen
2019-08-27 18:31 ` Igor Mammedov
2019-08-29 17:01 ` Laszlo Ersek
2019-08-30 14:48 ` Igor Mammedov
2019-08-30 18:46 ` Laszlo Ersek
2019-09-02 8:45 ` Igor Mammedov
2019-09-02 19:09 ` Laszlo Ersek
2019-09-03 14:53 ` [Qemu-devel] " Igor Mammedov
2019-09-03 17:20 ` Laszlo Ersek
2019-09-04 9:52 ` imammedo
2019-09-05 13:08 ` Laszlo Ersek
2019-09-05 15:45 ` Igor Mammedov
2019-09-05 15:49 ` [PATCH] q35: lpc: allow to lock down 128K RAM at default SMBASE address Igor Mammedov
2019-09-09 19:15 ` Laszlo Ersek
2019-09-09 19:20 ` Laszlo Ersek
2019-09-10 15:58 ` Igor Mammedov
2019-09-11 17:30 ` Laszlo Ersek
2019-09-17 13:11 ` [edk2-devel] " Igor Mammedov
2019-09-17 14:38 ` [staging/branch]: CdePkg - C Development Environment Package Minnow Ware
2019-08-26 15:30 ` [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF Laszlo Ersek
2019-08-27 16:23 ` Igor Mammedov
2019-08-27 20:11 ` Laszlo Ersek
2019-08-28 12:01 ` Igor Mammedov
2019-08-29 16:25 ` Laszlo Ersek
2019-08-30 13:49 ` [Qemu-devel] " Igor Mammedov
2019-08-22 17:53 ` Laszlo Ersek
2019-08-16 20:00 ` Laszlo Ersek
2019-08-15 16:07 ` Igor Mammedov
2019-08-15 16:24 ` Paolo Bonzini
2019-08-16 7:42 ` Igor Mammedov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=74D8A39837DF1E4DA445A8C0B3885C503F761B96@shsmsx102.ccr.corp.intel.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox