public inbox for devel@edk2.groups.io
* [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
@ 2021-08-30  7:49 jiaqi.gao
  2021-08-31  6:10 ` Gerd Hoffmann
  0 siblings, 1 reply; 7+ messages in thread
From: jiaqi.gao @ 2021-08-30  7:49 UTC (permalink / raw)
  To: devel@edk2.groups.io
  Cc: Wang, Jian J, Wu, Hao A, Bi, Dandan, gaoliming@byosoft.com.cn,
	Ni, Ray, Kinney, Michael D, Yao, Jiewen, Zimmer, Vincent,
	Justen, Jordan L, Xu, Min M


Motivation:
Intel TDX provides hardware-protected memory encryption and integrity for multi-tenant environments. A TD-guest uses TDCALL to accept shared memory as private. However, accepting all of system memory may take a long time, which adversely affects boot time. We introduce a Lazy Page Accept method, in which TDVF accepts only part of the memory and the rest is left for the OS to accept.

Issue:
In some cases, the amount of memory that TDVF needs to accept cannot be determined up front. For example, the kernel/initrd can be large and may exceed the memory that has already been accepted. We therefore have to provide a way to accept memory dynamically.

We propose three options to address this issue:

  1.  Modifying the memory allocation logic (MdeModulePkg/Core/Dxe/Mem) to accept memory when EFI_OUT_OF_RESOURCES occurs.
  2.  Changing the QEMU direct boot and GRUB flows to accept memory when loading the image fails and returns EFI_OUT_OF_RESOURCES.
  3.  Adding AcceptMemory() as a boot service interface to simplify the implementation of option 2.

The underlying implementation of accepting memory is provided by a protocol that can be installed by architecture-specific drivers such as TdxDxe.
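
As an illustration of option 1, here is a minimal sketch in plain C of an allocator that accepts more memory and retries when it runs out. It is a toy model, not EDK2 code: page_pool, accept_pages() and alloc_pages_lazy() are hypothetical names, and accept_pages() only stands in for whatever the architecture-specific protocol would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, all names hypothetical: memory is a linear range of pages.
 * Pages below accepted_end are usable; pages from accepted_end up to
 * total_pages are still unaccepted. */
typedef struct {
    uint64_t next_free;     /* first page not yet handed out */
    uint64_t accepted_end;  /* one past the last accepted page */
    uint64_t total_pages;   /* accepted + unaccepted pages */
    uint64_t accept_calls;  /* how often the accept hook ran */
} page_pool;

enum { ALLOC_OK = 0, ALLOC_OUT_OF_RESOURCES = 1 };

/* Stand-in for the architecture-specific accept protocol. Here it only
 * moves the accepted/unaccepted boundary. */
static uint64_t accept_pages(page_pool *p, uint64_t want)
{
    uint64_t left = p->total_pages - p->accepted_end;
    uint64_t got = want < left ? want : left;
    p->accepted_end += got;
    p->accept_calls++;
    return got;
}

/* Plain bump allocation from already-accepted memory. */
static int alloc_accepted(page_pool *p, uint64_t pages, uint64_t *first)
{
    if (pages > p->accepted_end - p->next_free)
        return ALLOC_OUT_OF_RESOURCES;
    *first = p->next_free;
    p->next_free += pages;
    return ALLOC_OK;
}

/* Option 1: on out-of-resources, accept the shortfall and retry once. */
static int alloc_pages_lazy(page_pool *p, uint64_t pages, uint64_t *first)
{
    assert(p != NULL && first != NULL);
    if (alloc_accepted(p, pages, first) == ALLOC_OK)
        return ALLOC_OK;
    uint64_t shortfall = pages - (p->accepted_end - p->next_free);
    if (accept_pages(p, shortfall) < shortfall)
        return ALLOC_OUT_OF_RESOURCES;
    return alloc_accepted(p, pages, first);
}
```

With this shape the caller (a boot loader, for instance) never sees the accept step; it only sees an out-of-resources result once no unaccepted memory is left.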

The details are in the design slides: https://edk2.groups.io/g/devel/files/Designs/2021/0830/TDVF%20Lazy%20Page%20Accept%28v0.7%29.pptx



I am seeking your feedback on this proposal. Thank you!



References:

[1] tdx-virtual-firmware-design-guide-rev-1.pdf. https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf

[2] A POC of Lazy Page Accept in TDVF. https://github.com/mxu9/edk2/pull/9/commits

[3] Add new unaccepted memory type in mu_basecore. https://github.com/microsoft/mu_basecore/pull/66





Best Regards,

Gao Jiaqi





* Re: [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
  2021-08-30  7:49 [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF jiaqi.gao
@ 2021-08-31  6:10 ` Gerd Hoffmann
  2021-09-01  7:23   ` Gao, Jiaqi
  0 siblings, 1 reply; 7+ messages in thread
From: Gerd Hoffmann @ 2021-08-31  6:10 UTC (permalink / raw)
  To: devel, jiaqi.gao
  Cc: Wang, Jian J, Wu, Hao A, Bi, Dandan, gaoliming@byosoft.com.cn,
	Ni, Ray, Kinney, Michael D, Yao, Jiewen, Zimmer, Vincent,
	Justen, Jordan L, Xu, Min M

On Mon, Aug 30, 2021 at 07:49:27AM +0000, Gao, Jiaqi wrote:
> Motivation: Intel TDX provides memory encryption and integrity
> multi-tenancy for hardware protection. A TD-guest uses TDCALL to
> accept shared memory as private. However, accept whole system memory
> may take a long time which will have an adverse impact on the boot
> time performance.

What order of magnitude are we talking about?
How long would it take to accept 2G of memory (all memory below 4G on
qemu q35)?

> We propose three options to address this issue:

>   1.  Modifying the memory allocation (MdeModulePkg/Core/Dxe/Mem) logic to accept memory when OUT_OF_RESOURCE occurs.
>   2.  Changing the process flow of QEMU direct boot and GRUB to accept memory when loading the image fails and returns OUT_OF_RESOURCE.
>   3.  Adding AcceptMemory() as a boot service interface to simplify the implementation of option 2.
> Underlying implementation of accepting memory is provided by a protocol which can be installed by architecture-specific drivers such as TdxDxe.

(1) looks best to me.  From a design point of view it is very
reasonable for the core memory manager to also manage the
accepted/unaccepted state of memory.  It avoids duplicating the
"oom -> try AcceptMemoryResource()" logic in bootloaders and
also covers other oom situations.

take care,
  Gerd



* Re: [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
  2021-08-31  6:10 ` Gerd Hoffmann
@ 2021-09-01  7:23   ` Gao, Jiaqi
  2021-09-03  0:31     ` Yao, Jiewen
  0 siblings, 1 reply; 7+ messages in thread
From: Gao, Jiaqi @ 2021-09-01  7:23 UTC (permalink / raw)
  To: devel@edk2.groups.io, kraxel@redhat.com
  Cc: Wang, Jian J, Wu, Hao A, Bi, Dandan, gaoliming@byosoft.com.cn,
	Ni, Ray, Kinney, Michael D, Yao, Jiewen, Zimmer, Vincent,
	Justen, Jordan L, Xu, Min M


On Tuesday, August 31, 2021 2:11 PM, Gerd Hoffmann wrote:
> > Motivation: Intel TDX provides memory encryption and integrity
> > multi-tenancy for hardware protection. A TD-guest uses TDCALL to
> > accept shared memory as private. However, accept whole system memory
> > may take a long time which will have an adverse impact on the boot
> > time performance.
> 
> Which order of magnitude do we talk about?
> How long would it take to accept 2G of memory (all memory below 4g on
> qemu q35) ?

Here is some data from different guest configurations; accepting takes less time with more CPU cores.
Accepting all of 2048MB of memory takes about 4 seconds on a 1-core guest, down to about 1.5 seconds on a 4-core guest.
Accepting all of 4096MB of memory takes about 8 seconds on a 1-core guest, down to about 3 seconds on a 4-core guest.
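
To put these figures in perspective, the following small C sketch derives the average accept throughput they imply. The helper names are hypothetical, and any extrapolation to larger guests assumes the rate stays constant, which the data above does not confirm.

```c
#include <assert.h>

/* Average accept throughput in MB/s for a given size and duration. */
static double accept_rate_mb_per_s(double size_mb, double seconds)
{
    assert(seconds > 0.0);
    return size_mb / seconds;
}

/* Estimated time in seconds to accept size_mb at a given rate.
 * Assumption: the rate does not change with memory size. */
static double accept_time_s(double size_mb, double rate_mb_per_s)
{
    assert(rate_mb_per_s > 0.0);
    return size_mb / rate_mb_per_s;
}
```

Both data points give the same rates: 2048 / 4 = 4096 / 8 = 512 MB/s on one core, and 2048 / 1.5 = 4096 / 3, roughly 1365 MB/s, on four cores. Under the constant-rate assumption, accept_time_s(16384.0, 512.0) would put a 16 GB single-core guest at about 32 seconds.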




* Re: [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
  2021-09-01  7:23   ` Gao, Jiaqi
@ 2021-09-03  0:31     ` Yao, Jiewen
  2021-09-03  5:56       ` Gerd Hoffmann
  2021-09-03 12:34       ` Gao, Jiaqi
  0 siblings, 2 replies; 7+ messages in thread
From: Yao, Jiewen @ 2021-09-03  0:31 UTC (permalink / raw)
  To: Gao, Jiaqi, devel@edk2.groups.io, kraxel@redhat.com
  Cc: Wang, Jian J, Wu, Hao A, Bi, Dandan, gaoliming@byosoft.com.cn,
	Ni, Ray, Kinney, Michael D, Zimmer, Vincent, Justen, Jordan L,
	Xu, Min M

Hi
It is a good idea to have a protocol to abstract TDX and SEV.

I think we need to clearly document which services can be used in EFI_ACCEPT_MEMORY.
For example, can we use the memory allocation service, GCD service, or MP service?
In https://github.com/mxu9/edk2/pull/9/commits, I could not find the producer of EFI_ACCEPT_MEMORY. Would you please give me a hint?

A couple of dependency issues:
If EFI_ACCEPT_MEMORY cannot use the MP service, then there might be a performance concern.
If it uses the MP service, then we need to ensure the MP service is installed before any memory accept request.
I think we need a way to ensure there is enough memory *before* the protocol is installed, right?


Also, would you please clarify how to fix the comment below?
  //
  // Fix me! CoreAddMemorySpace() should not be called in the allocation process
  // because it will allocate memory for GCD map entry.
  //

Thank you
Yao Jiewen




* Re: [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
  2021-09-03  0:31     ` Yao, Jiewen
@ 2021-09-03  5:56       ` Gerd Hoffmann
  2021-09-03 12:34         ` Gao, Jiaqi
  2021-09-03 12:34       ` Gao, Jiaqi
  1 sibling, 1 reply; 7+ messages in thread
From: Gerd Hoffmann @ 2021-09-03  5:56 UTC (permalink / raw)
  To: Yao, Jiewen
  Cc: Gao, Jiaqi, devel@edk2.groups.io, Wang, Jian J, Wu, Hao A,
	Bi, Dandan, gaoliming@byosoft.com.cn, Ni, Ray, Kinney, Michael D,
	Zimmer, Vincent, Justen, Jordan L, Xu, Min M

On Fri, Sep 03, 2021 at 12:31:57AM +0000, Yao, Jiewen wrote:
> Hi
> It is good idea to have a protocol to abstract TDX and SEV.
> 
> I think we need clearly document what service can be used in EFI_ACCEPT_MEMORY.
> For example, can we use memory allocation service, GCD service, or MP service?

Likewise the expected behavior, for example whether the protocol
driver or the memory core should update the GCD maps.

> Couple of dependency issue:
> If EFI_ACCEPT_MEMORY cannot use MP service, then there might be performance concern.
> If it uses MP service, then we need ensure MP service is installed earlier and before memory accept request.
> I think we need a way to ensure there is enough memory *before* the protocol is installed, right?

Yes.  Same for booting the OS, the kernel must have enough memory so it
can boot up to the point where the driver handling the lazy page accept
loads.

We should also define how we hand over memory range state from one stage
to the other (see also my reply to the sev-snp series posted yesterday)
so ovmf knows which ranges are accepted/validated already.

take care,
  Gerd



* Re: [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
  2021-09-03  0:31     ` Yao, Jiewen
  2021-09-03  5:56       ` Gerd Hoffmann
@ 2021-09-03 12:34       ` Gao, Jiaqi
  1 sibling, 0 replies; 7+ messages in thread
From: Gao, Jiaqi @ 2021-09-03 12:34 UTC (permalink / raw)
  To: Yao, Jiewen, devel@edk2.groups.io, kraxel@redhat.com
  Cc: Wang, Jian J, Wu, Hao A, Bi, Dandan, gaoliming@byosoft.com.cn,
	Ni, Ray, Kinney, Michael D, Zimmer, Vincent, Justen, Jordan L,
	Xu, Min M

Hi,

> 
> I think we need clearly document what service can be used in
> EFI_ACCEPT_MEMORY.
> For example, can we use memory allocation service, GCD service, or MP
> service?

The GCD service is provided by EFI_DXE_SERVICES and can be used by EFI_ACCEPT_MEMORY (so updating the GCD memory map in the protocol is an option).
The memory allocation service can be used too.
We are not using the MP service for now. If we need it, I think we can register a protocol notify to install EFI_ACCEPT_MEMORY when EFI_MP_SERVICES_PROTOCOL is ready.
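
The ordering described here (install the accept protocol only once MP services are available) can be sketched with a toy protocol database in plain C. Everything below is hypothetical and models only the notify pattern; in EDK2 the usual helper for this would be EfiCreateProtocolNotifyEvent() from UefiLib, which this sketch does not call.

```c
#include <assert.h>
#include <stddef.h>

/* Toy protocol database, all names hypothetical. It models only the
 * ordering: a callback registered for a protocol runs when that protocol
 * is installed. */
enum { PROTO_MP_SERVICES, PROTO_ACCEPT_MEMORY, PROTO_COUNT };

typedef void (*notify_fn)(void *ctx);

typedef struct {
    int       installed[PROTO_COUNT];
    notify_fn notify[PROTO_COUNT];
    void     *notify_ctx[PROTO_COUNT];
} proto_db;

/* Register a callback; run it at once if the protocol is already there. */
static void register_notify(proto_db *db, int proto, notify_fn fn, void *ctx)
{
    assert(proto >= 0 && proto < PROTO_COUNT);
    if (db->installed[proto]) {
        fn(ctx);
        return;
    }
    db->notify[proto] = fn;
    db->notify_ctx[proto] = ctx;
}

/* Mark a protocol as installed and run its pending callback, if any. */
static void install_protocol(proto_db *db, int proto)
{
    assert(proto >= 0 && proto < PROTO_COUNT);
    db->installed[proto] = 1;
    if (db->notify[proto] != NULL) {
        notify_fn fn = db->notify[proto];
        db->notify[proto] = NULL;   /* one-shot */
        fn(db->notify_ctx[proto]);
    }
}

/* Callback: MP services are ready, so publish the accept protocol. */
static void on_mp_services_ready(void *ctx)
{
    install_protocol((proto_db *)ctx, PROTO_ACCEPT_MEMORY);
}
```

A driver would call register_notify(&db, PROTO_MP_SERVICES, on_mp_services_ready, &db) at entry; the accept protocol then appears only after install_protocol(&db, PROTO_MP_SERVICES) has run, regardless of driver dispatch order.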

>
> In https://github.com/mxu9/edk2/pull/9/commits, I do not find the
> producer of EFI_ACCEPT_MEMORY, would you please give me some hint?
> 

EFI_ACCEPT_MEMORY is installed by TdxDxe in our POC: https://github.com/mxu9/edk2/pull/9/commits/45974892883847f9c0397ca8efcc86a0318b2c6e

> Couple of dependency issue:
> If EFI_ACCEPT_MEMORY cannot use MP service, then there might be
> performance concern.
> If it uses MP service, then we need ensure MP service is installed earlier and
> before memory accept request.
> I think we need a way to ensure there is enough memory *before* the
> protocol is installed, right?
> 

Since the memory needed by the protocol is a fixed, known amount that will not change, I think it can be ensured.

> 
> Also, would you please clarify how to fix below comment?
>   //
>   // Fix me! CoreAddMemorySpace() should not be called in the allocation
> process
>   // because it will allocate memory for GCD map entry.
>   //
> 

A possible solution is to split the update of the Memory Map from the update of the GCD Memory Space Map: the newly accepted memory range is first added to the Memory Map (through CoreAddRange(), an internal function in page management) so that it is available for allocation, and then CoreAddMemorySpace() can be called safely.
This method will cause a duplicate entry in the memory map, so we have to fix CoreAddRange() to avoid adding the same range repeatedly.

Alternatively, we can use an approach similar to mMapStack[] in page management for GCD map entry allocation too.
I'm not sure these are reasonable and logical solutions, so I'm trying to find another way.
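
The second idea (a statically reserved set of map entries, so that recording a range never calls the allocator) can be sketched as follows. This is a plain C toy under assumed names (range_map, record_range(), RESERVE_ENTRIES); it is only loosely analogous to how mMapStack[] is used and is not the EDK2 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RESERVE_ENTRIES 4   /* hypothetical size of the static reserve */

typedef struct map_entry {
    uint64_t base;
    uint64_t length;
    struct map_entry *next;
} map_entry;

typedef struct {
    map_entry  reserve[RESERVE_ENTRIES]; /* static storage, no allocator */
    size_t     used;                     /* reserve entries consumed */
    map_entry *head;                     /* recorded ranges, newest first */
} range_map;

/* Record a newly accepted range without calling the allocator.
 * Returns 0 on success, -1 if the reserve is exhausted. */
static int record_range(range_map *m, uint64_t base, uint64_t length)
{
    assert(m != NULL);
    if (m->used == RESERVE_ENTRIES)
        return -1;
    map_entry *e = &m->reserve[m->used++];
    e->base = base;
    e->length = length;
    e->next = m->head;
    m->head = e;
    return 0;
}

/* Number of reserve entries in use; these could be migrated to normal
 * storage later, once the allocator is safe to call again. */
static size_t pending_entries(const range_map *m)
{
    return m->used;
}
```

The trade-off is a hard cap on how many ranges can be recorded while the allocator is unavailable, so the reserve size would have to cover the worst case.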


Thank you,
Jiaqi



* Re: [edk2-devel] [RFC] Design review for Lazy Page Accept in TDVF
  2021-09-03  5:56       ` Gerd Hoffmann
@ 2021-09-03 12:34         ` Gao, Jiaqi
  0 siblings, 0 replies; 7+ messages in thread
From: Gao, Jiaqi @ 2021-09-03 12:34 UTC (permalink / raw)
  To: kraxel@redhat.com, Yao, Jiewen
  Cc: devel@edk2.groups.io, Wang, Jian J, Wu, Hao A, Bi, Dandan,
	gaoliming@byosoft.com.cn, Ni, Ray, Kinney, Michael D,
	Zimmer, Vincent, Justen, Jordan L, Xu, Min M


Hi,

>
> Likewise the expected behavior.  For example whenever the protocol driver
> or the memory core should update the GCD maps.
>
Yes, EFI_DXE_SERVICES, which contains CoreAddMemorySpace() and CoreRemoveMemorySpace(), can be used by EFI_ACCEPT_MEMORY.
>
> Yes.  Same for booting the OS, the kernel must have enough memory so it
> can boot up to the point where the driver handling the lazy page accept loads.
>
> We should also define how we hand over memory range state from one
> stage to the other (see also my reply to the sev-snp series posted yesterday)
> so ovmf knows which ranges are accepted/validated already.
>
The resource HOB type EFI_RESOURCE_MEMORY_UNACCEPTED is used to indicate an unaccepted memory range and is passed from the HOB producer to DXE.
There is also a new type, EfiUnacceptedMemory, in EFI_MEMORY_TYPE to pass this information to the OS; it has already been added to the UEFI spec.
You can see the details on the Definitions page of the slides: https://edk2.groups.io/g/devel/files/Designs/2021/0830/TDVF%20Lazy%20Page%20Accept%28v0.7%29.pptx
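
The hand-over described above amounts to carrying the unaccepted state from resource descriptors into the memory map seen by the OS. A minimal sketch in plain C follows; the enums and structs are hypothetical stand-ins for the real HOB and memory map types, reduced to the fields needed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the resource HOB and memory map types. */
typedef enum { RES_SYSTEM_MEMORY, RES_MEMORY_UNACCEPTED } resource_type;
typedef enum { MEM_CONVENTIONAL, MEM_UNACCEPTED } memory_type;

typedef struct { resource_type type; uint64_t base; uint64_t length; } resource_desc;
typedef struct { memory_type type; uint64_t base; uint64_t length; } memmap_desc;

/* Translate resource descriptors from the HOB producer into memory map
 * entries for the OS, preserving the unaccepted state of each range.
 * Returns the number of entries written (at most cap). */
static size_t build_memory_map(const resource_desc *res, size_t count,
                               memmap_desc *out, size_t cap)
{
    assert(res != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < count && n < cap; i++) {
        out[n].type = (res[i].type == RES_MEMORY_UNACCEPTED)
                          ? MEM_UNACCEPTED : MEM_CONVENTIONAL;
        out[n].base = res[i].base;
        out[n].length = res[i].length;
        n++;
    }
    return n;
}

/* Total bytes the OS still has to accept. */
static uint64_t unaccepted_bytes(const memmap_desc *map, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        if (map[i].type == MEM_UNACCEPTED)
            total += map[i].length;
    return total;
}
```

In this model, any range the firmware accepts on demand would have its descriptor split or retyped before the map is built, so the OS only ever sees the memory that is genuinely still unaccepted.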


Thank you,
Jiaqi


