From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, julien@xen.org,
Igor Druzhinin <igor.druzhinin@citrix.com>,
xen-devel@lists.xenproject.org
Cc: jordan.l.justen@intel.com, ard.biesheuvel@arm.com,
anthony.perard@citrix.com
Subject: Re: [edk2-devel] [PATCH] OvmfPkg/XenPlatformPei: Use CPUID to get physical address width on Xen
Date: Tue, 19 Jan 2021 15:49:34 +0100 [thread overview]
Message-ID: <0d7ad7aa-cfa9-5f2c-26e3-6b371737f6bc@redhat.com> (raw)
In-Reply-To: <2a806f26-04f7-a96b-522c-118ac94df8c0@xen.org>
On 01/19/21 10:37, Julien Grall wrote:
> Hi Igor,
>
> On 13/01/2021 03:42, Igor Druzhinin wrote:
>> We faced a problem with passing through a PCI device with 64GB BAR to
>> UEFI guest. The BAR is expectedly programmed into 64-bit PCI aperture at
>> 64G address which pushes physical address space to 37 bits. That is above
>> 36-bit width that OVMF exposes currently to a guest without tweaking
>> PcdPciMmio64Size knob.
>>
>> The reverse calculation using this knob was inhereted from QEMU-KVM
>> platform
>> code where it serves the purpose of finding max accessible physical
>> address without necessary trusting emulated CPUID physbits value (that
>> could
>> be different from host physbits). On Xen we expect to use CPUID policy
>> to level the data correctly to prevent situations with guest physbits >
>> host physbits e.g. across migrations.
>>
>> The next aspect raising concern - resource consumption for DXE IPL
>> page tables
>> and time required to map the whole address space in case of using CPUID
>> bits directly. That could be mitigated by enabling support for 1G pages
>> in DXE IPL configuration. 1G pages are available on most CPUs produced in
>> the last 10 years and those without don't have many phys bits.
>>
>> Remove all the redundant code now (including PcdPciMmio64.. handling
>> that's
>> not used on Xen anyway) and grab physbits directly from CPUID that should
>> be what baremetal UEFI systems do.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>
> Reviewed-by: Julien Grall <julien@xen.org>
I'm going to hold this patch until Anthony responds in this thread as well:
http://mid.mail-archive.com/63cf6053-9787-d8cf-db18-982ebcab1780@citrix.com
https://edk2.groups.io/g/devel/message/70541
Thanks
Laszlo
>
> Cheers,
>
>> ---
>> OvmfPkg/OvmfXen.dsc | 3 +
>> OvmfPkg/XenPlatformPei/MemDetect.c | 166
>> +++----------------------------------
>> 2 files changed, 15 insertions(+), 154 deletions(-)
>>
>> diff --git a/OvmfPkg/OvmfXen.dsc b/OvmfPkg/OvmfXen.dsc
>> index 7d31e88..8ae6ed0 100644
>> --- a/OvmfPkg/OvmfXen.dsc
>> +++ b/OvmfPkg/OvmfXen.dsc
>> @@ -444,6 +444,9 @@
>> ## Xen vlapic's frequence is 100 MHz
>> gEfiMdePkgTokenSpaceGuid.PcdFSBClock|100000000
>> + # We populate DXE IPL tables with 1G pages preferably on Xen
>> + gEfiMdeModulePkgTokenSpaceGuid.PcdUse1GPageTable|TRUE
>> +
>>
>> ################################################################################
>>
>> #
>> # Pcd Dynamic Section - list of all EDK II PCD Entries defined by
>> this Platform
>> diff --git a/OvmfPkg/XenPlatformPei/MemDetect.c
>> b/OvmfPkg/XenPlatformPei/MemDetect.c
>> index 1f81eee..1970b63 100644
>> --- a/OvmfPkg/XenPlatformPei/MemDetect.c
>> +++ b/OvmfPkg/XenPlatformPei/MemDetect.c
>> @@ -172,175 +172,33 @@ GetSystemMemorySizeBelow4gb (
>> return (UINT32) (((UINTN)((Cmos0x35 << 8) + Cmos0x34) << 16) +
>> SIZE_16MB);
>> }
>> -
>> -STATIC
>> -UINT64
>> -GetSystemMemorySizeAbove4gb (
>> - )
>> -{
>> - UINT32 Size;
>> - UINTN CmosIndex;
>> -
>> - //
>> - // In PVH case, there is no CMOS, we have to calculate the memory size
>> - // from parsing the E820
>> - //
>> - if (XenPvhDetected ()) {
>> - UINT64 HighestAddress;
>> -
>> - HighestAddress = GetHighestSystemMemoryAddress (FALSE);
>> - ASSERT (HighestAddress == 0 || HighestAddress >= BASE_4GB);
>> -
>> - if (HighestAddress >= BASE_4GB) {
>> - HighestAddress -= BASE_4GB;
>> - }
>> -
>> - return HighestAddress;
>> - }
>> -
>> - //
>> - // CMOS 0x5b-0x5d specifies the system memory above 4GB MB.
>> - // * CMOS(0x5d) is the most significant size byte
>> - // * CMOS(0x5c) is the middle size byte
>> - // * CMOS(0x5b) is the least significant size byte
>> - // * The size is specified in 64kb chunks
>> - //
>> -
>> - Size = 0;
>> - for (CmosIndex = 0x5d; CmosIndex >= 0x5b; CmosIndex--) {
>> - Size = (UINT32) (Size << 8) + (UINT32) CmosRead8 (CmosIndex);
>> - }
>> -
>> - return LShiftU64 (Size, 16);
>> -}
>> -
>> -
>> -/**
>> - Return the highest address that DXE could possibly use, plus one.
>> -**/
>> -STATIC
>> -UINT64
>> -GetFirstNonAddress (
>> - VOID
>> - )
>> -{
>> - UINT64 FirstNonAddress;
>> - UINT64 Pci64Base, Pci64Size;
>> - RETURN_STATUS PcdStatus;
>> -
>> - FirstNonAddress = BASE_4GB + GetSystemMemorySizeAbove4gb ();
>> -
>> - //
>> - // If DXE is 32-bit, then we're done; PciBusDxe will degrade 64-bit
>> MMIO
>> - // resources to 32-bit anyway. See DegradeResource() in
>> - // "PciResourceSupport.c".
>> - //
>> -#ifdef MDE_CPU_IA32
>> - if (!FeaturePcdGet (PcdDxeIplSwitchToLongMode)) {
>> - return FirstNonAddress;
>> - }
>> -#endif
>> -
>> - //
>> - // Otherwise, in order to calculate the highest address plus one,
>> we must
>> - // consider the 64-bit PCI host aperture too. Fetch the default size.
>> - //
>> - Pci64Size = PcdGet64 (PcdPciMmio64Size);
>> -
>> - if (Pci64Size == 0) {
>> - if (mBootMode != BOOT_ON_S3_RESUME) {
>> - DEBUG ((DEBUG_INFO, "%a: disabling 64-bit PCI host aperture\n",
>> - __FUNCTION__));
>> - PcdStatus = PcdSet64S (PcdPciMmio64Size, 0);
>> - ASSERT_RETURN_ERROR (PcdStatus);
>> - }
>> -
>> - //
>> - // There's nothing more to do; the amount of memory above 4GB fully
>> - // determines the highest address plus one. The memory hotplug
>> area (see
>> - // below) plays no role for the firmware in this case.
>> - //
>> - return FirstNonAddress;
>> - }
>> -
>> - //
>> - // SeaBIOS aligns both boundaries of the 64-bit PCI host aperture
>> to 1GB, so
>> - // that the host can map it with 1GB hugepages. Follow suit.
>> - //
>> - Pci64Base = ALIGN_VALUE (FirstNonAddress, (UINT64)SIZE_1GB);
>> - Pci64Size = ALIGN_VALUE (Pci64Size, (UINT64)SIZE_1GB);
>> -
>> - //
>> - // The 64-bit PCI host aperture should also be "naturally" aligned.
>> The
>> - // alignment is determined by rounding the size of the aperture
>> down to the
>> - // next smaller or equal power of two. That is, align the aperture
>> by the
>> - // largest BAR size that can fit into it.
>> - //
>> - Pci64Base = ALIGN_VALUE (Pci64Base, GetPowerOfTwo64 (Pci64Size));
>> -
>> - if (mBootMode != BOOT_ON_S3_RESUME) {
>> - //
>> - // The core PciHostBridgeDxe driver will automatically add this
>> range to
>> - // the GCD memory space map through our PciHostBridgeLib
>> instance; here we
>> - // only need to set the PCDs.
>> - //
>> - PcdStatus = PcdSet64S (PcdPciMmio64Base, Pci64Base);
>> - ASSERT_RETURN_ERROR (PcdStatus);
>> - PcdStatus = PcdSet64S (PcdPciMmio64Size, Pci64Size);
>> - ASSERT_RETURN_ERROR (PcdStatus);
>> -
>> - DEBUG ((DEBUG_INFO, "%a: Pci64Base=0x%Lx Pci64Size=0x%Lx\n",
>> - __FUNCTION__, Pci64Base, Pci64Size));
>> - }
>> -
>> - //
>> - // The useful address space ends with the 64-bit PCI host aperture.
>> - //
>> - FirstNonAddress = Pci64Base + Pci64Size;
>> - return FirstNonAddress;
>> -}
>> -
>> -
>> /**
>> - Initialize the mPhysMemAddressWidth variable, based on guest RAM size.
>> + Initialize the mPhysMemAddressWidth variable, based on CPUID data.
>> **/
>> VOID
>> AddressWidthInitialization (
>> VOID
>> )
>> {
>> - UINT64 FirstNonAddress;
>> + UINT32 RegEax;
>> - //
>> - // As guest-physical memory size grows, the permanent PEI RAM
>> requirements
>> - // are dominated by the identity-mapping page tables built by the
>> DXE IPL.
>> - // The DXL IPL keys off of the physical address bits advertized in
>> the CPU
>> - // HOB. To conserve memory, we calculate the minimum address width
>> here.
>> - //
>> - FirstNonAddress = GetFirstNonAddress ();
>> - mPhysMemAddressWidth = (UINT8)HighBitSet64 (FirstNonAddress);
>> -
>> - //
>> - // If FirstNonAddress is not an integral power of two, then we need an
>> - // additional bit.
>> - //
>> - if ((FirstNonAddress & (FirstNonAddress - 1)) != 0) {
>> - ++mPhysMemAddressWidth;
>> + AsmCpuid (0x80000000, &RegEax, NULL, NULL, NULL);
>> + if (RegEax >= 0x80000008) {
>> + AsmCpuid (0x80000008, &RegEax, NULL, NULL, NULL);
>> + mPhysMemAddressWidth = (UINT8) RegEax;
>> + } else {
>> + mPhysMemAddressWidth = 36;
>> }
>> //
>> - // The minimum address width is 36 (covers up to and excluding 64
>> GB, which
>> - // is the maximum for Ia32 + PAE). The theoretical architecture
>> maximum for
>> - // X64 long mode is 52 bits, but the DXE IPL clamps that down to 48
>> bits. We
>> - // can simply assert that here, since 48 bits are good enough for
>> 256 TB.
>> + // IA-32e paging translates 48-bit linear addresses to 52-bit
>> physical addresses.
>> //
>> - if (mPhysMemAddressWidth <= 36) {
>> - mPhysMemAddressWidth = 36;
>> + ASSERT (mPhysMemAddressWidth <= 52);
>> + if (mPhysMemAddressWidth > 48) {
>> + mPhysMemAddressWidth = 48;
>> }
>> - ASSERT (mPhysMemAddressWidth <= 48);
>> }
>> -
>> /**
>> Calculate the cap for the permanent PEI memory.
>> **/
>>
>
next prev parent reply other threads:[~2021-01-19 14:49 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-13 3:42 [PATCH] OvmfPkg/XenPlatformPei: Use CPUID to get physical address width on Xen Igor Druzhinin
2021-01-13 3:55 ` Laszlo Ersek
2021-01-19 9:37 ` julien
2021-01-19 14:49 ` Laszlo Ersek [this message]
2021-01-19 15:52 ` [edk2-devel] " Anthony PERARD
2021-01-19 17:23 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0d7ad7aa-cfa9-5f2c-26e3-6b371737f6bc@redhat.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox