public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [edk2-devel] AARCH64 Cacheability Attributes
@ 2025-06-04 17:54 Oliver Smith-Denny via groups.io
  2025-06-05 17:27 ` Ard Biesheuvel via groups.io
  0 siblings, 1 reply; 5+ messages in thread
From: Oliver Smith-Denny via groups.io @ 2025-06-04 17:54 UTC (permalink / raw)
  To: Ard Biesheuvel, Leif Lindholm; +Cc: devel@edk2.groups.io

Hi Ard and Leif,

We have been debugging an issue on an AARCH64 platform that has led us
to believe a UEFI spec update with a new caching type may be needed,
but we wanted to get your input before proposing that.

Diving into the specific issue that led us here first: we have a
platform with an XHCI controller that connects to the PCI
hierarchy through NonDiscoverablePciDeviceDxe, registering as
a non-coherent MMIO device, which prompts that driver to set
up its PciIo structure with the noncoherent routines:
https://github.com/tianocore/edk2/blob/8c04bcc7ed0efbb2e3fde23f787ff0249e62f874/MdeModulePkg/Bus/Pci/NonDiscoverablePciDeviceDxe/NonDiscoverablePciDeviceIo.c#L1098.

When attempting to PXE boot over a USB NIC, SnpDxe ends up having
many different alignment faults trying to access the host DMA
buffers it has set up:
https://github.com/tianocore/edk2/blob/8c04bcc7ed0efbb2e3fde23f787ff0249e62f874/NetworkPkg/SnpDxe/Snp.c#L364

(SnpDxe is poorly architected and we are putting up a PR to resolve
some of the glaring issues, like allocating its internal driver
structure in DMA-able memory).

Tracking this down, this is because the non-coherent APIs exposed
in the PciIo->AllocateBuffer routine of NonDiscoverablePciDeviceDxe
set the attributes of the buffers to what the caller has provided
or if the caller has not provided attributes (as SnpDxe does not,
and can't, it doesn't know what the underlying bus is and what the
cacheability state must be), it will set UC.

ArmMmuLib translates UC to device memory:
https://github.com/tianocore/edk2/blob/8c04bcc7ed0efbb2e3fde23f787ff0249e62f874/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c#L451

Which causes the AP to have stricter alignment requirements (among other
things) and then causes faults with various code patterns in SnpDxe and
the closed source vendor provided UNDI driver, including some patterns
that GCC has created where it tries to optimize writing 0 to the
structure and so does unaligned writes.

We explored various workarounds: setting -mstrict-align (which is a big
hammer, though I don't have exact numbers), using the coherent APIs (I
suspect the platform XHCI driver was copied from somewhere and that it
is actually coherent on this platform), and rearchitecting parts of
SnpDxe.

However, while we have a solution there, this led us to the greater
problem: edk2 (and the UEFI spec) were built for x86 and the
cacheability attributes are x86 ones that we are attempting to
shoehorn into aarch64 ones. I am guessing (though certainly the two
of you would have much better knowledge here) that when ArmPkg was
created, there were many such decisions made to avoid changing the rest
of edk2 and just bolting Arm onto the side. Because some UC memory must
be device memory, all of UC is made into device so that things "just
work", until they don't. I think that having all UC being set to
device memory is a recipe for more alignment faults because we have no
guarantees in higher level drivers that accesses are aligned (or
compiler guarantees).

OSes do of course make the distinction between normal non-cacheable and
device memory. My proposal is to follow this model (and the ARM ARM) and 
update the UEFI spec to include a new cacheability attribute,
EFI_MEMORY_UC_IO (or some such name, I don't really care) which for x86
will just map to the same thing as EFI_MEMORY_UC. On aarch64,
EFI_MEMORY_UC_IO maps to device memory and EFI_MEMORY_UC maps to
normal non-cacheable (an aside is that EFI_MEMORY_WC probably continues
to also map to normal non-cacheable). Then, drivers and the cores are
updated to reflect whether they are setting attributes on true MMIO or
host side DMA buffers (or whatever else).

I believe this alleviates the issues with widespread device memory and
the alignment faults while bringing the UEFI spec and edk2 into better
alignment with the aarch64 architecture.

We have explored a few other possibilities that are more of the bolted
on the side variety, such as the GCD knowning what is mapped MMIO and
so ArmCpuDxe can query the GCD if it sees EFI_MEMORY_UC being passed in
and query if this is actually MMIO or normal memory and map accordingly,
but besides being additional overhead, this feels like another band aid
instead of the holistic solution of actually integrating aarch64 into
the UEFI spec/edk2. Also, there is concern that because the UEFI spec
defines these attributes (and the PI spec references them) we would be
unable to reflect the real state of memory when returning the results of
a query, etc. Furthermore, adding a heuristic for the core to follow
is always a little troubling, in the end, the calling driver knows what
this memory region is being used for and what the cacheability should
be.

Please let us know your thoughts, we may well be missing some important
nuances here, but if you agree with the general direction, we'll take
this to the USWG.

Thanks,
Oliver


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#121391): https://edk2.groups.io/g/devel/message/121391
Mute This Topic: https://groups.io/mt/113471239/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-06-05 20:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-04 17:54 [edk2-devel] AARCH64 Cacheability Attributes Oliver Smith-Denny via groups.io
2025-06-05 17:27 ` Ard Biesheuvel via groups.io
2025-06-05 17:55   ` Oliver Smith-Denny via groups.io
2025-06-05 18:19     ` Ard Biesheuvel via groups.io
2025-06-05 20:21       ` Oliver Smith-Denny via groups.io

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox