From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail05.groups.io (mail05.groups.io [45.79.224.7]) by spool.mail.gandi.net (Postfix) with ESMTPS id 88E81941775 for ; Thu, 5 Jun 2025 17:28:12 +0000 (UTC) DKIM-Signature: a=rsa-sha256; bh=y2v6nU/fW6OT1m6bOB5EMNS14ZwLUrR0TOZJCnkNbrw=; c=relaxed/simple; d=groups.io; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject:To:Cc:Precedence:List-Subscribe:List-Help:Sender:List-Id:Mailing-List:Delivered-To:Resent-Date:Resent-From:Reply-To:List-Unsubscribe-Post:List-Unsubscribe:Content-Type; s=20240830; t=1749144492; v=1; x=1749403690; b=d2w+jKCw4FF/Urh/ut94xZf+PIGf0rQnbR323xos3t5AFTZwSEa+0HkqUWe6rEzV55o4i4zI 7daFlexuxCuPPMV+/RYkw2373uD5nguuGcbfGKbatpZJab6rJErHG2SNjhLSErP6s86GrvRnGn2 bc69S8fb5QLGvpiIrWtw5ujR2Eqi4RWIokQDcxrj2DVkUlVfIaWKMeRcYPzPMdc7+oYDjBk1TPU DbH/K5n9vH34doVvMMV72PqnKJVWERb6l8yrUbbHUAuM5+xHL2iBattgoww8Nj2Io+gnD5fzaG0 dIZNud8sjfd8hzQlbLhq+kemLnzcBVTbR5uaLSTO01GNg== X-Received: by 127.0.0.2 with SMTP id sqovYY7687511xpa7l291HqB; Thu, 05 Jun 2025 10:28:10 -0700 X-Received: from sea.source.kernel.org (sea.source.kernel.org [172.234.252.31]) by mx.groups.io with SMTP id smtpd.web10.13560.1749144490217733330 for ; Thu, 05 Jun 2025 10:28:10 -0700 X-Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by sea.source.kernel.org (Postfix) with ESMTP id 8F091436F2 for ; Thu, 5 Jun 2025 17:28:09 +0000 (UTC) X-Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6D013C4CEE7 for ; Thu, 5 Jun 2025 17:28:09 +0000 (UTC) X-Received: by mail-lf1-f42.google.com with SMTP id 2adb3069b0e04-54b09cb06b0so1483999e87.1 for ; Thu, 05 Jun 2025 10:28:09 -0700 (PDT) X-Forwarded-Encrypted: i=1; AJvYcCUcl1zPwy3gBZcIuDiKiupiPYGPp+Nn1x5IVAYb+skspdvpxPr5JUgycz0al2l74odXyDJ5/w==@edk2.groups.io X-Gm-Message-State: MqzvqHTrjfwMn8BK2YY9BRrEx7686176AA= X-Google-Smtp-Source: AGHT+IH7txw1Ksx9Pihc7VsTEIndjWQG9Od9AsQE7j6aY/jP8q0SJg1YmsMsC2hlSutQRQ0Bm8Fef9MjjHWYlV6+k/A= X-Received: by 2002:a05:6512:3d86:b0:553:2fb1:cfe5 with SMTP id 2adb3069b0e04-55366bd40fdmr2697e87.12.1749144487717; Thu, 05 Jun 2025 10:28:07 -0700 (PDT) MIME-Version: 1.0 References: <7aa61f2f-f17c-438a-b676-7d80b32895ae@linux.microsoft.com> In-Reply-To: <7aa61f2f-f17c-438a-b676-7d80b32895ae@linux.microsoft.com> From: "Ard Biesheuvel via groups.io" Date: Thu, 5 Jun 2025 19:27:55 +0200 X-Gmail-Original-Message-ID: X-Gm-Features: AX0GCFv23EI6LcviuB5glc2aXlDl4jPQgzPjmCRADRzlRbyX48W_RTVMbOVi4YQ Message-ID: Subject: Re: [edk2-devel] AARCH64 Cacheability Attributes To: Oliver Smith-Denny Cc: Leif Lindholm , "devel@edk2.groups.io" Precedence: Bulk List-Subscribe: List-Help: Sender: devel@edk2.groups.io List-Id: Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io Resent-Date: Thu, 05 Jun 2025 10:28:10 -0700 Resent-From: ardb@kernel.org Reply-To: devel@edk2.groups.io,ardb@kernel.org List-Unsubscribe-Post: List-Unsubscribe=One-Click List-Unsubscribe: Content-Type: text/plain; charset="UTF-8" X-GND-Status: LEGIT Authentication-Results: spool.mail.gandi.net; dkim=pass header.d=groups.io header.s=20240830 header.b=d2w+jKCw; dmarc=pass (policy=none) header.from=groups.io; spf=pass (spool.mail.gandi.net: domain of bounce@groups.io designates 45.79.224.7 as permitted sender) smtp.mailfrom=bounce@groups.io On Wed, 4 Jun 2025 at 19:54, Oliver Smith-Denny wrote: > > Hi Ard and Leif, > > We have been debugging an issue on an AARCH64 platform that has led us > to believe a UEFI spec update with a new caching type may be needed, > but we wanted to get your input before proposing that. > > Diving into the specific issue that led us here first: we have a > platform with an XHCI controller that connects to the PCI > hierarchy through NonDiscoverablePciDeviceDxe, registering as > a non-coherent MMIO device, which prompts that driver to set > up its PciIo structure with the noncoherent routines: > https://github.com/tianocore/edk2/blob/8c04bcc7ed0efbb2e3fde23f787ff0249e62f874/MdeModulePkg/Bus/Pci/NonDiscoverablePciDeviceDxe/NonDiscoverablePciDeviceIo.c#L1098. > > When attempting to PXE boot over a USB NIC, SnpDxe ends up having > many different alignment faults trying to access the host DMA > buffers it has set up: > https://github.com/tianocore/edk2/blob/8c04bcc7ed0efbb2e3fde23f787ff0249e62f874/NetworkPkg/SnpDxe/Snp.c#L364 > > (SnpDxe is poorly architected and we are putting up a PR to resolve > some of the glaring issues, like allocating its internal driver > structure in DMA-able memory). > > Tracking this down, this is because the non-coherent APIs exposed > in the PciIo->AllocateBuffer routine of NonDiscoverablePciDeviceDxe > set the attributes of the buffers to what the caller has provided > or if the caller has not provided attributes (as SnpDxe does not, > and can't, it doesn't know what the underlying bus is and what the > cacheability state must be), it will set UC. > > ArmMmuLib translates UC to device memory: > https://github.com/tianocore/edk2/blob/8c04bcc7ed0efbb2e3fde23f787ff0249e62f874/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibCore.c#L451 > > Which causes the AP to have stricter alignment requirements (among other > things) and then causes faults with various code patterns in SnpDxe and > the closed source vendor provided UNDI driver, including some patterns > that GCC has created where it tries to optimize writing 0 to the > structure and so does unaligned writes. > > We explored various workarounds: setting -mstrict-align (which is a big > hammer, though I don't have exact numbers), using the coherent APIs (I > suspect the platform XHCI driver was copied from somewhere and that it > is actually coherent on this platform), and rearchitecting parts of > SnpDxe. > > However, while we have a solution there, this led us to the greater > problem: edk2 (and the UEFI spec) were built for x86 and the > cacheability attributes are x86 ones that we are attempting to > shoehorn into aarch64 ones. I am guessing (though certainly the two > of you would have much better knowledge here) that when ArmPkg was > created, there were many such decisions made to avoid changing the rest > of edk2 and just bolting Arm onto the side. Because some UC memory must > be device memory, all of UC is made into device so that things "just > work", until they don't. I think that having all UC being set to > device memory is a recipe for more alignment faults because we have no > guarantees in higher level drivers that accesses are aligned (or > compiler guarantees). > > OSes do of course make the distinction between normal non-cacheable and > device memory. My proposal is to follow this model (and the ARM ARM) and > update the UEFI spec to include a new cacheability attribute, > EFI_MEMORY_UC_IO (or some such name, I don't really care) which for x86 > will just map to the same thing as EFI_MEMORY_UC. On aarch64, > EFI_MEMORY_UC_IO maps to device memory and EFI_MEMORY_UC maps to > normal non-cacheable (an aside is that EFI_MEMORY_WC probably continues > to also map to normal non-cacheable). Then, drivers and the cores are > updated to reflect whether they are setting attributes on true MMIO or > host side DMA buffers (or whatever else). > > I believe this alleviates the issues with widespread device memory and > the alignment faults while bringing the UEFI spec and edk2 into better > alignment with the aarch64 architecture. > > We have explored a few other possibilities that are more of the bolted > on the side variety, such as the GCD knowning what is mapped MMIO and > so ArmCpuDxe can query the GCD if it sees EFI_MEMORY_UC being passed in > and query if this is actually MMIO or normal memory and map accordingly, > but besides being additional overhead, this feels like another band aid > instead of the holistic solution of actually integrating aarch64 into > the UEFI spec/edk2. Also, there is concern that because the UEFI spec > defines these attributes (and the PI spec references them) we would be > unable to reflect the real state of memory when returning the results of > a query, etc. Furthermore, adding a heuristic for the core to follow > is always a little troubling, in the end, the calling driver knows what > this memory region is being used for and what the cacheability should > be. > > Please let us know your thoughts, we may well be missing some important > nuances here, but if you agree with the general direction, we'll take > this to the USWG. > Hi, Thanks for elaborating. First of all, I would assume that the device is actually non-coherent, or it wouldn't work at all using the non-coherent routines, given that these perform cache invalidation on buffers used for inbound DMA, which would nuke any data the device would have put there via its cacheable mapping (line 1453 in the same file). I think it was a mistake for ARM platforms to ever use UC|WC||WT|WB by default for describing conventional memory. On ARM, only WC and WB make sense for RAM, and UC should be used for MMIO (i.e., memory ranges that are not backed any kind of RAM but by device registers where both reads and writes may have side effects). Note that not only misaligned accesses are problematic, also DC ZVA instructions will fail, and these are heavily used to accelerate SetMem() Simply dropping UC from the code that declares the DRAM in the platform startup code should fix the issue entirely, afaict. (Note that ArmVirtQemu et al declare their system memory as only WB capable and nothing else - this is needed because KVM is fundamentally coherent, as the VMM uses cacheable mappings to access guest memory) Adding more memory types to the spec is not going to solve anything, I'm afraid. We already have some dubious ISA mask types that were added without proper motivation, and I'd rather have fewer types (e.g., on ARM, WT is also just normal non-cacheable under the hood in most cases and I don't have a clue how it is intended to be used). Instead, we should stop using UC for conventional memory. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#121398): https://edk2.groups.io/g/devel/message/121398 Mute This Topic: https://groups.io/mt/113471239/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io] -=-=-=-=-=-=-=-=-=-=-=-