From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail02.groups.io (mail02.groups.io [66.175.222.108]) by spool.mail.gandi.net (Postfix) with ESMTPS id CB0DB9409DE for ; Fri, 12 Jan 2024 02:10:16 +0000 (UTC) DKIM-Signature: a=rsa-sha256; bh=xSJ1UiSMNjy68FJ3B/K3+0ThEJV0gT/uMxftNiOO1tA=; c=relaxed/simple; d=groups.io; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject:To:Cc:Precedence:List-Subscribe:List-Help:Sender:List-Id:Mailing-List:Delivered-To:Reply-To:List-Unsubscribe-Post:List-Unsubscribe:Content-Type:Content-Transfer-Encoding; s=20140610; t=1705025415; v=1; b=XtqpXa8GBAq1y3OTYqF7uAOJ1WcdpNqta9XoJFNFewURLxphIKMRaQIVqYwspnpzysQ7oEU1 hlKOi8f7r8+0w8RWUvVLQRCt7Gnpl/p9Qk49etDz/92pvVINnDb8+3HumzZEKu2pEHjYSAILp1L GR5ejMY9y5nsNxdhnS3WFnu4= X-Received: by 127.0.0.2 with SMTP id hFb5YY7687511xKnNa6NBaxI; Thu, 11 Jan 2024 18:10:15 -0800 X-Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) by mx.groups.io with SMTP id smtpd.web11.10248.1705025414824421362 for ; Thu, 11 Jan 2024 18:10:14 -0800 X-Received: by mail-yb1-f179.google.com with SMTP id 3f1490d57ef6-dbe78430946so5152350276.0 for ; Thu, 11 Jan 2024 18:10:14 -0800 (PST) X-Gm-Message-State: dLUiBGdYeUMPtfRKay6wX5esx7686176AA= X-Google-Smtp-Source: AGHT+IFaXvJj+PJ6RhJhctVbBvEcukswR3CWVt0EGqL4loxGvstV2cdqPLIOpZin5wgWXmKXaE2741pEUaFw2nCPnh8= X-Received: by 2002:a05:6902:27c7:b0:dbd:50e4:6802 with SMTP id ec7-20020a05690227c700b00dbd50e46802mr154318ybb.127.1705025413825; Thu, 11 Jan 2024 18:10:13 -0800 (PST) MIME-Version: 1.0 References: <44ca139f-4d78-4322-b5b6-8e9788bb7486@os.amperecomputing.com> <2ad16043-754e-3bb9-3a4a-702d9a50bf63@redhat.com> <45b95719-f1fc-dbc6-a4cc-a022d691844c@redhat.com> <30901646-905c-798c-d088-255498028fff@redhat.com> In-Reply-To: <30901646-905c-798c-d088-255498028fff@redhat.com> From: "Pedro Falcato" Date: Fri, 12 Jan 2024 02:10:02 +0000 Message-ID: Subject: Re: [edk2-devel] Memory Attribute for depex section To: Laszlo Ersek Cc: devel@edk2.groups.io, nhi@os.amperecomputing.com, "ardb+tianocore@kernel.org" , Andrew Fish Precedence: Bulk List-Subscribe: List-Help: Sender: devel@edk2.groups.io List-Id: Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io Reply-To: devel@edk2.groups.io,pedro.falcato@gmail.com List-Unsubscribe-Post: List-Unsubscribe=One-Click List-Unsubscribe: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-GND-Status: LEGIT Authentication-Results: spool.mail.gandi.net; dkim=pass header.d=groups.io header.s=20140610 header.b=XtqpXa8G; dmarc=fail reason="SPF not aligned (relaxed), DKIM not aligned (relaxed)" header.from=gmail.com (policy=none); spf=pass (spool.mail.gandi.net: domain of bounce@groups.io designates 66.175.222.108 as permitted sender) smtp.mailfrom=bounce@groups.io On Thu, Jan 11, 2024 at 8:46=E2=80=AFAM Laszlo Ersek wr= ote: > > On 1/10/24 22:50, Pedro Falcato wrote: > > On Wed, Jan 10, 2024 at 1:45=E2=80=AFPM Laszlo Ersek > > wrote: > >> > >> (+ Andrew) > >> > >> On 1/10/24 14:09, Laszlo Ersek wrote: > >> > >>> I think that keeping the depex section read-only is valuable, so I'd > >>> rule out #2. I'd also not start with option #1 -- copying the depex > >>> to heap memory, just so this optimization can succeed. I'd simply > >>> start by removing the optimization, and measuring how much driver > >>> dispatch slows down in practice, on various platforms. > >>> > >>> Can you try this? (I have only build-tested and "uncrustified" it.) > >>> > >>> The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus > >>> it CONST-ifies the Iterator pointer (which points into the DEPEX > >>> section), so that the compiler catch any possible accesses at *build > >>> time* that would write to the write-protected DEPEX memory area. > >> > >> On a tangent: the optimization in question highlights a more general > >> problem, namely that the DXE (and possibly MM/SMM) protocol databases > >> are considered slow, for some access patterns. > >> > >> Edk2 implements those protocol databases with linked lists, where > >> lookup costs O(n) operations (average and worst cases). And protocol > >> lookups are quite frequent (for example, in depex evaluations, they > >> could be considered "particularly frequent"). > >> > >> (1) The "Tasks" wiki page mentions something *similar* (but not the > >> same); see > >> > >> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--g= cd-scalability > >> > >> The description is: "The DXE core's DataHub and GCD (Global Coherency > >> Domain) layers don't scale well as the number of data items gets > >> large, since they are based on simple linked lists. Find better data > >> structures." > > > > How large do they usually get? What's the worst case? > > No idea. > > > > >> > >> The same might apply more or less to the protocol database > >> implementation. > >> > >> (2) More to the point, Andrew Fish reported earlier that at Apple, > >> they had rewritten the DXE protocol database, using the red-black > >> tree OrderedCollectionLib that I had contributed previously to edk2 > >> -- and they saw significant performance improvements. > >> > >> So upstreaming that feature to edk2 could be very valuable. > >> (Red-black trees have O(log(n)) time cost (worst case) for lookup, > >> insertion and deletion, and O(n) cost for iterating through the whole > >> data structure.) > > > > FWIW, can we do better than an RB tree? They're notoriously cache > > unfriendly... > > Sure, if someone contributes a different data structure that is suitable > for the task :) Hey, I happen to have written one! https://github.com/heatd/Onyx/blob/master/kernel/kernel/radix.cpp It just needs some C'ifying, then some EFI'ing on top, but I'm fairly confident in its stability. > > RB trees may be cache unfriendly, but the functionality they provide > with O(log(n)) worst case performance is amazing. You can use them as > read-write associate arrays, sorted lists supporting forward and > backward traversal (in fact tools for sorting), priority queues, etc. > > When I contributed the code, edk2 didn't have any associative array > type, so something generic that would address the widest range of use > cases looked like a good idea. (And indeed the library has been well > applied in several of those use cases since, in various parts of edk2 -- > for sorting, uniqueness-checking, async interface token tracking & > lookups.) Definitely, I use it in Ext4Dxe too, it's great! (Although I do have a small nit in that it allocates nodes itself, and there's no way around it, but oh well...) > > This is not an argument against a more suitable data structure of > course. Just pointing out that the RB tree has worked well thus far. > E.g., under the BZ link below, Andrew mentions a diagnostic tool that > creates 3000 handles. Looking up an element in a 3000-element list would > cost 1500 iterations on average; using a balanced binary search tree it > might cost ~11 iterations. Assuming that linked lists and linked binary > search trees are similarly cache-unfriendly, that's a ~136 factor of > improvement. I would guess that binary trees are more cache-and-speculation unfriendly as you need to decide which branch you're going down. But yes, even a RB tree would be a great improvement over the linked list. > > > > >> > >> Let me see if I can find the bugzilla ticket... > >> > >> Ah, I got it. Apologies, I misremembered: Andrew's comment was not > >> about the protocol database, but about the handle database. Here it > >> is: > >> > >> https://bugzilla.tianocore.org/show_bug.cgi?id=3D988#c7 > >> > >> (the BZ is still CONFIRMED btw...) > >> > >> Still, I think it must be related in a way. Namely, an EFI handle > >> exists if and only if at least one protocol interface is installed on > >> it. If you uninstall the last protocol interface from a handle, then > >> the handle is destroyed -- in fact that's the *only* way to destroy a > >> handle, to my understanding. See > >> EFI_BOOT_SERVICES.UninstallProtocolInterface() in the UEFI spec: "If > >> the last protocol interface is removed from a handle, the handle is > >> freed and is no longer valid". Furthermore, calling > >> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is > >> how one *creates* new handles. > >> > >> So given how handles and protocol interfaces are conceptually > >> interlinked, an rbtree-based protocol DB might have to maintain > >> multiple rbtrees internally (for the ability to search the database > >> quickly with different types of "keys"). I don't have a design ready > >> in my mind at all (I'm not that familiar with the *current*, > >> list-based implementation to begin with!). Upstreaming Apple's > >> (experimental?) code could prove very helpful. > > > > Ok, some thoughts: > > > > For the handle database, if we made EFI_HANDLE an integer instead, we > > could very easily use something similar to a radix tree (see Linux's > > xarray). This would provide O(log(n)) lookups and insertion with a > > much higher fan-out, without needing CoreValidateHandle's awful O(n) > > lookup every time it gets an EFI_HANDLE. You'd just look up the handle > > in the tree and if NULL, it's invalid. [1] > > EFI_HANDLE is a typedef to (VOID*) per UEFI spec -- section 2.3.1 "Data > Types" says: "A collection of related interfaces. Type VOID *" --, so > EFI_HANDLE can't be redefined, it would break compatibility with > everything. > > Internally, it could be cast to UINTN, and then those 32- or 64-bit long > bit strings could be considered as keys for a radix tree. But I'm not > sure if this would still be as useful as the two-level radix tree (with > a fanout of 64) that you describe. > > (If we wanted to map the EFI_HANDLEs to the index space [0..4095], then > we'd just land back at the original problem: storing the EFI_HANDLEs in > an associative data structure, for example in a hash map. We could find > a good hash function, but because it would mathematically reduce the key > space from 64-bit pointers to 12-bit indices, collisions would be > theoretically inevitable, and the data structure couldn't avoid > *storing* the mapping. This is not a general argument against using a > hash table, it just means that we'd have to provide a good initial table > size, and then deal with resizing cost whenever necessary.) My idea was to make each handle an index - like a file descriptor - AFAIK there's no reason why it *needs* to be an actual pointer. We'd allocate indices when creating a handle, and return that (casted to VO= ID*). Say, for a naive allocation scheme and the data structure I linked above: /* 0 (NULL) must be reserved, as it's a bad EFI_HANDLE result. We could also return HandleId + 1, and - 1 when indexing back to the radix tree. Either option works */ STATIC UINTN mNextHandleId =3D 1; STATIC radix_tree mHandleTree; STATIC IHANDLE * CreateHandle ( OUT UINTN *OutHandleId ) { IHANDLE *Handle =3D AllocatePool(sizeof(IHANDLE)); *OutHandleId =3D mNextHandleId++; mHandleTree.insert(*OutHandleId, Handle); return Handle; } /* And then transform CoreValidateHandle into */ STATIC EFI_STATUS CoreGetHandle ( IN EFI_HANDLE Handle, OUT IHANDLE **OutHandle ) { IHANDLE *Result; if (Handle =3D=3D NULL) return EFI_INVALID_PARAMETER; expected result =3D mHandleTree.get((UINTN) Handle); if (result.has_error()) return EFI_INVALID_PARAMETER; /* Handle not in the tree */ *OutHandle =3D (IHANDLE *) result.value(); return EFI_SUCCESS; } -- Hopefully the snippet gets my point across. Insertion, searching and iteration should all be super fast (O(logFanOut(N))). Note that one could also just use a standard resizable vector (a-la std::vector), but I don't know if that maps well to EFI handles and memory allocation. > > > For the protocol database, you'd replace the linked list with a simple > > hashtable, hashed by protocol. Something as simple as LIST_ENTRY > > mProtocolHashtable[64]; would probably be enough to fan out most of > > the problems (I think? How many different protocols are there?) > > I can answer this question reasonably well, I think. I have a script > that collects symbolic names of GUIDs from the edk2 tree (plus hardcodes > a number of well-known but not edk2 GUIDs), and creates a "sed" script > out of them. Then another script uses this "sed" script for filtering > edk2 logs -- the idea being to replace the whole bunch of logged GUIDs > with their symbolic names. That makes logs much easier to read. > > The generator script is written such a way that the generated "sed" > script only grows over time; put differently, this "dictionary" of > name<->GUID associations never forgets, it only picks up new GUIDs. The > "sed" script (=3D the dictionary file) consists of entries like > > s:FFB19961-79CC-4684-84A8-C31B0A2BBE82:[EarlyFdt16550SerialPortHookLib]:i= g > s:FFB56F09-65E3-4462-A799-2F0D1930D38C:[DxeContextBufferModuleConfigLib]:= ig > s:FFE06BDD-6107-46A6-7BB2-5A9C7EC5275C:[EfiAcpiTableProtocol]:ig > s:FFF12B8D-7696-4C8B-A985-2747075B4F50:[EfiSystemNvDataFv]:ig > > it's sorted uniquely by GUID. > > Right now, it has 3074 entries. (People like generating GUIDs! :) In > PI/UEFI/edk2, *everything* is a GUID, not just protocols!) > > So using 64 buckets would hash ~50 different protocol GUIDs to any given > bucket (although, again, not all of those three thousand GUIDs are > *protocol* GUIDs). 1024 buckets might work better. Oh! That's nice! Taking your protocol figure of 515 into account, 512 or even 1024 buckets should probably take care of that. I should note that I find it super hard to get a concrete idea on performance for EFI firmware without adequate tooling - I meant to write a standalone flamegraph tool a few weeks back (even posted in edk2-devel), but, as far as I could tell, the architectural timer protocol couldn't get me the interrupt frame[1]. Until then, whether any of this radix tree vs RB tree vs flat array stuff really matters... I find it hard to say. [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance monitoring interrupts, and I couldn't freely use any of them :^) --=20 Pedro -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#113638): https://edk2.groups.io/g/devel/message/113638 Mute This Topic: https://groups.io/mt/103594587/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io] -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-