public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Pedro Falcato" <pedro.falcato@gmail.com>
To: devel@edk2.groups.io, lersek@redhat.com
Cc: nhi@os.amperecomputing.com,
	 "ardb+tianocore@kernel.org" <ardb+tianocore@kernel.org>,
	Andrew Fish <afish@apple.com>
Subject: Re: [edk2-devel] Memory Attribute for depex section
Date: Wed, 10 Jan 2024 21:50:54 +0000	[thread overview]
Message-ID: <CAKbZUD0iTi7Y92FdHR_D646sBroAapLZy_nnfNf7GEriuy8zBg@mail.gmail.com> (raw)
In-Reply-To: <45b95719-f1fc-dbc6-a4cc-a022d691844c@redhat.com>

On Wed, Jan 10, 2024 at 1:45 PM Laszlo Ersek <lersek@redhat.com> wrote:
>
> (+ Andrew)
>
> On 1/10/24 14:09, Laszlo Ersek wrote:
>
> > I think that keeping the depex section read-only is valuable, so I'd
> > rule out #2. I'd also not start with option #1 -- copying the depex to
> > heap memory, just so this optimization can succeed. I'd simply start by
> > removing the optimization, and measuring how much driver dispatch slows
> > down in practice, on various platforms.
> >
> > Can you try this? (I have only build-tested and "uncrustified" it.)
> >
> > The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
> > CONST-ifies the Iterator pointer (which points into the DEPEX section),
> > so that the compiler catch any possible accesses at *build time* that
> > would write to the write-protected DEPEX memory area.
>
> On a tangent: the optimization in question highlights a more general
> problem, namely that the DXE (and possibly MM/SMM) protocol databases
> are considered slow, for some access patterns.
>
> Edk2 implements those protocol databases with linked lists, where lookup
> costs O(n) operations (average and worst cases). And protocol lookups
> are quite frequent (for example, in depex evaluations, they could be
> considered "particularly frequent").
>
> (1) The "Tasks" wiki page mentions something *similar* (but not the
> same); see
>
> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability
>
> The description is: "The DXE core's DataHub and GCD (Global Coherency
> Domain) layers don't scale well as the number of data items gets large,
> since they are based on simple linked lists. Find better data structures."

How large do they usually get? What's the worst case?

>
> The same might apply more or less to the protocol database implementation.
>
> (2) More to the point, Andrew Fish reported earlier that at Apple, they
> had rewritten the DXE protocol database, using the red-black tree
> OrderedCollectionLib that I had contributed previously to edk2 -- and
> they saw significant performance improvements.
>
> So upstreaming that feature to edk2 could be very valuable. (Red-black
> trees have O(log(n)) time cost (worst case) for lookup, insertion and
> deletion, and O(n) cost for iterating through the whole data structure.)

FWIW, can we do better than an RB tree? They're notoriously cache unfriendly...

>
> Let me see if I can find the bugzilla ticket...
>
> Ah, I got it. Apologies, I misremembered: Andrew's comment was not about
> the protocol database, but about the handle database. Here it is:
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7
>
> (the BZ is still CONFIRMED btw...)
>
> Still, I think it must be related in a way. Namely, an EFI handle exists
> if and only if at least one protocol interface is installed on it. If
> you uninstall the last protocol interface from a handle, then the handle
> is destroyed -- in fact that's the *only* way to destroy a handle, to my
> understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the
> UEFI spec: "If the last protocol interface is removed from a handle, the
> handle is freed and is no longer valid". Furthermore, calling
> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
> how one *creates* new handles.
>
> So given how handles and protocol interfaces are conceptually
> interlinked, an rbtree-based protocol DB might have to maintain multiple
> rbtrees internally (for the ability to search the database quickly with
> different types of "keys"). I don't have a design ready in my mind at
> all (I'm not that familiar with the *current*, list-based implementation
> to begin with!). Upstreaming Apple's (experimental?) code could prove
> very helpful.

Ok, some thoughts:

For the handle database, if we made EFI_HANDLE an integer instead, we
could very easily use something similar to a radix tree (see Linux's
xarray). This would provide O(log(n)) lookups and insertion with a
much higher fan-out, without needing CoreValidateHandle's awful O(n)
lookup every time it gets an EFI_HANDLE. You'd just look up the handle
in the tree and if NULL, it's invalid. [1]

For the protocol database, you'd replace the linked list with a simple
hashtable, hashed by protocol. Something as simple as LIST_ENTRY
mProtocolHashtable[64]; would probably be enough to fan out most of
the problems (I think? How many different protocols are there?)

-- 
Pedro

[1] One could be wary of doing this lookup for every EFI_HANDLE
instead of having the pointer directly. But this is much more
efficient than needing to iterate a linked list to validate a pointer.
Considering an xarray-like radix tree as previously described, two
levels with a fan-out of 64 could describe indices 0 - 4096 (64 * 64),
which is much more efficient than chasing through pointers (worst
case) 4096 times until we find the handle.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113541): https://edk2.groups.io/g/devel/message/113541
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



  reply	other threads:[~2024-01-10 21:51 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-08 10:11 [edk2-devel] Memory Attribute for depex section Nhi Pham via groups.io
2024-01-10  0:48 ` Nhi Pham via groups.io
2024-01-10 13:09 ` Laszlo Ersek
2024-01-10 13:45   ` Laszlo Ersek
2024-01-10 21:50     ` Pedro Falcato [this message]
2024-01-11  8:46       ` Laszlo Ersek
2024-01-11 11:03         ` Laszlo Ersek
2024-01-12  2:10         ` Pedro Falcato
2024-01-12  9:35           ` Laszlo Ersek
2024-01-12 14:46             ` Pedro Falcato
2024-01-12 16:37               ` Michael D Kinney
2024-01-12 18:56                 ` Andrew Fish via groups.io
2024-01-12 19:04                   ` Michael D Kinney
2024-01-12 19:26                     ` Andrew Fish via groups.io
2024-01-12 18:50               ` Andrew Fish via groups.io
2024-01-12  2:44     ` Andrew Fish via groups.io
2024-01-12  9:45       ` Laszlo Ersek
2024-01-15 13:07         ` Nhi Pham via groups.io
2024-01-15 14:04           ` Ard Biesheuvel
2024-01-15 19:00             ` Laszlo Ersek
2024-01-18  6:00               ` Nhi Pham via groups.io
2024-01-18 14:49                 ` Laszlo Ersek
2024-01-19  4:43                   ` Nhi Pham via groups.io
2024-01-19 13:54                     ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKbZUD0iTi7Y92FdHR_D646sBroAapLZy_nnfNf7GEriuy8zBg@mail.gmail.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox