Sorry need some more time to digest this…. First thoughts.

1) The actual performance issue we hit was the explosion of CoreValidateHandle() calls as the number of protocols got large for some diags. The newer handles tended to be at the end of the list if I remember correctly. 
  a) It looks like CoreValidateHandle() is the only place  gHandleList was walked, as the handle info is crossed referenced in the protocol database. So that is why we changed that.
2) If the issue at hand is MM why not just drop the optimizations?
  b) If we have so many MM protocols and handles that seems like its own problem? 
3) The issue is patching the grammar in place, why can’t we just make a copy for the dispatcher grammer, and operate on the copy. Maybe via a copy on 1st update strategy? 

Thanks,

Andrew Fish

On Jan 10, 2024, at 5:45 AM, Laszlo Ersek <lersek@redhat.com> wrote:

(+ Andrew)

On 1/10/24 14:09, Laszlo Ersek wrote:

I think that keeping the depex section read-only is valuable, so I'd
rule out #2. I'd also not start with option #1 -- copying the depex to
heap memory, just so this optimization can succeed. I'd simply start by
removing the optimization, and measuring how much driver dispatch slows
down in practice, on various platforms.

Can you try this? (I have only build-tested and "uncrustified" it.)

The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
CONST-ifies the Iterator pointer (which points into the DEPEX section),
so that the compiler catch any possible accesses at *build time* that
would write to the write-protected DEPEX memory area.

On a tangent: the optimization in question highlights a more general
problem, namely that the DXE (and possibly MM/SMM) protocol databases
are considered slow, for some access patterns.

Edk2 implements those protocol databases with linked lists, where lookup
costs O(n) operations (average and worst cases). And protocol lookups
are quite frequent (for example, in depex evaluations, they could be
considered "particularly frequent").

(1) The "Tasks" wiki page mentions something *similar* (but not the
same); see

https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability

The description is: "The DXE core's DataHub and GCD (Global Coherency
Domain) layers don't scale well as the number of data items gets large,
since they are based on simple linked lists. Find better data structures."

The same might apply more or less to the protocol database implementation.

(2) More to the point, Andrew Fish reported earlier that at Apple, they
had rewritten the DXE protocol database, using the red-black tree
OrderedCollectionLib that I had contributed previously to edk2 -- and
they saw significant performance improvements.

So upstreaming that feature to edk2 could be very valuable. (Red-black
trees have O(log(n)) time cost (worst case) for lookup, insertion and
deletion, and O(n) cost for iterating through the whole data structure.)

Let me see if I can find the bugzilla ticket...

Ah, I got it. Apologies, I misremembered: Andrew's comment was not about
the protocol database, but about the handle database. Here it is:

https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7

(the BZ is still CONFIRMED btw...)

Still, I think it must be related in a way. Namely, an EFI handle exists
if and only if at least one protocol interface is installed on it. If
you uninstall the last protocol interface from a handle, then the handle
is destroyed -- in fact that's the *only* way to destroy a handle, to my
understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the
UEFI spec: "If the last protocol interface is removed from a handle, the
handle is freed and is no longer valid". Furthermore, calling
InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
how one *creates* new handles.

So given how handles and protocol interfaces are conceptually
interlinked, an rbtree-based protocol DB might have to maintain multiple
rbtrees internally (for the ability to search the database quickly with
different types of "keys"). I don't have a design ready in my mind at
all (I'm not that familiar with the *current*, list-based implementation
to begin with!). Upstreaming Apple's (experimental?) code could prove
very helpful.

Laszlo







_._,_._,_

Groups.io Links:

You receive all messages sent to this group.

View/Reply Online (#113644) | | Mute This Topic | New Topic
Your Subscription | Contact Group Owner | Unsubscribe [rebecca@openfw.io]

_._,_._,_