public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
* [edk2-devel] Memory Attribute for depex section
@ 2024-01-08 10:11 Nhi Pham via groups.io
  2024-01-10  0:48 ` Nhi Pham via groups.io
  2024-01-10 13:09 ` Laszlo Ersek
  0 siblings, 2 replies; 24+ messages in thread
From: Nhi Pham via groups.io @ 2024-01-08 10:11 UTC (permalink / raw)
  To: ardb+tianocore@kernel.org, devel@edk2.groups.io

Hi Ard, all,

Could you please help explain how the depex section in an image is 
mapped in terms of memory attribute?

As my observation, dispatcher locates[1] the depex section inside the 
module image and write[2] an evaluated data to the depex if necessary 
for scheduled boot order. The problem that the depex section is now in 
RO+X memory due to a part of the module image, so a writing to depex 
would cause data abort. I'm unsure whether this issue is generic in EDK2 
or not.

I think of two approaches:

#1 Relocate the depex section to heap memory for dependency evaluation?

#2 EDK2 build tool to support granting write permission for depex section.

[1] StandaloneMmPkg/Core/FwVol.c:236
[2] StandaloneMmPkg/Core/Dependency.c:256

Thanks,
Nhi


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113373): https://edk2.groups.io/g/devel/message/113373
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-08 10:11 [edk2-devel] Memory Attribute for depex section Nhi Pham via groups.io
@ 2024-01-10  0:48 ` Nhi Pham via groups.io
  2024-01-10 13:09 ` Laszlo Ersek
  1 sibling, 0 replies; 24+ messages in thread
From: Nhi Pham via groups.io @ 2024-01-10  0:48 UTC (permalink / raw)
  To: ardb+tianocore@kernel.org, devel@edk2.groups.io, discuss

++ discuss@edk2.groups.io

On 1/8/2024 5:11 PM, Nhi Pham wrote:
> Hi Ard, all,
> 
> Could you please help explain how the depex section in an image is
> mapped in terms of memory attribute?
> 
> As my observation, dispatcher locates[1] the depex section inside the
> module image and write[2] an evaluated data to the depex if necessary
> for scheduled boot order. The problem that the depex section is now in
> RO+X memory due to a part of the module image, so a writing to depex
> would cause data abort. I'm unsure whether this issue is generic in EDK2
> or not.
> 
> I think of two approaches:
> 
> #1 Relocate the depex section to heap memory for dependency evaluation?
> 
> #2 EDK2 build tool to support granting write permission for depex section.
> 
> [1] StandaloneMmPkg/Core/FwVol.c:236
> [2] StandaloneMmPkg/Core/Dependency.c:256
> 
> Thanks,
> Nhi



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113497): https://edk2.groups.io/g/devel/message/113497
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-08 10:11 [edk2-devel] Memory Attribute for depex section Nhi Pham via groups.io
  2024-01-10  0:48 ` Nhi Pham via groups.io
@ 2024-01-10 13:09 ` Laszlo Ersek
  2024-01-10 13:45   ` Laszlo Ersek
  1 sibling, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-10 13:09 UTC (permalink / raw)
  To: devel, nhi, ardb+tianocore@kernel.org

Hi,

On 1/8/24 11:11, Nhi Pham via groups.io wrote:
> Hi Ard, all,
>
> Could you please help explain how the depex section in an image is
> mapped in terms of memory attribute?
>
> As my observation, dispatcher locates[1] the depex section inside the
> module image and write[2] an evaluated data to the depex if necessary
> for scheduled boot order. The problem that the depex section is now in
> RO+X memory due to a part of the module image, so a writing to depex
> would cause data abort. I'm unsure whether this issue is generic in
> EDK2 or not.
>
> I think of two approaches:
>
> #1 Relocate the depex section to heap memory for dependency
> #evaluation?
>
> #2 EDK2 build tool to support granting write permission for depex
> #section.
>
> [1] StandaloneMmPkg/Core/FwVol.c:236
> [2] StandaloneMmPkg/Core/Dependency.c:256

here's my understanding of the issue.

In Platform Init 1.8, section II-10.7 describes the depex instruction
set for the DXE DISPATCHER. In particular, section II-10.7.3 describes
the PUSH opcode as follows:

  PUSH <Protocol GUID>

  Pushes a Boolean value onto the stack. If the GUID is present in the
  handle database, then a TRUE is pushed onto the stack. If the GUID is
  not present in the handle database, then a FALSE is pushed onto the
  stack. The test for the GUID in the handle database may be performed
  with the Boot Service LocateProtocol().

This basically says that every time the dispatcher sees a PUSH opcode in
a depex, it has to look up the protocol GUID in the protocol database.

Now, as far as I can tell, edk2's DXE Core implementation wanted to
optimize this. (And this goes back to ancient commit 28a00297189c, from
2007.) Here's the idea AIUI:

- Assume the protocol is not found, i.e. we push a FALSE. This will
likely mean that the driver whose DEPEX contains this PUSH opcode cannot
be dispatched just yet. In other words, we're going to have to
re-evaluate this PUSH at least once more, possibly multiple times. We
cannot optimize anything here, because the needed protocol is not
present yet, but it may become available later. When we next evaluate
the DEPEX for this driver, we'll check again if the protocol has become
available by then.

- Assume the protocol is found, i.e. we push a TRUE. The optimization
here is that we assume the protocol will not *disappear*, once
available. The evaluation of the driver's *entire* DEPEX may still
result in FALSE, so it's not guaranteed that the driver can be
dispatched right now. However, given our assumption that the protocol
we've just found will not disappear during DXE driver dispatch, we might
want to "cache" the current successful protocol lookup, and when we
*next* evaluate the DEPEX for this same driver (assuming we cannot
dispatch it right now due to something else missing from its DEPEX), we
don't want to perform the *same* protocol lookup again -- we'll just
want to remember it was already there last time.

And the way the DXE core seems to implement this optimization (i.e., how
it "remembers success") is that it patches the DEPEX in-place: it
replaces the PUSH opcode with a special (edk2-specific) opcode called
EFI_DEP_REPLACE_TRUE (in addition to pushing the TRUE of course, to the
eval stack). This opcode is functionally identical to the plain (and
standard) TRUE opcode, so the next time the dispatcher evaluates the
same depex and sees EFI_DEP_REPLACE_TRUE it will just push TRUE, the
difference with the TRUE opcode is that EFI_DEP_REPLACE_TRUE also
provides "room" for the -- otherwise useless -- original protocol GUID
that used to be the argument of the PUSH opcode. The dispatcher ignores
the protocol GUID on EFI_DEP_REPLACE_TRUE, beyond logging it.

EFI_DEP_REPLACE_TRUE is documented in the code as follows
("MdeModulePkg/Core/Dxe/DxeMain.h"):

///
/// EFI_DEP_REPLACE_TRUE - Used to dynamically patch the dependency expression
///                        to save time.  A EFI_DEP_PUSH is evaluated one an
///                        replaced with EFI_DEP_REPLACE_TRUE. If PI spec's Vol 2
///                        Driver Execution Environment Core Interface use 0xff
///                        as new DEPEX opcode. EFI_DEP_REPLACE_TRUE should be
///                        defined to a new value that is not conflicting with PI spec.
///
#define EFI_DEP_REPLACE_TRUE  0xff

This documentation is hard to understand due to English grammar errors.
It means to say:

"Used to dynamically patch the dependency expression to save time.  An
EFI_DEP_PUSH opcode is evaluated *once*, *and* replaced with
EFI_DEP_REPLACE_TRUE. If PI spec's Vol 2 Driver Execution Environment
Core Interface *starts using* 0xff as new DEPEX opcode *in the future*,
*then* EFI_DEP_REPLACE_TRUE should be *redefined* to a new value that is
not conflicting with *said new* PI spec."

Over time, this optimization (hack) has made its way into the
traditional PiSmmCore, and finally into the standalone SMM core,
apparently.

In summary, this seems like a performance optimization, and should make
no functional difference. If we remove it, then driver dispatch could
become a bit slower (because we'd re-evaluate such PUSH opcodes too that
had previously succeeded in locating the given protocol GUID), but then
we'd not try to overwrite the depex section of the driver image in
memory.

I think that keeping the depex section read-only is valuable, so I'd
rule out #2. I'd also not start with option #1 -- copying the depex to
heap memory, just so this optimization can succeed. I'd simply start by
removing the optimization, and measuring how much driver dispatch slows
down in practice, on various platforms.

Can you try this? (I have only build-tested and "uncrustified" it.)

The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
CONST-ifies the Iterator pointer (which points into the DEPEX section),
so that the compiler catch any possible accesses at *build time* that
would write to the write-protected DEPEX memory area.

-----------------------------
diff --git a/StandaloneMmPkg/Core/Dependency.c b/StandaloneMmPkg/Core/Dependency.c
index 440fe3e45238..2bcb07d34666 100644
--- a/StandaloneMmPkg/Core/Dependency.c
+++ b/StandaloneMmPkg/Core/Dependency.c
@@ -13,16 +13,6 @@

 #include "StandaloneMmCore.h"

-///
-/// EFI_DEP_REPLACE_TRUE - Used to dynamically patch the dependency expression
-///                        to save time.  A EFI_DEP_PUSH is evaluated one an
-///                        replaced with EFI_DEP_REPLACE_TRUE. If PI spec's Vol 2
-///                        Driver Execution Environment Core Interface use 0xff
-///                        as new DEPEX opcode. EFI_DEP_REPLACE_TRUE should be
-///                        defined to a new value that is not conflicting with PI spec.
-///
-#define EFI_DEP_REPLACE_TRUE  0xff
-
 ///
 /// Define the initial size of the dependency expression evaluation stack
 ///
@@ -170,12 +160,12 @@ MmIsSchedulable (
   IN  EFI_MM_DRIVER_ENTRY  *DriverEntry
   )
 {
-  EFI_STATUS  Status;
-  UINT8       *Iterator;
-  BOOLEAN     Operator;
-  BOOLEAN     Operator2;
-  EFI_GUID    DriverGuid;
-  VOID        *Interface;
+  EFI_STATUS   Status;
+  CONST UINT8  *Iterator;
+  BOOLEAN      Operator;
+  BOOLEAN      Operator2;
+  EFI_GUID     DriverGuid;
+  VOID         *Interface;

   Operator  = FALSE;
   Operator2 = FALSE;
@@ -253,8 +243,7 @@ MmIsSchedulable (
           Status = PushBool (FALSE);
         } else {
           DEBUG ((DEBUG_DISPATCH, "  PUSH GUID(%g) = TRUE\n", &DriverGuid));
-          *Iterator = EFI_DEP_REPLACE_TRUE;
-          Status    = PushBool (TRUE);
+          Status = PushBool (TRUE);
         }

         if (EFI_ERROR (Status)) {
@@ -356,18 +345,6 @@ MmIsSchedulable (
         DEBUG ((DEBUG_DISPATCH, "  RESULT = %a\n", Operator ? "TRUE" : "FALSE"));
         return Operator;

-      case EFI_DEP_REPLACE_TRUE:
-        CopyMem (&DriverGuid, Iterator + 1, sizeof (EFI_GUID));
-        DEBUG ((DEBUG_DISPATCH, "  PUSH GUID(%g) = TRUE\n", &DriverGuid));
-        Status = PushBool (TRUE);
-        if (EFI_ERROR (Status)) {
-          DEBUG ((DEBUG_DISPATCH, "  RESULT = FALSE (Unexpected error)\n"));
-          return FALSE;
-        }
-
-        Iterator += sizeof (EFI_GUID);
-        break;
-
       default:
         DEBUG ((DEBUG_DISPATCH, "  RESULT = FALSE (Unknown opcode)\n"));
         goto Done;
-----------------------------

Thanks
Laszlo




-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113531): https://edk2.groups.io/g/devel/message/113531
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-10 13:09 ` Laszlo Ersek
@ 2024-01-10 13:45   ` Laszlo Ersek
  2024-01-10 21:50     ` Pedro Falcato
  2024-01-12  2:44     ` Andrew Fish via groups.io
  0 siblings, 2 replies; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-10 13:45 UTC (permalink / raw)
  To: devel, nhi, ardb+tianocore@kernel.org, Andrew Fish

(+ Andrew)

On 1/10/24 14:09, Laszlo Ersek wrote:

> I think that keeping the depex section read-only is valuable, so I'd
> rule out #2. I'd also not start with option #1 -- copying the depex to
> heap memory, just so this optimization can succeed. I'd simply start by
> removing the optimization, and measuring how much driver dispatch slows
> down in practice, on various platforms.
> 
> Can you try this? (I have only build-tested and "uncrustified" it.)
> 
> The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
> CONST-ifies the Iterator pointer (which points into the DEPEX section),
> so that the compiler catch any possible accesses at *build time* that
> would write to the write-protected DEPEX memory area.

On a tangent: the optimization in question highlights a more general
problem, namely that the DXE (and possibly MM/SMM) protocol databases
are considered slow, for some access patterns.

Edk2 implements those protocol databases with linked lists, where lookup
costs O(n) operations (average and worst cases). And protocol lookups
are quite frequent (for example, in depex evaluations, they could be
considered "particularly frequent").

(1) The "Tasks" wiki page mentions something *similar* (but not the
same); see

https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability

The description is: "The DXE core's DataHub and GCD (Global Coherency
Domain) layers don't scale well as the number of data items gets large,
since they are based on simple linked lists. Find better data structures."

The same might apply more or less to the protocol database implementation.

(2) More to the point, Andrew Fish reported earlier that at Apple, they
had rewritten the DXE protocol database, using the red-black tree
OrderedCollectionLib that I had contributed previously to edk2 -- and
they saw significant performance improvements.

So upstreaming that feature to edk2 could be very valuable. (Red-black
trees have O(log(n)) time cost (worst case) for lookup, insertion and
deletion, and O(n) cost for iterating through the whole data structure.)

Let me see if I can find the bugzilla ticket...

Ah, I got it. Apologies, I misremembered: Andrew's comment was not about
the protocol database, but about the handle database. Here it is:

https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7

(the BZ is still CONFIRMED btw...)

Still, I think it must be related in a way. Namely, an EFI handle exists
if and only if at least one protocol interface is installed on it. If
you uninstall the last protocol interface from a handle, then the handle
is destroyed -- in fact that's the *only* way to destroy a handle, to my
understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the
UEFI spec: "If the last protocol interface is removed from a handle, the
handle is freed and is no longer valid". Furthermore, calling
InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
how one *creates* new handles.

So given how handles and protocol interfaces are conceptually
interlinked, an rbtree-based protocol DB might have to maintain multiple
rbtrees internally (for the ability to search the database quickly with
different types of "keys"). I don't have a design ready in my mind at
all (I'm not that familiar with the *current*, list-based implementation
to begin with!). Upstreaming Apple's (experimental?) code could prove
very helpful.

Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113532): https://edk2.groups.io/g/devel/message/113532
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-10 13:45   ` Laszlo Ersek
@ 2024-01-10 21:50     ` Pedro Falcato
  2024-01-11  8:46       ` Laszlo Ersek
  2024-01-12  2:44     ` Andrew Fish via groups.io
  1 sibling, 1 reply; 24+ messages in thread
From: Pedro Falcato @ 2024-01-10 21:50 UTC (permalink / raw)
  To: devel, lersek; +Cc: nhi, ardb+tianocore@kernel.org, Andrew Fish

On Wed, Jan 10, 2024 at 1:45 PM Laszlo Ersek <lersek@redhat.com> wrote:
>
> (+ Andrew)
>
> On 1/10/24 14:09, Laszlo Ersek wrote:
>
> > I think that keeping the depex section read-only is valuable, so I'd
> > rule out #2. I'd also not start with option #1 -- copying the depex to
> > heap memory, just so this optimization can succeed. I'd simply start by
> > removing the optimization, and measuring how much driver dispatch slows
> > down in practice, on various platforms.
> >
> > Can you try this? (I have only build-tested and "uncrustified" it.)
> >
> > The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
> > CONST-ifies the Iterator pointer (which points into the DEPEX section),
> > so that the compiler catch any possible accesses at *build time* that
> > would write to the write-protected DEPEX memory area.
>
> On a tangent: the optimization in question highlights a more general
> problem, namely that the DXE (and possibly MM/SMM) protocol databases
> are considered slow, for some access patterns.
>
> Edk2 implements those protocol databases with linked lists, where lookup
> costs O(n) operations (average and worst cases). And protocol lookups
> are quite frequent (for example, in depex evaluations, they could be
> considered "particularly frequent").
>
> (1) The "Tasks" wiki page mentions something *similar* (but not the
> same); see
>
> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability
>
> The description is: "The DXE core's DataHub and GCD (Global Coherency
> Domain) layers don't scale well as the number of data items gets large,
> since they are based on simple linked lists. Find better data structures."

How large do they usually get? What's the worst case?

>
> The same might apply more or less to the protocol database implementation.
>
> (2) More to the point, Andrew Fish reported earlier that at Apple, they
> had rewritten the DXE protocol database, using the red-black tree
> OrderedCollectionLib that I had contributed previously to edk2 -- and
> they saw significant performance improvements.
>
> So upstreaming that feature to edk2 could be very valuable. (Red-black
> trees have O(log(n)) time cost (worst case) for lookup, insertion and
> deletion, and O(n) cost for iterating through the whole data structure.)

FWIW, can we do better than an RB tree? They're notoriously cache unfriendly...

>
> Let me see if I can find the bugzilla ticket...
>
> Ah, I got it. Apologies, I misremembered: Andrew's comment was not about
> the protocol database, but about the handle database. Here it is:
>
> https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7
>
> (the BZ is still CONFIRMED btw...)
>
> Still, I think it must be related in a way. Namely, an EFI handle exists
> if and only if at least one protocol interface is installed on it. If
> you uninstall the last protocol interface from a handle, then the handle
> is destroyed -- in fact that's the *only* way to destroy a handle, to my
> understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the
> UEFI spec: "If the last protocol interface is removed from a handle, the
> handle is freed and is no longer valid". Furthermore, calling
> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
> how one *creates* new handles.
>
> So given how handles and protocol interfaces are conceptually
> interlinked, an rbtree-based protocol DB might have to maintain multiple
> rbtrees internally (for the ability to search the database quickly with
> different types of "keys"). I don't have a design ready in my mind at
> all (I'm not that familiar with the *current*, list-based implementation
> to begin with!). Upstreaming Apple's (experimental?) code could prove
> very helpful.

Ok, some thoughts:

For the handle database, if we made EFI_HANDLE an integer instead, we
could very easily use something similar to a radix tree (see Linux's
xarray). This would provide O(log(n)) lookups and insertion with a
much higher fan-out, without needing CoreValidateHandle's awful O(n)
lookup every time it gets an EFI_HANDLE. You'd just look up the handle
in the tree and if NULL, it's invalid. [1]

For the protocol database, you'd replace the linked list with a simple
hashtable, hashed by protocol. Something as simple as LIST_ENTRY
mProtocolHashtable[64]; would probably be enough to fan out most of
the problems (I think? How many different protocols are there?)

-- 
Pedro

[1] One could be wary of doing this lookup for every EFI_HANDLE
instead of having the pointer directly. But this is much more
efficient than needing to iterate a linked list to validate a pointer.
Considering an xarray-like radix tree as previously described, two
levels with a fan-out of 64 could describe indices 0 - 4096 (64 * 64),
which is much more efficient than chasing through pointers (worst
case) 4096 times until we find the handle.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113541): https://edk2.groups.io/g/devel/message/113541
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-10 21:50     ` Pedro Falcato
@ 2024-01-11  8:46       ` Laszlo Ersek
  2024-01-11 11:03         ` Laszlo Ersek
  2024-01-12  2:10         ` Pedro Falcato
  0 siblings, 2 replies; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-11  8:46 UTC (permalink / raw)
  To: Pedro Falcato, devel; +Cc: nhi, ardb+tianocore@kernel.org, Andrew Fish

On 1/10/24 22:50, Pedro Falcato wrote:
> On Wed, Jan 10, 2024 at 1:45 PM Laszlo Ersek <lersek@redhat.com>
> wrote:
>>
>> (+ Andrew)
>>
>> On 1/10/24 14:09, Laszlo Ersek wrote:
>>
>>> I think that keeping the depex section read-only is valuable, so I'd
>>> rule out #2. I'd also not start with option #1 -- copying the depex
>>> to heap memory, just so this optimization can succeed. I'd simply
>>> start by removing the optimization, and measuring how much driver
>>> dispatch slows down in practice, on various platforms.
>>>
>>> Can you try this? (I have only build-tested and "uncrustified" it.)
>>>
>>> The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus
>>> it CONST-ifies the Iterator pointer (which points into the DEPEX
>>> section), so that the compiler catch any possible accesses at *build
>>> time* that would write to the write-protected DEPEX memory area.
>>
>> On a tangent: the optimization in question highlights a more general
>> problem, namely that the DXE (and possibly MM/SMM) protocol databases
>> are considered slow, for some access patterns.
>>
>> Edk2 implements those protocol databases with linked lists, where
>> lookup costs O(n) operations (average and worst cases). And protocol
>> lookups are quite frequent (for example, in depex evaluations, they
>> could be considered "particularly frequent").
>>
>> (1) The "Tasks" wiki page mentions something *similar* (but not the
>> same); see
>>
>> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability
>>
>> The description is: "The DXE core's DataHub and GCD (Global Coherency
>> Domain) layers don't scale well as the number of data items gets
>> large, since they are based on simple linked lists. Find better data
>> structures."
>
> How large do they usually get? What's the worst case?

No idea.

>
>>
>> The same might apply more or less to the protocol database
>> implementation.
>>
>> (2) More to the point, Andrew Fish reported earlier that at Apple,
>> they had rewritten the DXE protocol database, using the red-black
>> tree OrderedCollectionLib that I had contributed previously to edk2
>> -- and they saw significant performance improvements.
>>
>> So upstreaming that feature to edk2 could be very valuable.
>> (Red-black trees have O(log(n)) time cost (worst case) for lookup,
>> insertion and deletion, and O(n) cost for iterating through the whole
>> data structure.)
>
> FWIW, can we do better than an RB tree? They're notoriously cache
> unfriendly...

Sure, if someone contributes a different data structure that is suitable
for the task :)

RB trees may be cache unfriendly, but the functionality they provide
with O(log(n)) worst case performance is amazing. You can use them as
read-write associate arrays, sorted lists supporting forward and
backward traversal (in fact tools for sorting), priority queues, etc.

When I contributed the code, edk2 didn't have any associative array
type, so something generic that would address the widest range of use
cases looked like a good idea. (And indeed the library has been well
applied in several of those use cases since, in various parts of edk2 --
for sorting, uniqueness-checking, async interface token tracking &
lookups.)

This is not an argument against a more suitable data structure of
course. Just pointing out that the RB tree has worked well thus far.
E.g., under the BZ link below, Andrew mentions a diagnostic tool that
creates 3000 handles. Looking up an element in a 3000-element list would
cost 1500 iterations on average; using a balanced binary search tree it
might cost ~11 iterations. Assuming that linked lists and linked binary
search trees are similarly cache-unfriendly, that's a ~136 factor of
improvement.

>
>>
>> Let me see if I can find the bugzilla ticket...
>>
>> Ah, I got it. Apologies, I misremembered: Andrew's comment was not
>> about the protocol database, but about the handle database. Here it
>> is:
>>
>> https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7
>>
>> (the BZ is still CONFIRMED btw...)
>>
>> Still, I think it must be related in a way. Namely, an EFI handle
>> exists if and only if at least one protocol interface is installed on
>> it. If you uninstall the last protocol interface from a handle, then
>> the handle is destroyed -- in fact that's the *only* way to destroy a
>> handle, to my understanding. See
>> EFI_BOOT_SERVICES.UninstallProtocolInterface() in the UEFI spec: "If
>> the last protocol interface is removed from a handle, the handle is
>> freed and is no longer valid". Furthermore, calling
>> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
>> how one *creates* new handles.
>>
>> So given how handles and protocol interfaces are conceptually
>> interlinked, an rbtree-based protocol DB might have to maintain
>> multiple rbtrees internally (for the ability to search the database
>> quickly with different types of "keys"). I don't have a design ready
>> in my mind at all (I'm not that familiar with the *current*,
>> list-based implementation to begin with!). Upstreaming Apple's
>> (experimental?) code could prove very helpful.
>
> Ok, some thoughts:
>
> For the handle database, if we made EFI_HANDLE an integer instead, we
> could very easily use something similar to a radix tree (see Linux's
> xarray). This would provide O(log(n)) lookups and insertion with a
> much higher fan-out, without needing CoreValidateHandle's awful O(n)
> lookup every time it gets an EFI_HANDLE. You'd just look up the handle
> in the tree and if NULL, it's invalid. [1]

EFI_HANDLE is a typedef to (VOID*) per UEFI spec -- section 2.3.1 "Data
Types" says: "A collection of related interfaces. Type VOID *" --, so
EFI_HANDLE can't be redefined, it would break compatibility with
everything.

Internally, it could be cast to UINTN, and then those 32- or 64-bit long
bit strings could be considered as keys for a radix tree. But I'm not
sure if this would still be as useful as the two-level radix tree (with
a fanout of 64) that you describe.

(If we wanted to map the EFI_HANDLEs to the index space [0..4095], then
we'd just land back at the original problem: storing the EFI_HANDLEs in
an associative data structure, for example in a hash map. We could find
a good hash function, but because it would mathematically reduce the key
space from 64-bit pointers to 12-bit indices, collisions would be
theoretically inevitable, and the data structure couldn't avoid
*storing* the mapping. This is not a general argument against using a
hash table, it just means that we'd have to provide a good initial table
size, and then deal with resizing cost whenever necessary.)

> For the protocol database, you'd replace the linked list with a simple
> hashtable, hashed by protocol. Something as simple as LIST_ENTRY
> mProtocolHashtable[64]; would probably be enough to fan out most of
> the problems (I think? How many different protocols are there?)

I can answer this question reasonably well, I think. I have a script
that collects symbolic names of GUIDs from the edk2 tree (plus hardcodes
a number of well-known but not edk2 GUIDs), and creates a "sed" script
out of them. Then another script uses this "sed" script for filtering
edk2 logs -- the idea being to replace the whole bunch of logged GUIDs
with their symbolic names. That makes logs much easier to read.

The generator script is written such a way that the generated "sed"
script only grows over time; put differently, this "dictionary" of
name<->GUID associations never forgets, it only picks up new GUIDs. The
"sed" script (= the dictionary file) consists of entries like

s:FFB19961-79CC-4684-84A8-C31B0A2BBE82:[EarlyFdt16550SerialPortHookLib]:ig
s:FFB56F09-65E3-4462-A799-2F0D1930D38C:[DxeContextBufferModuleConfigLib]:ig
s:FFE06BDD-6107-46A6-7BB2-5A9C7EC5275C:[EfiAcpiTableProtocol]:ig
s:FFF12B8D-7696-4C8B-A985-2747075B4F50:[EfiSystemNvDataFv]:ig

it's sorted uniquely by GUID.

Right now, it has 3074 entries. (People like generating GUIDs! :) In
PI/UEFI/edk2, *everything* is a GUID, not just protocols!)

So using 64 buckets would hash ~50 different protocol GUIDs to any given
bucket (although, again, not all of those three thousand GUIDs are
*protocol* GUIDs). 1024 buckets might work better.

Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113595): https://edk2.groups.io/g/devel/message/113595
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-11  8:46       ` Laszlo Ersek
@ 2024-01-11 11:03         ` Laszlo Ersek
  2024-01-12  2:10         ` Pedro Falcato
  1 sibling, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-11 11:03 UTC (permalink / raw)
  To: Pedro Falcato, devel; +Cc: nhi, ardb+tianocore@kernel.org, Andrew Fish

On 1/11/24 09:46, Laszlo Ersek wrote:
> On 1/10/24 22:50, Pedro Falcato wrote:

>> For the protocol database, you'd replace the linked list with a simple
>> hashtable, hashed by protocol. Something as simple as LIST_ENTRY
>> mProtocolHashtable[64]; would probably be enough to fan out most of
>> the problems (I think? How many different protocols are there?)
> 
> I can answer this question reasonably well, I think. I have a script
> that collects symbolic names of GUIDs from the edk2 tree (plus hardcodes
> a number of well-known but not edk2 GUIDs), and creates a "sed" script
> out of them. Then another script uses this "sed" script for filtering
> edk2 logs -- the idea being to replace the whole bunch of logged GUIDs
> with their symbolic names. That makes logs much easier to read.
> 
> The generator script is written such a way that the generated "sed"
> script only grows over time; put differently, this "dictionary" of
> name<->GUID associations never forgets, it only picks up new GUIDs. The
> "sed" script (= the dictionary file) consists of entries like
> 
> s:FFB19961-79CC-4684-84A8-C31B0A2BBE82:[EarlyFdt16550SerialPortHookLib]:ig
> s:FFB56F09-65E3-4462-A799-2F0D1930D38C:[DxeContextBufferModuleConfigLib]:ig
> s:FFE06BDD-6107-46A6-7BB2-5A9C7EC5275C:[EfiAcpiTableProtocol]:ig
> s:FFF12B8D-7696-4C8B-A985-2747075B4F50:[EfiSystemNvDataFv]:ig
> 
> it's sorted uniquely by GUID.
> 
> Right now, it has 3074 entries. (People like generating GUIDs! :) In
> PI/UEFI/edk2, *everything* is a GUID, not just protocols!)

If I grep the dictionary for "Protocol", I get 515 hits.



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113615): https://edk2.groups.io/g/devel/message/113615
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-11  8:46       ` Laszlo Ersek
  2024-01-11 11:03         ` Laszlo Ersek
@ 2024-01-12  2:10         ` Pedro Falcato
  2024-01-12  9:35           ` Laszlo Ersek
  1 sibling, 1 reply; 24+ messages in thread
From: Pedro Falcato @ 2024-01-12  2:10 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: devel, nhi, ardb+tianocore@kernel.org, Andrew Fish

On Thu, Jan 11, 2024 at 8:46 AM Laszlo Ersek <lersek@redhat.com> wrote:
>
> On 1/10/24 22:50, Pedro Falcato wrote:
> > On Wed, Jan 10, 2024 at 1:45 PM Laszlo Ersek <lersek@redhat.com>
> > wrote:
> >>
> >> (+ Andrew)
> >>
> >> On 1/10/24 14:09, Laszlo Ersek wrote:
> >>
> >>> I think that keeping the depex section read-only is valuable, so I'd
> >>> rule out #2. I'd also not start with option #1 -- copying the depex
> >>> to heap memory, just so this optimization can succeed. I'd simply
> >>> start by removing the optimization, and measuring how much driver
> >>> dispatch slows down in practice, on various platforms.
> >>>
> >>> Can you try this? (I have only build-tested and "uncrustified" it.)
> >>>
> >>> The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus
> >>> it CONST-ifies the Iterator pointer (which points into the DEPEX
> >>> section), so that the compiler catch any possible accesses at *build
> >>> time* that would write to the write-protected DEPEX memory area.
> >>
> >> On a tangent: the optimization in question highlights a more general
> >> problem, namely that the DXE (and possibly MM/SMM) protocol databases
> >> are considered slow, for some access patterns.
> >>
> >> Edk2 implements those protocol databases with linked lists, where
> >> lookup costs O(n) operations (average and worst cases). And protocol
> >> lookups are quite frequent (for example, in depex evaluations, they
> >> could be considered "particularly frequent").
> >>
> >> (1) The "Tasks" wiki page mentions something *similar* (but not the
> >> same); see
> >>
> >> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability
> >>
> >> The description is: "The DXE core's DataHub and GCD (Global Coherency
> >> Domain) layers don't scale well as the number of data items gets
> >> large, since they are based on simple linked lists. Find better data
> >> structures."
> >
> > How large do they usually get? What's the worst case?
>
> No idea.
>
> >
> >>
> >> The same might apply more or less to the protocol database
> >> implementation.
> >>
> >> (2) More to the point, Andrew Fish reported earlier that at Apple,
> >> they had rewritten the DXE protocol database, using the red-black
> >> tree OrderedCollectionLib that I had contributed previously to edk2
> >> -- and they saw significant performance improvements.
> >>
> >> So upstreaming that feature to edk2 could be very valuable.
> >> (Red-black trees have O(log(n)) time cost (worst case) for lookup,
> >> insertion and deletion, and O(n) cost for iterating through the whole
> >> data structure.)
> >
> > FWIW, can we do better than an RB tree? They're notoriously cache
> > unfriendly...
>
> Sure, if someone contributes a different data structure that is suitable
> for the task :)

Hey, I happen to have written one!
https://github.com/heatd/Onyx/blob/master/kernel/kernel/radix.cpp
It just needs some C'ifying, then some EFI'ing on top, but I'm fairly
confident in its stability.

>
> RB trees may be cache unfriendly, but the functionality they provide
> with O(log(n)) worst case performance is amazing. You can use them as
> read-write associate arrays, sorted lists supporting forward and
> backward traversal (in fact tools for sorting), priority queues, etc.
>
> When I contributed the code, edk2 didn't have any associative array
> type, so something generic that would address the widest range of use
> cases looked like a good idea. (And indeed the library has been well
> applied in several of those use cases since, in various parts of edk2 --
> for sorting, uniqueness-checking, async interface token tracking &
> lookups.)

Definitely, I use it in Ext4Dxe too, it's great! (Although I do have a
small nit in that it allocates nodes itself, and there's no way around
it, but oh well...)

>
> This is not an argument against a more suitable data structure of
> course. Just pointing out that the RB tree has worked well thus far.
> E.g., under the BZ link below, Andrew mentions a diagnostic tool that
> creates 3000 handles. Looking up an element in a 3000-element list would
> cost 1500 iterations on average; using a balanced binary search tree it
> might cost ~11 iterations. Assuming that linked lists and linked binary
> search trees are similarly cache-unfriendly, that's a ~136 factor of
> improvement.

I would guess that binary trees are more cache-and-speculation
unfriendly as you need to decide which branch you're going down. But
yes, even a RB tree would be a great improvement over the linked list.

>
> >
> >>
> >> Let me see if I can find the bugzilla ticket...
> >>
> >> Ah, I got it. Apologies, I misremembered: Andrew's comment was not
> >> about the protocol database, but about the handle database. Here it
> >> is:
> >>
> >> https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7
> >>
> >> (the BZ is still CONFIRMED btw...)
> >>
> >> Still, I think it must be related in a way. Namely, an EFI handle
> >> exists if and only if at least one protocol interface is installed on
> >> it. If you uninstall the last protocol interface from a handle, then
> >> the handle is destroyed -- in fact that's the *only* way to destroy a
> >> handle, to my understanding. See
> >> EFI_BOOT_SERVICES.UninstallProtocolInterface() in the UEFI spec: "If
> >> the last protocol interface is removed from a handle, the handle is
> >> freed and is no longer valid". Furthermore, calling
> >> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
> >> how one *creates* new handles.
> >>
> >> So given how handles and protocol interfaces are conceptually
> >> interlinked, an rbtree-based protocol DB might have to maintain
> >> multiple rbtrees internally (for the ability to search the database
> >> quickly with different types of "keys"). I don't have a design ready
> >> in my mind at all (I'm not that familiar with the *current*,
> >> list-based implementation to begin with!). Upstreaming Apple's
> >> (experimental?) code could prove very helpful.
> >
> > Ok, some thoughts:
> >
> > For the handle database, if we made EFI_HANDLE an integer instead, we
> > could very easily use something similar to a radix tree (see Linux's
> > xarray). This would provide O(log(n)) lookups and insertion with a
> > much higher fan-out, without needing CoreValidateHandle's awful O(n)
> > lookup every time it gets an EFI_HANDLE. You'd just look up the handle
> > in the tree and if NULL, it's invalid. [1]
>
> EFI_HANDLE is a typedef to (VOID*) per UEFI spec -- section 2.3.1 "Data
> Types" says: "A collection of related interfaces. Type VOID *" --, so
> EFI_HANDLE can't be redefined, it would break compatibility with
> everything.
>
> Internally, it could be cast to UINTN, and then those 32- or 64-bit long
> bit strings could be considered as keys for a radix tree. But I'm not
> sure if this would still be as useful as the two-level radix tree (with
> a fanout of 64) that you describe.
>
> (If we wanted to map the EFI_HANDLEs to the index space [0..4095], then
> we'd just land back at the original problem: storing the EFI_HANDLEs in
> an associative data structure, for example in a hash map. We could find
> a good hash function, but because it would mathematically reduce the key
> space from 64-bit pointers to 12-bit indices, collisions would be
> theoretically inevitable, and the data structure couldn't avoid
> *storing* the mapping. This is not a general argument against using a
> hash table, it just means that we'd have to provide a good initial table
> size, and then deal with resizing cost whenever necessary.)

My idea was to make each handle an index - like a file descriptor -
AFAIK there's no reason why it *needs* to be an actual pointer.
We'd allocate indices when creating a handle, and return that (casted to VOID*).

Say, for a naive allocation scheme and the data structure I linked above:

/* 0 (NULL) must be reserved, as it's a bad EFI_HANDLE result. We
could also return HandleId + 1, and - 1 when indexing back to the
radix tree. Either option works */
STATIC UINTN mNextHandleId = 1;
STATIC radix_tree mHandleTree;

STATIC
IHANDLE *
CreateHandle (
  OUT UINTN *OutHandleId
  )
{
  IHANDLE *Handle = AllocatePool(sizeof(IHANDLE));
  *OutHandleId = mNextHandleId++;
  mHandleTree.insert(*OutHandleId, Handle);
  return Handle;
}

/* And then transform CoreValidateHandle into */
STATIC
EFI_STATUS
CoreGetHandle (
  IN EFI_HANDLE Handle,
  OUT IHANDLE **OutHandle
  )
{
  IHANDLE *Result;
  if (Handle == NULL)
    return EFI_INVALID_PARAMETER;
  expected<unsigned long, int> result = mHandleTree.get((UINTN) Handle);
  if (result.has_error())
    return EFI_INVALID_PARAMETER; /* Handle not in the tree */
  *OutHandle = (IHANDLE *) result.value();
  return EFI_SUCCESS;
}

--

Hopefully the snippet gets my point across. Insertion, searching and
iteration should all be super fast (O(logFanOut(N))).
Note that one could also just use a standard resizable vector (a-la
std::vector), but I don't know if that maps well to EFI handles and
memory allocation.

>
> > For the protocol database, you'd replace the linked list with a simple
> > hashtable, hashed by protocol. Something as simple as LIST_ENTRY
> > mProtocolHashtable[64]; would probably be enough to fan out most of
> > the problems (I think? How many different protocols are there?)
>
> I can answer this question reasonably well, I think. I have a script
> that collects symbolic names of GUIDs from the edk2 tree (plus hardcodes
> a number of well-known but not edk2 GUIDs), and creates a "sed" script
> out of them. Then another script uses this "sed" script for filtering
> edk2 logs -- the idea being to replace the whole bunch of logged GUIDs
> with their symbolic names. That makes logs much easier to read.
>
> The generator script is written such a way that the generated "sed"
> script only grows over time; put differently, this "dictionary" of
> name<->GUID associations never forgets, it only picks up new GUIDs. The
> "sed" script (= the dictionary file) consists of entries like
>
> s:FFB19961-79CC-4684-84A8-C31B0A2BBE82:[EarlyFdt16550SerialPortHookLib]:ig
> s:FFB56F09-65E3-4462-A799-2F0D1930D38C:[DxeContextBufferModuleConfigLib]:ig
> s:FFE06BDD-6107-46A6-7BB2-5A9C7EC5275C:[EfiAcpiTableProtocol]:ig
> s:FFF12B8D-7696-4C8B-A985-2747075B4F50:[EfiSystemNvDataFv]:ig
>
> it's sorted uniquely by GUID.
>
> Right now, it has 3074 entries. (People like generating GUIDs! :) In
> PI/UEFI/edk2, *everything* is a GUID, not just protocols!)
>
> So using 64 buckets would hash ~50 different protocol GUIDs to any given
> bucket (although, again, not all of those three thousand GUIDs are
> *protocol* GUIDs). 1024 buckets might work better.

Oh! That's nice! Taking your protocol figure of 515 into account, 512
or even 1024 buckets should probably take care of that.

I should note that I find it super hard to get a concrete idea on
performance for EFI firmware without adequate tooling - I meant to
write a standalone flamegraph tool a few weeks back (even posted in
edk2-devel), but, as far as I could tell, the architectural timer
protocol couldn't get me the interrupt frame[1]. Until then, whether
any of this radix tree vs RB tree vs flat array stuff really
matters... I find it hard to say.

[1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
monitoring interrupts, and I couldn't freely use any of them :^)

-- 
Pedro


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113638): https://edk2.groups.io/g/devel/message/113638
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-10 13:45   ` Laszlo Ersek
  2024-01-10 21:50     ` Pedro Falcato
@ 2024-01-12  2:44     ` Andrew Fish via groups.io
  2024-01-12  9:45       ` Laszlo Ersek
  1 sibling, 1 reply; 24+ messages in thread
From: Andrew Fish via groups.io @ 2024-01-12  2:44 UTC (permalink / raw)
  To: edk2-devel-groups-io, Laszlo Ersek; +Cc: nhi, ardb+tianocore@kernel.org

[-- Attachment #1: Type: text/plain, Size: 4964 bytes --]

Sorry need some more time to digest this…. First thoughts.

1) The actual performance issue we hit was the explosion of CoreValidateHandle() calls as the number of protocols got large for some diags. The newer handles tended to be at the end of the list if I remember correctly. 
  a) It looks like CoreValidateHandle() is the only place  gHandleList was walked, as the handle info is crossed referenced in the protocol database. So that is why we changed that.
2) If the issue at hand is MM why not just drop the optimizations?
  b) If we have so many MM protocols and handles that seems like its own problem? 
3) The issue is patching the grammar in place, why can’t we just make a copy for the dispatcher grammer, and operate on the copy. Maybe via a copy on 1st update strategy? 

Thanks,

Andrew Fish

> On Jan 10, 2024, at 5:45 AM, Laszlo Ersek <lersek@redhat.com> wrote:
> 
> (+ Andrew)
> 
> On 1/10/24 14:09, Laszlo Ersek wrote:
> 
>> I think that keeping the depex section read-only is valuable, so I'd
>> rule out #2. I'd also not start with option #1 -- copying the depex to
>> heap memory, just so this optimization can succeed. I'd simply start by
>> removing the optimization, and measuring how much driver dispatch slows
>> down in practice, on various platforms.
>> 
>> Can you try this? (I have only build-tested and "uncrustified" it.)
>> 
>> The patch removes the EFI_DEP_REPLACE_TRUE handling altogether, plus it
>> CONST-ifies the Iterator pointer (which points into the DEPEX section),
>> so that the compiler catch any possible accesses at *build time* that
>> would write to the write-protected DEPEX memory area.
> 
> On a tangent: the optimization in question highlights a more general
> problem, namely that the DXE (and possibly MM/SMM) protocol databases
> are considered slow, for some access patterns.
> 
> Edk2 implements those protocol databases with linked lists, where lookup
> costs O(n) operations (average and worst cases). And protocol lookups
> are quite frequent (for example, in depex evaluations, they could be
> considered "particularly frequent").
> 
> (1) The "Tasks" wiki page mentions something *similar* (but not the
> same); see
> 
> https://github.com/tianocore/tianocore.github.io/wiki/Tasks#datahub--gcd-scalability
> 
> The description is: "The DXE core's DataHub and GCD (Global Coherency
> Domain) layers don't scale well as the number of data items gets large,
> since they are based on simple linked lists. Find better data structures."
> 
> The same might apply more or less to the protocol database implementation.
> 
> (2) More to the point, Andrew Fish reported earlier that at Apple, they
> had rewritten the DXE protocol database, using the red-black tree
> OrderedCollectionLib that I had contributed previously to edk2 -- and
> they saw significant performance improvements.
> 
> So upstreaming that feature to edk2 could be very valuable. (Red-black
> trees have O(log(n)) time cost (worst case) for lookup, insertion and
> deletion, and O(n) cost for iterating through the whole data structure.)
> 
> Let me see if I can find the bugzilla ticket...
> 
> Ah, I got it. Apologies, I misremembered: Andrew's comment was not about
> the protocol database, but about the handle database. Here it is:
> 
> https://bugzilla.tianocore.org/show_bug.cgi?id=988#c7
> 
> (the BZ is still CONFIRMED btw...)
> 
> Still, I think it must be related in a way. Namely, an EFI handle exists
> if and only if at least one protocol interface is installed on it. If
> you uninstall the last protocol interface from a handle, then the handle
> is destroyed -- in fact that's the *only* way to destroy a handle, to my
> understanding. See EFI_BOOT_SERVICES.UninstallProtocolInterface() in the
> UEFI spec: "If the last protocol interface is removed from a handle, the
> handle is freed and is no longer valid". Furthermore, calling
> InstallProtocolInterface() and InstallMultipleProtocolInterfaces() is
> how one *creates* new handles.
> 
> So given how handles and protocol interfaces are conceptually
> interlinked, an rbtree-based protocol DB might have to maintain multiple
> rbtrees internally (for the ability to search the database quickly with
> different types of "keys"). I don't have a design ready in my mind at
> all (I'm not that familiar with the *current*, list-based implementation
> to begin with!). Upstreaming Apple's (experimental?) code could prove
> very helpful.
> 
> Laszlo
> 
> 
> 
> 
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113644): https://edk2.groups.io/g/devel/message/113644
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #2: Type: text/html, Size: 6116 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12  2:10         ` Pedro Falcato
@ 2024-01-12  9:35           ` Laszlo Ersek
  2024-01-12 14:46             ` Pedro Falcato
  0 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-12  9:35 UTC (permalink / raw)
  To: Pedro Falcato; +Cc: devel, nhi, ardb+tianocore@kernel.org, Andrew Fish

On 1/12/24 03:10, Pedro Falcato wrote:
> On Thu, Jan 11, 2024 at 8:46 AM Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> On 1/10/24 22:50, Pedro Falcato wrote:

>>> FWIW, can we do better than an RB tree? They're notoriously cache
>>> unfriendly...
>>
>> Sure, if someone contributes a different data structure that is suitable
>> for the task :)
> 
> Hey, I happen to have written one!
> https://github.com/heatd/Onyx/blob/master/kernel/kernel/radix.cpp
> It just needs some C'ifying, then some EFI'ing on top, but I'm fairly
> confident in its stability.

Yes, this is absolutely viable and welcome. You seem to hold the
copyright on it, too, so you could even relicense (although MIT should
just be fine for edk2).

(I had done the same with the rbtree code -- I had written that code
much earlier,  not for edk2. I re-verified it and ported it to edk2, and
relicensed it.)

>> When I contributed the code, edk2 didn't have any associative array
>> type, so something generic that would address the widest range of use
>> cases looked like a good idea. (And indeed the library has been well
>> applied in several of those use cases since, in various parts of edk2 --
>> for sorting, uniqueness-checking, async interface token tracking &
>> lookups.)
> 
> Definitely, I use it in Ext4Dxe too, it's great!

Heh, I didn't know that. Thanks!

> (Although I do have a
> small nit in that it allocates nodes itself, and there's no way around
> it, but oh well...)

Right, that's the usual intrusive vs. non-intrusive containers debate :)

I didn't mind more allocations (and hence perhaps more fragmentation),
but I wanted to (a) hide the node internals, (b) the possibility to link
any given piece of user structure into multiple trees (and other
containers) without having to modify the user structure itself (i.e.
without embedding further node / link structs into the user struct).

I figure each of intrusive and non-intrusive has its advantages over the
other; I just went with my preferences.

> My idea was to make each handle an index - like a file descriptor -
> AFAIK there's no reason why it *needs* to be an actual pointer.
> We'd allocate indices when creating a handle, and return that (casted to VOID*).

Huh, sneaky.

I see two potential problems with this.

First, per C std, these "pointers" would be invalid (they wouldn't
actually point to valid memory), so even just evaluating them (not
dereferencing them!) would invoke undefined behavior. May or may not
matter in practice. With compilers getting smarter about optimization
(and stricter about std conformance), there could be issues, perhaps.

The other concern is a bit contrived, but I *guess* there could be code
out there that actually dereferences EFI_HANDLEs. That code would crash.
May or may not matter, because such code is arguably already
semantically invalid. (It would not necessarily be invalid at the
language level -- cf. my previous paragraph --, because passing an
otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
the underlying opaque data structure, would not violate the language.)

> I should note that I find it super hard to get a concrete idea on
> performance for EFI firmware without adequate tooling - I meant to
> write a standalone flamegraph tool a few weeks back (even posted in
> edk2-devel), but, as far as I could tell, the architectural timer
> protocol couldn't get me the interrupt frame[1]. Until then, whether
> any of this radix tree vs RB tree vs flat array stuff really
> matters... I find it hard to say.
> 
> [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
> monitoring interrupts, and I couldn't freely use any of them :^)

Edk2 has some form of profiling already (see
"MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
is something like a "display performance" (dp) shell application too,
that can show the collected stats. But I've never used these facilities.

The wiki seems to have two related articles:

https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-Infrastructure

https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg

The former looks quite comprehensive, at a quick skim.

Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113703): https://edk2.groups.io/g/devel/message/113703
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12  2:44     ` Andrew Fish via groups.io
@ 2024-01-12  9:45       ` Laszlo Ersek
  2024-01-15 13:07         ` Nhi Pham via groups.io
  0 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-12  9:45 UTC (permalink / raw)
  To: Andrew (EFI) Fish, edk2-devel-groups-io; +Cc: nhi, ardb+tianocore@kernel.org

On 1/12/24 03:44, Andrew (EFI) Fish wrote:
> Sorry need some more time to digest this…. First thoughts.
> 
> 1) The actual performance issue we hit was the explosion
> of CoreValidateHandle() calls as the number of protocols got large for
> some diags. The newer handles tended to be at the end of the list if I
> remember correctly. 
>   a) It looks like CoreValidateHandle() is the only
> place  *gHandleList* was walked, as the handle info is crossed
> referenced in the protocol database. So that is why we changed that.
> 2) If the issue at hand is MM why not just drop the optimizations?

Yes, that's the first (and easy) idea. But I guess it needs some
measurements (no perf regression). It would be nice to know whether
Intel (?) measured any serious perf gains when they originally
implemented (in the 2000s?) the optimization.

>   b) If we have so many MM protocols and handles that seems like its own
> problem? 

I would agree; however, IIRC, the depex evaluator for the MM drivers
also searches the UEFI (DXE) protocol database, not just the MM protocol
database.

(Independently: I think that's a valid thing to do for *SMM* drivers,
because the entry point functions of those drivers are permitted to use
both SMM and DXE/UEFI protocols. But whether the same is valid for the
*standalone* MM drivers -- that looks questionable. Standalone MM
drivers should not depend on UEFI/DXE protocols ever, IIUC.)

> 3) The issue is patching the grammar in place, why can’t we just make a
> copy for the dispatcher grammer, and operate on the copy. Maybe via a
> copy on 1st update strategy? 

Yes, copying the depex to the heap, and patching it there, was Nhi's #1
fix proposal. I think that could be made work. But I'm not sure if the
perf savings are worth the additional complexity. The heap allocation
(where the writeable depex would exist) would have to be permanently
associated with the loaded PE image -- because the dispatcher might need
to reevaluate the depex across multiple rounds of dispatching. So that's
a new field in some image-related structure, it also needs to be freed
upon unload (?), what if the memory allocation fails during depex eval
(just consider the depex to eval to FALSE?), etc. Doable, but hairy; not
sure if the perf is worth that effort.

Thanks!
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113704): https://edk2.groups.io/g/devel/message/113704
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12  9:35           ` Laszlo Ersek
@ 2024-01-12 14:46             ` Pedro Falcato
  2024-01-12 16:37               ` Michael D Kinney
  2024-01-12 18:50               ` Andrew Fish via groups.io
  0 siblings, 2 replies; 24+ messages in thread
From: Pedro Falcato @ 2024-01-12 14:46 UTC (permalink / raw)
  To: Laszlo Ersek; +Cc: devel, nhi, ardb+tianocore@kernel.org, Andrew Fish

On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com> wrote:
>
> On 1/12/24 03:10, Pedro Falcato wrote:
> > My idea was to make each handle an index - like a file descriptor -
> > AFAIK there's no reason why it *needs* to be an actual pointer.
> > We'd allocate indices when creating a handle, and return that (casted to VOID*).
>
> Huh, sneaky.
>
> I see two potential problems with this.
>
> First, per C std, these "pointers" would be invalid (they wouldn't
> actually point to valid memory), so even just evaluating them (not
> dereferencing them!) would invoke undefined behavior. May or may not
> matter in practice. With compilers getting smarter about optimization
> (and stricter about std conformance), there could be issues, perhaps.

This is true. Stashing random integers in pointers is
implementation-defined. But it's also super common. Win32 HANDLEs are
exactly this, just integers (stashed in VOID*). The Linux kernel world
also has a bunch of fun tricks with stashing flags in a pointer's
bottom bits, magic pointer values, etc. I severely doubt we can run
into issues with this. EDK2 will not exactly run on the C standard's
abstract machine anyway ;)

>
> The other concern is a bit contrived, but I *guess* there could be code
> out there that actually dereferences EFI_HANDLEs. That code would crash.
> May or may not matter, because such code is arguably already
> semantically invalid. (It would not necessarily be invalid at the
> language level -- cf. my previous paragraph --, because passing an
> otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
> the underlying opaque data structure, would not violate the language.)

This is a feature, not a bug! :P

Seriously though, IHANDLE is not even exposed in semi-public headers,
so any code that's derefing an EFI_HANDLE will need to do something
like

typedef struct {
  /* ... */
} IHANDLE;

EFI_HANDLE Handle = /* ... */;

IHANDLE *HandleImpl = (IHANDLE *) Handle;

and I'm a strong believer in "play stupid games, win stupid prizes".
You can definitely make an argument for "this should definitely crash"
instead of just "maybe crashing" (for instance, platforms that still
map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
think that if we indeed go this route, we should set one or two upper
bits (on 64-bit platforms!) to make handles non-canonical addresses
and therefore necessarily crash on dereference.

>
> > I should note that I find it super hard to get a concrete idea on
> > performance for EFI firmware without adequate tooling - I meant to
> > write a standalone flamegraph tool a few weeks back (even posted in
> > edk2-devel), but, as far as I could tell, the architectural timer
> > protocol couldn't get me the interrupt frame[1]. Until then, whether
> > any of this radix tree vs RB tree vs flat array stuff really
> > matters... I find it hard to say.
> >
> > [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
> > monitoring interrupts, and I couldn't freely use any of them :^)
>
> Edk2 has some form of profiling already (see
> "MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
> peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
> is something like a "display performance" (dp) shell application too,
> that can show the collected stats. But I've never used these facilities.
>
> The wiki seems to have two related articles:
>
> https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-Infrastructure
>
> https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg
>
> The former looks quite comprehensive, at a quick skim.

/me nods
I've seen those macros around, but I've never used them.
In any case, this problem has piqued my interest, I'll see if I can
find some free time this weekend to hack on a test benchmark and a PoC
:)

-- 
Pedro


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113741): https://edk2.groups.io/g/devel/message/113741
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12 14:46             ` Pedro Falcato
@ 2024-01-12 16:37               ` Michael D Kinney
  2024-01-12 18:56                 ` Andrew Fish via groups.io
  2024-01-12 18:50               ` Andrew Fish via groups.io
  1 sibling, 1 reply; 24+ messages in thread
From: Michael D Kinney @ 2024-01-12 16:37 UTC (permalink / raw)
  To: devel@edk2.groups.io, pedro.falcato@gmail.com, Laszlo Ersek
  Cc: nhi@os.amperecomputing.com, ardb+tianocore@kernel.org,
	Andrew Fish, Kinney, Michael D

Hi Pedro,

Thank you for evaluating this idea change from linked list to improve
performance of the handle database.

The concept of using integers for an EFI_HANDLE has been considered before.
One advantage over pointers is that a guarantee can be made that an EFI_HANDLE
value can be guaranteed to be unique.  In the current implementation with
EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value could
potentially be reused if an EFI_HANDLE is freed and later allocated for a 
different EFI_HANDLE.

If you continue to explore this path I agree that EFI_HANDLE value of 0 should
be reserved and never used.  I also recommend that new EFI_HANDLE values are
always assigned a new unique value that freed EFI_HANDLE values are never reused.

The overall linked list performance of the handle database has also been noted
before, but has rarely raised to the level where it impacts the overall boot
performance relative to other optimization opportunities.  I look forward to
the performance data.  The PERF_X() macros are the right service to use.  On
x86 it is based on the time stamp counter (TSC) which has very small granularity.

Mike


> -----Original Message-----
> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Pedro Falcato
> Sent: Friday, January 12, 2024 6:47 AM
> To: Laszlo Ersek <lersek@redhat.com>
> Cc: devel@edk2.groups.io; nhi@os.amperecomputing.com;
> ardb+tianocore@kernel.org; Andrew Fish <afish@apple.com>
> Subject: Re: [edk2-devel] Memory Attribute for depex section
> 
> On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com> wrote:
> >
> > On 1/12/24 03:10, Pedro Falcato wrote:
> > > My idea was to make each handle an index - like a file descriptor -
> > > AFAIK there's no reason why it *needs* to be an actual pointer.
> > > We'd allocate indices when creating a handle, and return that (casted to
> VOID*).
> >
> > Huh, sneaky.
> >
> > I see two potential problems with this.
> >
> > First, per C std, these "pointers" would be invalid (they wouldn't
> > actually point to valid memory), so even just evaluating them (not
> > dereferencing them!) would invoke undefined behavior. May or may not
> > matter in practice. With compilers getting smarter about optimization
> > (and stricter about std conformance), there could be issues, perhaps.
> 
> This is true. Stashing random integers in pointers is
> implementation-defined. But it's also super common. Win32 HANDLEs are
> exactly this, just integers (stashed in VOID*). The Linux kernel world
> also has a bunch of fun tricks with stashing flags in a pointer's
> bottom bits, magic pointer values, etc. I severely doubt we can run
> into issues with this. EDK2 will not exactly run on the C standard's
> abstract machine anyway ;)
> 
> >
> > The other concern is a bit contrived, but I *guess* there could be code
> > out there that actually dereferences EFI_HANDLEs. That code would crash.
> > May or may not matter, because such code is arguably already
> > semantically invalid. (It would not necessarily be invalid at the
> > language level -- cf. my previous paragraph --, because passing an
> > otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
> > the underlying opaque data structure, would not violate the language.)
> 
> This is a feature, not a bug! :P
> 
> Seriously though, IHANDLE is not even exposed in semi-public headers,
> so any code that's derefing an EFI_HANDLE will need to do something
> like
> 
> typedef struct {
>   /* ... */
> } IHANDLE;
> 
> EFI_HANDLE Handle = /* ... */;
> 
> IHANDLE *HandleImpl = (IHANDLE *) Handle;
> 
> and I'm a strong believer in "play stupid games, win stupid prizes".
> You can definitely make an argument for "this should definitely crash"
> instead of just "maybe crashing" (for instance, platforms that still
> map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
> think that if we indeed go this route, we should set one or two upper
> bits (on 64-bit platforms!) to make handles non-canonical addresses
> and therefore necessarily crash on dereference.
> 
> >
> > > I should note that I find it super hard to get a concrete idea on
> > > performance for EFI firmware without adequate tooling - I meant to
> > > write a standalone flamegraph tool a few weeks back (even posted in
> > > edk2-devel), but, as far as I could tell, the architectural timer
> > > protocol couldn't get me the interrupt frame[1]. Until then, whether
> > > any of this radix tree vs RB tree vs flat array stuff really
> > > matters... I find it hard to say.
> > >
> > > [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
> > > monitoring interrupts, and I couldn't freely use any of them :^)
> >
> > Edk2 has some form of profiling already (see
> > "MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
> > peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
> > is something like a "display performance" (dp) shell application too,
> > that can show the collected stats. But I've never used these facilities.
> >
> > The wiki seems to have two related articles:
> >
> > https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-
> Infrastructure
> >
> > https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg
> >
> > The former looks quite comprehensive, at a quick skim.
> 
> /me nods
> I've seen those macros around, but I've never used them.
> In any case, this problem has piqued my interest, I'll see if I can
> find some free time this weekend to hack on a test benchmark and a PoC
> :)
> 
> --
> Pedro
> 
> 
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113751): https://edk2.groups.io/g/devel/message/113751
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12 14:46             ` Pedro Falcato
  2024-01-12 16:37               ` Michael D Kinney
@ 2024-01-12 18:50               ` Andrew Fish via groups.io
  1 sibling, 0 replies; 24+ messages in thread
From: Andrew Fish via groups.io @ 2024-01-12 18:50 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Laszlo Ersek, edk2-devel-groups-io, nhi,
	ardb+tianocore@kernel.org



> On Jan 12, 2024, at 6:46 AM, Pedro Falcato <pedro.falcato@gmail.com> wrote:
> 
> On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com> wrote:
>> 
>> On 1/12/24 03:10, Pedro Falcato wrote:
>>> My idea was to make each handle an index - like a file descriptor -
>>> AFAIK there's no reason why it *needs* to be an actual pointer.
>>> We'd allocate indices when creating a handle, and return that (casted to VOID*).
>> 
>> Huh, sneaky.
>> 
>> I see two potential problems with this.
>> 
>> First, per C std, these "pointers" would be invalid (they wouldn't
>> actually point to valid memory), so even just evaluating them (not
>> dereferencing them!) would invoke undefined behavior. May or may not
>> matter in practice. With compilers getting smarter about optimization
>> (and stricter about std conformance), there could be issues, perhaps.
> 
> This is true. Stashing random integers in pointers is
> implementation-defined. But it's also super common. Win32 HANDLEs are
> exactly this, just integers (stashed in VOID*). The Linux kernel world
> also has a bunch of fun tricks with stashing flags in a pointer's
> bottom bits, magic pointer values, etc. I severely doubt we can run
> into issues with this. EDK2 will not exactly run on the C standard's
> abstract machine anyway ;)
> 

I’d point out that CPUs we support do magic things with bits of pointers. ARMv8.3 defines pointer authentication and on a Mac arm64e is an ABI with pointer authentication enabled. 

Given call return rules are different and you have to initialize signed pointers it is a different ABI, but never say never that a signed pointer ABI could get added to UEFI.

Thanks,

Andrew Fish

>> 
>> The other concern is a bit contrived, but I *guess* there could be code
>> out there that actually dereferences EFI_HANDLEs. That code would crash.
>> May or may not matter, because such code is arguably already
>> semantically invalid. (It would not necessarily be invalid at the
>> language level -- cf. my previous paragraph --, because passing an
>> otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
>> the underlying opaque data structure, would not violate the language.)
> 
> This is a feature, not a bug! :P
> 
> Seriously though, IHANDLE is not even exposed in semi-public headers,
> so any code that's derefing an EFI_HANDLE will need to do something
> like
> 
> typedef struct {
>  /* ... */
> } IHANDLE;
> 
> EFI_HANDLE Handle = /* ... */;
> 
> IHANDLE *HandleImpl = (IHANDLE *) Handle;
> 
> and I'm a strong believer in "play stupid games, win stupid prizes".
> You can definitely make an argument for "this should definitely crash"
> instead of just "maybe crashing" (for instance, platforms that still
> map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
> think that if we indeed go this route, we should set one or two upper
> bits (on 64-bit platforms!) to make handles non-canonical addresses
> and therefore necessarily crash on dereference.
> 
>> 
>>> I should note that I find it super hard to get a concrete idea on
>>> performance for EFI firmware without adequate tooling - I meant to
>>> write a standalone flamegraph tool a few weeks back (even posted in
>>> edk2-devel), but, as far as I could tell, the architectural timer
>>> protocol couldn't get me the interrupt frame[1]. Until then, whether
>>> any of this radix tree vs RB tree vs flat array stuff really
>>> matters... I find it hard to say.
>>> 
>>> [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
>>> monitoring interrupts, and I couldn't freely use any of them :^)
>> 
>> Edk2 has some form of profiling already (see
>> "MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
>> peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
>> is something like a "display performance" (dp) shell application too,
>> that can show the collected stats. But I've never used these facilities.
>> 
>> The wiki seems to have two related articles:
>> 
>> https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-Infrastructure
>> 
>> https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg
>> 
>> The former looks quite comprehensive, at a quick skim.
> 
> /me nods
> I've seen those macros around, but I've never used them.
> In any case, this problem has piqued my interest, I'll see if I can
> find some free time this weekend to hack on a test benchmark and a PoC
> :)
> 
> -- 
> Pedro



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113753): https://edk2.groups.io/g/devel/message/113753
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12 16:37               ` Michael D Kinney
@ 2024-01-12 18:56                 ` Andrew Fish via groups.io
  2024-01-12 19:04                   ` Michael D Kinney
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Fish via groups.io @ 2024-01-12 18:56 UTC (permalink / raw)
  To: edk2-devel-groups-io, Mike Kinney
  Cc: pedro.falcato@gmail.com, Laszlo Ersek, nhi@os.amperecomputing.com,
	ardb+tianocore@kernel.org

[-- Attachment #1: Type: text/plain, Size: 6970 bytes --]



> On Jan 12, 2024, at 8:37 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
> 
> Hi Pedro,
> 
> Thank you for evaluating this idea change from linked list to improve
> performance of the handle database.
> 
> The concept of using integers for an EFI_HANDLE has been considered before.
> One advantage over pointers is that a guarantee can be made that an EFI_HANDLE
> value can be guaranteed to be unique.  In the current implementation with
> EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value could
> potentially be reused if an EFI_HANDLE is freed and later allocated for a 
> different EFI_HANDLE.
> 
> If you continue to explore this path I agree that EFI_HANDLE value of 0 should
> be reserved and never used.  I also recommend that new EFI_HANDLE values are
> always assigned a new unique value that freed EFI_HANDLE values are never reused.
> 
> The overall linked list performance of the handle database has also been noted
> before, but has rarely raised to the level where it impacts the overall boot
> performance relative to other optimization opportunities.  I look forward to
> the performance data.  The PERF_X() macros are the right service to use.  On
> x86 it is based on the time stamp counter (TSC) which has very small granularity.
> 

Mike,

We tracked this a while back with the PERF macros when we had some performance issues running diagnostics on a system with 3,000+ handles. The hot path was CoreValidateHandle(). I think the number of calls to CoreValidateHandle() explodes if you have more handles so it is not just the raw performance of CoreValidateHandle(), but also how many times it gets called. 

Thanks,

Andrew Fish

> Mike
> 
> 
>> -----Original Message-----
>> From: devel@edk2.groups.io <mailto:devel@edk2.groups.io> <devel@edk2.groups.io <mailto:devel@edk2.groups.io>> On Behalf Of Pedro Falcato
>> Sent: Friday, January 12, 2024 6:47 AM
>> To: Laszlo Ersek <lersek@redhat.com <mailto:lersek@redhat.com>>
>> Cc: devel@edk2.groups.io <mailto:devel@edk2.groups.io>; nhi@os.amperecomputing.com <mailto:nhi@os.amperecomputing.com>;
>> ardb+tianocore@kernel.org <mailto:ardb+tianocore@kernel.org>; Andrew Fish <afish@apple.com <mailto:afish@apple.com>>
>> Subject: Re: [edk2-devel] Memory Attribute for depex section
>> 
>> On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com> wrote:
>>> 
>>> On 1/12/24 03:10, Pedro Falcato wrote:
>>>> My idea was to make each handle an index - like a file descriptor -
>>>> AFAIK there's no reason why it *needs* to be an actual pointer.
>>>> We'd allocate indices when creating a handle, and return that (casted to
>> VOID*).
>>> 
>>> Huh, sneaky.
>>> 
>>> I see two potential problems with this.
>>> 
>>> First, per C std, these "pointers" would be invalid (they wouldn't
>>> actually point to valid memory), so even just evaluating them (not
>>> dereferencing them!) would invoke undefined behavior. May or may not
>>> matter in practice. With compilers getting smarter about optimization
>>> (and stricter about std conformance), there could be issues, perhaps.
>> 
>> This is true. Stashing random integers in pointers is
>> implementation-defined. But it's also super common. Win32 HANDLEs are
>> exactly this, just integers (stashed in VOID*). The Linux kernel world
>> also has a bunch of fun tricks with stashing flags in a pointer's
>> bottom bits, magic pointer values, etc. I severely doubt we can run
>> into issues with this. EDK2 will not exactly run on the C standard's
>> abstract machine anyway ;)
>> 
>>> 
>>> The other concern is a bit contrived, but I *guess* there could be code
>>> out there that actually dereferences EFI_HANDLEs. That code would crash.
>>> May or may not matter, because such code is arguably already
>>> semantically invalid. (It would not necessarily be invalid at the
>>> language level -- cf. my previous paragraph --, because passing an
>>> otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
>>> the underlying opaque data structure, would not violate the language.)
>> 
>> This is a feature, not a bug! :P
>> 
>> Seriously though, IHANDLE is not even exposed in semi-public headers,
>> so any code that's derefing an EFI_HANDLE will need to do something
>> like
>> 
>> typedef struct {
>>  /* ... */
>> } IHANDLE;
>> 
>> EFI_HANDLE Handle = /* ... */;
>> 
>> IHANDLE *HandleImpl = (IHANDLE *) Handle;
>> 
>> and I'm a strong believer in "play stupid games, win stupid prizes".
>> You can definitely make an argument for "this should definitely crash"
>> instead of just "maybe crashing" (for instance, platforms that still
>> map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
>> think that if we indeed go this route, we should set one or two upper
>> bits (on 64-bit platforms!) to make handles non-canonical addresses
>> and therefore necessarily crash on dereference.
>> 
>>> 
>>>> I should note that I find it super hard to get a concrete idea on
>>>> performance for EFI firmware without adequate tooling - I meant to
>>>> write a standalone flamegraph tool a few weeks back (even posted in
>>>> edk2-devel), but, as far as I could tell, the architectural timer
>>>> protocol couldn't get me the interrupt frame[1]. Until then, whether
>>>> any of this radix tree vs RB tree vs flat array stuff really
>>>> matters... I find it hard to say.
>>>> 
>>>> [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
>>>> monitoring interrupts, and I couldn't freely use any of them :^)
>>> 
>>> Edk2 has some form of profiling already (see
>>> "MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
>>> peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
>>> is something like a "display performance" (dp) shell application too,
>>> that can show the collected stats. But I've never used these facilities.
>>> 
>>> The wiki seems to have two related articles:
>>> 
>>> https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-
>> Infrastructure
>>> 
>>> https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg
>>> 
>>> The former looks quite comprehensive, at a quick skim.
>> 
>> /me nods
>> I've seen those macros around, but I've never used them.
>> In any case, this problem has piqued my interest, I'll see if I can
>> find some free time this weekend to hack on a test benchmark and a PoC
>> :)
>> 
>> --
>> Pedro
>> 
>> 
>> 
>> 
> 
> 
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113754): https://edk2.groups.io/g/devel/message/113754
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #2: Type: text/html, Size: 24742 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12 18:56                 ` Andrew Fish via groups.io
@ 2024-01-12 19:04                   ` Michael D Kinney
  2024-01-12 19:26                     ` Andrew Fish via groups.io
  0 siblings, 1 reply; 24+ messages in thread
From: Michael D Kinney @ 2024-01-12 19:04 UTC (permalink / raw)
  To: Andrew (EFI) Fish, edk2-devel-groups-io
  Cc: pedro.falcato@gmail.com, Laszlo Ersek, nhi@os.amperecomputing.com,
	ardb+tianocore@kernel.org, Kinney, Michael D

[-- Attachment #1: Type: text/plain, Size: 7456 bytes --]

Agreed.  Basically every API that takes an EF_HANDLE as input calls that API to make sure it is a valid handle.

The first question is if we get value from making sure the EFI_HANDLE is a member of the active set of handles.

A simple signature check in the EFI_HANDLE may be enough as long as all freed handles clear the signature.

Then, the only way that the linked list walk adds value is if there it a call with an invalid handle that happens to
have the matching signature.

The

From: Andrew (EFI) Fish <afish@apple.com>
Sent: Friday, January 12, 2024 10:57 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com>
Cc: pedro.falcato@gmail.com; Laszlo Ersek <lersek@redhat.com>; nhi@os.amperecomputing.com; ardb+tianocore@kernel.org
Subject: Re: [edk2-devel] Memory Attribute for depex section




On Jan 12, 2024, at 8:37 AM, Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Hi Pedro,

Thank you for evaluating this idea change from linked list to improve
performance of the handle database.

The concept of using integers for an EFI_HANDLE has been considered before.
One advantage over pointers is that a guarantee can be made that an EFI_HANDLE
value can be guaranteed to be unique.  In the current implementation with
EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value could
potentially be reused if an EFI_HANDLE is freed and later allocated for a
different EFI_HANDLE.

If you continue to explore this path I agree that EFI_HANDLE value of 0 should
be reserved and never used.  I also recommend that new EFI_HANDLE values are
always assigned a new unique value that freed EFI_HANDLE values are never reused.

The overall linked list performance of the handle database has also been noted
before, but has rarely raised to the level where it impacts the overall boot
performance relative to other optimization opportunities.  I look forward to
the performance data.  The PERF_X() macros are the right service to use.  On
x86 it is based on the time stamp counter (TSC) which has very small granularity.


Mike,

We tracked this a while back with the PERF macros when we had some performance issues running diagnostics on a system with 3,000+ handles. The hot path was CoreValidateHandle(). I think the number of calls to CoreValidateHandle() explodes if you have more handles so it is not just the raw performance of CoreValidateHandle(), but also how many times it gets called.

Thanks,

Andrew Fish


Mike



-----Original Message-----
From: devel@edk2.groups.io<mailto:devel@edk2.groups.io> <devel@edk2.groups.io<mailto:devel@edk2.groups.io>> On Behalf Of Pedro Falcato
Sent: Friday, January 12, 2024 6:47 AM
To: Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>>
Cc: devel@edk2.groups.io<mailto:devel@edk2.groups.io>; nhi@os.amperecomputing.com<mailto:nhi@os.amperecomputing.com>;
ardb+tianocore@kernel.org<mailto:ardb+tianocore@kernel.org>; Andrew Fish <afish@apple.com<mailto:afish@apple.com>>
Subject: Re: [edk2-devel] Memory Attribute for depex section

On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>> wrote:


On 1/12/24 03:10, Pedro Falcato wrote:

My idea was to make each handle an index - like a file descriptor -
AFAIK there's no reason why it *needs* to be an actual pointer.
We'd allocate indices when creating a handle, and return that (casted to
VOID*).


Huh, sneaky.

I see two potential problems with this.

First, per C std, these "pointers" would be invalid (they wouldn't
actually point to valid memory), so even just evaluating them (not
dereferencing them!) would invoke undefined behavior. May or may not
matter in practice. With compilers getting smarter about optimization
(and stricter about std conformance), there could be issues, perhaps.

This is true. Stashing random integers in pointers is
implementation-defined. But it's also super common. Win32 HANDLEs are
exactly this, just integers (stashed in VOID*). The Linux kernel world
also has a bunch of fun tricks with stashing flags in a pointer's
bottom bits, magic pointer values, etc. I severely doubt we can run
into issues with this. EDK2 will not exactly run on the C standard's
abstract machine anyway ;)



The other concern is a bit contrived, but I *guess* there could be code
out there that actually dereferences EFI_HANDLEs. That code would crash.
May or may not matter, because such code is arguably already
semantically invalid. (It would not necessarily be invalid at the
language level -- cf. my previous paragraph --, because passing an
otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
the underlying opaque data structure, would not violate the language.)

This is a feature, not a bug! :P

Seriously though, IHANDLE is not even exposed in semi-public headers,
so any code that's derefing an EFI_HANDLE will need to do something
like

typedef struct {
 /* ... */
} IHANDLE;

EFI_HANDLE Handle = /* ... */;

IHANDLE *HandleImpl = (IHANDLE *) Handle;

and I'm a strong believer in "play stupid games, win stupid prizes".
You can definitely make an argument for "this should definitely crash"
instead of just "maybe crashing" (for instance, platforms that still
map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
think that if we indeed go this route, we should set one or two upper
bits (on 64-bit platforms!) to make handles non-canonical addresses
and therefore necessarily crash on dereference.




I should note that I find it super hard to get a concrete idea on
performance for EFI firmware without adequate tooling - I meant to
write a standalone flamegraph tool a few weeks back (even posted in
edk2-devel), but, as far as I could tell, the architectural timer
protocol couldn't get me the interrupt frame[1]. Until then, whether
any of this radix tree vs RB tree vs flat array stuff really
matters... I find it hard to say.

[1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
monitoring interrupts, and I couldn't freely use any of them :^)

Edk2 has some form of profiling already (see
"MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
is something like a "display performance" (dp) shell application too,
that can show the collected stats. But I've never used these facilities.

The wiki seems to have two related articles:

https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-
Infrastructure


https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg

The former looks quite comprehensive, at a quick skim.

/me nods
I've seen those macros around, but I've never used them.
In any case, this problem has piqued my interest, I'll see if I can
find some free time this weekend to hack on a test benchmark and a PoC
:)

--
Pedro










-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113755): https://edk2.groups.io/g/devel/message/113755
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #2: Type: text/html, Size: 15247 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12 19:04                   ` Michael D Kinney
@ 2024-01-12 19:26                     ` Andrew Fish via groups.io
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Fish via groups.io @ 2024-01-12 19:26 UTC (permalink / raw)
  To: Mike Kinney
  Cc: edk2-devel-groups-io, pedro.falcato@gmail.com, Laszlo Ersek,
	nhi@os.amperecomputing.com, ardb+tianocore@kernel.org

[-- Attachment #1: Type: text/plain, Size: 8674 bytes --]



> On Jan 12, 2024, at 11:04 AM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:
> 
> Agreed.  Basically every API that takes an EF_HANDLE as input calls that API to make sure it is a valid handle.
>  
> The first question is if we get value from making sure the EFI_HANDLE is a member of the active set of handles.
>  
> A simple signature check in the EFI_HANDLE may be enough as long as all freed handles clear the signature.
>  
> Then, the only way that the linked list walk adds value is if there it a call with an invalid handle that happens to
> have the matching signature.
>  

MIke,

I seem to remember we ended up fixing the way you describe locally. Even if you convert the list into a tree given the index was a pointer to allocate memory it could get reused. The tree just made the lookup go faster. 

If we wanted to get tricky on 64-bit systems we could encode a monotonically increasing number in the non-canonical part of the virtual address, or above the max physical address if paging is not enabled. To use the EFI_HNDLE as a pointer you just remove count and replace it with sign extend canonical address (zero in our case). We could probably define a HOB to define the bits to use given the code constructing the page tables for DXE needs to know all the rules here. 

Thanks,

Andrew Fish

> The 
>  
> From: Andrew (EFI) Fish <afish@apple.com> 
> Sent: Friday, January 12, 2024 10:57 AM
> To: edk2-devel-groups-io <devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com>
> Cc: pedro.falcato@gmail.com; Laszlo Ersek <lersek@redhat.com>; nhi@os.amperecomputing.com; ardb+tianocore@kernel.org
> Subject: Re: [edk2-devel] Memory Attribute for depex section
>  
>  
> 
> 
> On Jan 12, 2024, at 8:37 AM, Michael D Kinney <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>  
> Hi Pedro,
> 
> Thank you for evaluating this idea change from linked list to improve
> performance of the handle database.
> 
> The concept of using integers for an EFI_HANDLE has been considered before.
> One advantage over pointers is that a guarantee can be made that an EFI_HANDLE
> value can be guaranteed to be unique.  In the current implementation with
> EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value could
> potentially be reused if an EFI_HANDLE is freed and later allocated for a 
> different EFI_HANDLE.
> 
> If you continue to explore this path I agree that EFI_HANDLE value of 0 should
> be reserved and never used.  I also recommend that new EFI_HANDLE values are
> always assigned a new unique value that freed EFI_HANDLE values are never reused.
> 
> The overall linked list performance of the handle database has also been noted
> before, but has rarely raised to the level where it impacts the overall boot
> performance relative to other optimization opportunities.  I look forward to
> the performance data.  The PERF_X() macros are the right service to use.  On
> x86 it is based on the time stamp counter (TSC) which has very small granularity.
> 
>  
> Mike,
>  
> We tracked this a while back with the PERF macros when we had some performance issues running diagnostics on a system with 3,000+ handles. The hot path was CoreValidateHandle(). I think the number of calls to CoreValidateHandle() explodes if you have more handles so it is not just the raw performance of CoreValidateHandle(), but also how many times it gets called. 
>  
> Thanks,
>  
> Andrew Fish
> 
> 
> Mike
> 
> 
> 
> -----Original Message-----
> From: devel@edk2.groups.io <mailto:devel@edk2.groups.io> <devel@edk2.groups.io <mailto:devel@edk2.groups.io>> On Behalf Of Pedro Falcato
> Sent: Friday, January 12, 2024 6:47 AM
> To: Laszlo Ersek <lersek@redhat.com <mailto:lersek@redhat.com>>
> Cc: devel@edk2.groups.io <mailto:devel@edk2.groups.io>; nhi@os.amperecomputing.com <mailto:nhi@os.amperecomputing.com>;
> ardb+tianocore@kernel.org <mailto:ardb+tianocore@kernel.org>; Andrew Fish <afish@apple.com <mailto:afish@apple.com>>
> Subject: Re: [edk2-devel] Memory Attribute for depex section
> 
> On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com <mailto:lersek@redhat.com>> wrote:
> 
> 
> On 1/12/24 03:10, Pedro Falcato wrote:
> 
> My idea was to make each handle an index - like a file descriptor -
> AFAIK there's no reason why it *needs* to be an actual pointer.
> We'd allocate indices when creating a handle, and return that (casted to
> VOID*).
> 
> 
> Huh, sneaky.
> 
> I see two potential problems with this.
> 
> First, per C std, these "pointers" would be invalid (they wouldn't
> actually point to valid memory), so even just evaluating them (not
> dereferencing them!) would invoke undefined behavior. May or may not
> matter in practice. With compilers getting smarter about optimization
> (and stricter about std conformance), there could be issues, perhaps.
> 
> This is true. Stashing random integers in pointers is
> implementation-defined. But it's also super common. Win32 HANDLEs are
> exactly this, just integers (stashed in VOID*). The Linux kernel world
> also has a bunch of fun tricks with stashing flags in a pointer's
> bottom bits, magic pointer values, etc. I severely doubt we can run
> into issues with this. EDK2 will not exactly run on the C standard's
> abstract machine anyway ;)
> 
> 
> 
> The other concern is a bit contrived, but I *guess* there could be code
> out there that actually dereferences EFI_HANDLEs. That code would crash.
> May or may not matter, because such code is arguably already
> semantically invalid. (It would not necessarily be invalid at the
> language level -- cf. my previous paragraph --, because passing an
> otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
> the underlying opaque data structure, would not violate the language.)
> 
> This is a feature, not a bug! :P
> 
> Seriously though, IHANDLE is not even exposed in semi-public headers,
> so any code that's derefing an EFI_HANDLE will need to do something
> like
> 
> typedef struct {
>  /* ... */
> } IHANDLE;
> 
> EFI_HANDLE Handle = /* ... */;
> 
> IHANDLE *HandleImpl = (IHANDLE *) Handle;
> 
> and I'm a strong believer in "play stupid games, win stupid prizes".
> You can definitely make an argument for "this should definitely crash"
> instead of just "maybe crashing" (for instance, platforms that still
> map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
> think that if we indeed go this route, we should set one or two upper
> bits (on 64-bit platforms!) to make handles non-canonical addresses
> and therefore necessarily crash on dereference.
> 
> 
> 
> 
> I should note that I find it super hard to get a concrete idea on
> performance for EFI firmware without adequate tooling - I meant to
> write a standalone flamegraph tool a few weeks back (even posted in
> edk2-devel), but, as far as I could tell, the architectural timer
> protocol couldn't get me the interrupt frame[1]. Until then, whether
> any of this radix tree vs RB tree vs flat array stuff really
> matters... I find it hard to say.
> 
> [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
> monitoring interrupts, and I couldn't freely use any of them :^)
> 
> Edk2 has some form of profiling already (see
> "MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
> peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
> is something like a "display performance" (dp) shell application too,
> that can show the collected stats. But I've never used these facilities.
> 
> The wiki seems to have two related articles:
> 
> https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-
> Infrastructure
> 
> 
> https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg
> 
> The former looks quite comprehensive, at a quick skim.
> 
> /me nods
> I've seen those macros around, but I've never used them.
> In any case, this problem has piqued my interest, I'll see if I can
> find some free time this weekend to hack on a test benchmark and a PoC
> :)
> 
> --
> Pedro
> 
> 
> 
> 
> 
> 
> 
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113775): https://edk2.groups.io/g/devel/message/113775
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #2: Type: text/html, Size: 18617 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-12  9:45       ` Laszlo Ersek
@ 2024-01-15 13:07         ` Nhi Pham via groups.io
  2024-01-15 14:04           ` Ard Biesheuvel
  0 siblings, 1 reply; 24+ messages in thread
From: Nhi Pham via groups.io @ 2024-01-15 13:07 UTC (permalink / raw)
  To: Laszlo Ersek, Andrew (EFI) Fish, edk2-devel-groups-io
  Cc: ardb+tianocore@kernel.org

On 1/12/2024 4:45 PM, Laszlo Ersek wrote:
> (Independently: I think that's a valid thing to do for *SMM* drivers,
> because the entry point functions of those drivers are permitted to use
> both SMM and DXE/UEFI protocols. But whether the same is valid for the
> *standalone* MM drivers -- that looks questionable. Standalone MM
> drivers should not depend on UEFI/DXE protocols ever, IIUC.)
> 
>> 3) The issue is patching the grammar in place, why can’t we just make a
>> copy for the dispatcher grammer, and operate on the copy. Maybe via a
>> copy on 1st update strategy? 
> 
> Yes, copying the depex to the heap, and patching it there, was Nhi's #1
> fix proposal. I think that could be made work. But I'm not sure if the
> perf savings are worth the additional complexity. The heap allocation
> (where the writeable depex would exist) would have to be permanently
> associated with the loaded PE image -- because the dispatcher might need
> to reevaluate the depex across multiple rounds of dispatching. So that's
> a new field in some image-related structure, it also needs to be freed
> upon unload (?), what if the memory allocation fails during depex eval
> (just consider the depex to eval to FALSE?), etc. Doable, but hairy; not
> sure if the perf is worth that effort.
> 

Thanks so much, Laszlo for your valuable insights.

The approach #1 works for me. I will do further check for your concerns
above.

I'm trying your suggested patch and investigating the performance being
discussed here.

Regards,
Nhi


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113821): https://edk2.groups.io/g/devel/message/113821
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-15 13:07         ` Nhi Pham via groups.io
@ 2024-01-15 14:04           ` Ard Biesheuvel
  2024-01-15 19:00             ` Laszlo Ersek
  0 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2024-01-15 14:04 UTC (permalink / raw)
  To: Nhi Pham
  Cc: Laszlo Ersek, Andrew (EFI) Fish, edk2-devel-groups-io,
	ardb+tianocore@kernel.org

On Mon, 15 Jan 2024 at 14:07, Nhi Pham <nhi@os.amperecomputing.com> wrote:
>
> On 1/12/2024 4:45 PM, Laszlo Ersek wrote:
> > (Independently: I think that's a valid thing to do for *SMM* drivers,
> > because the entry point functions of those drivers are permitted to use
> > both SMM and DXE/UEFI protocols. But whether the same is valid for the
> > *standalone* MM drivers -- that looks questionable. Standalone MM
> > drivers should not depend on UEFI/DXE protocols ever, IIUC.)
> >
> >> 3) The issue is patching the grammar in place, why can’t we just make a
> >> copy for the dispatcher grammer, and operate on the copy. Maybe via a
> >> copy on 1st update strategy?
> >
> > Yes, copying the depex to the heap, and patching it there, was Nhi's #1
> > fix proposal. I think that could be made work. But I'm not sure if the
> > perf savings are worth the additional complexity. The heap allocation
> > (where the writeable depex would exist) would have to be permanently
> > associated with the loaded PE image -- because the dispatcher might need
> > to reevaluate the depex across multiple rounds of dispatching. So that's
> > a new field in some image-related structure, it also needs to be freed
> > upon unload (?), what if the memory allocation fails during depex eval
> > (just consider the depex to eval to FALSE?), etc. Doable, but hairy; not
> > sure if the perf is worth that effort.
> >
>
> Thanks so much, Laszlo for your valuable insights.
>
> The approach #1 works for me. I will do further check for your concerns
> above.
>
> I'm trying your suggested patch and investigating the performance being
> discussed here.
>

Not sure what approach #1 means, but I'd prefer to just remove this
optimization from standalone MM, given that not only a) it shouldn't
have to deal with a large number of protocol GUIDs, but also b) the
driver dispatch is much more straight-forward. (Typically, StMM
drivers can be dispatched in the order they appear in the firmware
volume, in which case each DEPEX is evaluated only once anyway)


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113822): https://edk2.groups.io/g/devel/message/113822
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-15 14:04           ` Ard Biesheuvel
@ 2024-01-15 19:00             ` Laszlo Ersek
  2024-01-18  6:00               ` Nhi Pham via groups.io
  0 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-15 19:00 UTC (permalink / raw)
  To: Ard Biesheuvel, Nhi Pham
  Cc: Andrew (EFI) Fish, edk2-devel-groups-io,
	ardb+tianocore@kernel.org

On 1/15/24 15:04, Ard Biesheuvel wrote:
> On Mon, 15 Jan 2024 at 14:07, Nhi Pham <nhi@os.amperecomputing.com> wrote:
>>
>> On 1/12/2024 4:45 PM, Laszlo Ersek wrote:
>>> (Independently: I think that's a valid thing to do for *SMM* drivers,
>>> because the entry point functions of those drivers are permitted to use
>>> both SMM and DXE/UEFI protocols. But whether the same is valid for the
>>> *standalone* MM drivers -- that looks questionable. Standalone MM
>>> drivers should not depend on UEFI/DXE protocols ever, IIUC.)
>>>
>>>> 3) The issue is patching the grammar in place, why can’t we just make a
>>>> copy for the dispatcher grammer, and operate on the copy. Maybe via a
>>>> copy on 1st update strategy?
>>>
>>> Yes, copying the depex to the heap, and patching it there, was Nhi's #1
>>> fix proposal. I think that could be made work. But I'm not sure if the
>>> perf savings are worth the additional complexity. The heap allocation
>>> (where the writeable depex would exist) would have to be permanently
>>> associated with the loaded PE image -- because the dispatcher might need
>>> to reevaluate the depex across multiple rounds of dispatching. So that's
>>> a new field in some image-related structure, it also needs to be freed
>>> upon unload (?), what if the memory allocation fails during depex eval
>>> (just consider the depex to eval to FALSE?), etc. Doable, but hairy; not
>>> sure if the perf is worth that effort.
>>>
>>
>> Thanks so much, Laszlo for your valuable insights.
>>
>> The approach #1 works for me. I will do further check for your concerns
>> above.
>>
>> I'm trying your suggested patch and investigating the performance being
>> discussed here.
>>
> 
> Not sure what approach #1 means,

(copying the depex to the heap, and maintaining it there, so that it can
be patched)

> but I'd prefer to just remove this
> optimization from standalone MM, given that not only a) it shouldn't
> have to deal with a large number of protocol GUIDs, but also b) the
> driver dispatch is much more straight-forward. (Typically, StMM
> drivers can be dispatched in the order they appear in the firmware
> volume, in which case each DEPEX is evaluated only once anyway)

Sounds like a promising basis for removing the optimization indeed!

Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113844): https://edk2.groups.io/g/devel/message/113844
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-15 19:00             ` Laszlo Ersek
@ 2024-01-18  6:00               ` Nhi Pham via groups.io
  2024-01-18 14:49                 ` Laszlo Ersek
  0 siblings, 1 reply; 24+ messages in thread
From: Nhi Pham via groups.io @ 2024-01-18  6:00 UTC (permalink / raw)
  To: Laszlo Ersek, Ard Biesheuvel
  Cc: Andrew (EFI) Fish, edk2-devel-groups-io,
	ardb+tianocore@kernel.org

Hi Laszlo,

On 1/16/2024 2:00 AM, Laszlo Ersek wrote:
> On 1/15/24 15:04, Ard Biesheuvel wrote:
>> On Mon, 15 Jan 2024 at 14:07, Nhi Pham <nhi@os.amperecomputing.com> wrote:
>>>
>>> On 1/12/2024 4:45 PM, Laszlo Ersek wrote:
>>>> (Independently: I think that's a valid thing to do for *SMM* drivers,
>>>> because the entry point functions of those drivers are permitted to use
>>>> both SMM and DXE/UEFI protocols. But whether the same is valid for the
>>>> *standalone* MM drivers -- that looks questionable. Standalone MM
>>>> drivers should not depend on UEFI/DXE protocols ever, IIUC.)
>>>>
>>>>> 3) The issue is patching the grammar in place, why can’t we just make a
>>>>> copy for the dispatcher grammer, and operate on the copy. Maybe via a
>>>>> copy on 1st update strategy?
>>>>
>>>> Yes, copying the depex to the heap, and patching it there, was Nhi's #1
>>>> fix proposal. I think that could be made work. But I'm not sure if the
>>>> perf savings are worth the additional complexity. The heap allocation
>>>> (where the writeable depex would exist) would have to be permanently
>>>> associated with the loaded PE image -- because the dispatcher might need
>>>> to reevaluate the depex across multiple rounds of dispatching. So that's
>>>> a new field in some image-related structure, it also needs to be freed
>>>> upon unload (?), what if the memory allocation fails during depex eval
>>>> (just consider the depex to eval to FALSE?), etc. Doable, but hairy; not
>>>> sure if the perf is worth that effort.
>>>>
>>>
>>> Thanks so much, Laszlo for your valuable insights.
>>>
>>> The approach #1 works for me. I will do further check for your concerns
>>> above.
>>>
>>> I'm trying your suggested patch and investigating the performance being
>>> discussed here.
>>>
>>
>> Not sure what approach #1 means,
> 
> (copying the depex to the heap, and maintaining it there, so that it can
> be patched)

Thanks!

> 
>> but I'd prefer to just remove this
>> optimization from standalone MM, given that not only a) it shouldn't
>> have to deal with a large number of protocol GUIDs, but also b) the
>> driver dispatch is much more straight-forward. (Typically, StMM
>> drivers can be dispatched in the order they appear in the firmware
>> volume, in which case each DEPEX is evaluated only once anyway)
> 
> Sounds like a promising basis for removing the optimization indeed!
> 

Your patch suggested earlier works for me. And I don't see significant 
performance reduction compared with keeping optimization.

I don't have strong reason on removing the optimization, but I think it 
would be simply good for now. Could you post your patch to edk2-devel 
for review and merge?

Thanks,
Nhi


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113986): https://edk2.groups.io/g/devel/message/113986
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-18  6:00               ` Nhi Pham via groups.io
@ 2024-01-18 14:49                 ` Laszlo Ersek
  2024-01-19  4:43                   ` Nhi Pham via groups.io
  0 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-18 14:49 UTC (permalink / raw)
  To: Nhi Pham, Ard Biesheuvel
  Cc: Andrew (EFI) Fish, edk2-devel-groups-io,
	ardb+tianocore@kernel.org

On 1/18/24 07:00, Nhi Pham wrote:
> Hi Laszlo,
> 
> On 1/16/2024 2:00 AM, Laszlo Ersek wrote:
>> On 1/15/24 15:04, Ard Biesheuvel wrote:
>>> On Mon, 15 Jan 2024 at 14:07, Nhi Pham <nhi@os.amperecomputing.com>
>>> wrote:
>>>>
>>>> On 1/12/2024 4:45 PM, Laszlo Ersek wrote:
>>>>> (Independently: I think that's a valid thing to do for *SMM* drivers,
>>>>> because the entry point functions of those drivers are permitted to
>>>>> use
>>>>> both SMM and DXE/UEFI protocols. But whether the same is valid for the
>>>>> *standalone* MM drivers -- that looks questionable. Standalone MM
>>>>> drivers should not depend on UEFI/DXE protocols ever, IIUC.)
>>>>>
>>>>>> 3) The issue is patching the grammar in place, why can’t we just
>>>>>> make a
>>>>>> copy for the dispatcher grammer, and operate on the copy. Maybe via a
>>>>>> copy on 1st update strategy?
>>>>>
>>>>> Yes, copying the depex to the heap, and patching it there, was
>>>>> Nhi's #1
>>>>> fix proposal. I think that could be made work. But I'm not sure if the
>>>>> perf savings are worth the additional complexity. The heap allocation
>>>>> (where the writeable depex would exist) would have to be permanently
>>>>> associated with the loaded PE image -- because the dispatcher might
>>>>> need
>>>>> to reevaluate the depex across multiple rounds of dispatching. So
>>>>> that's
>>>>> a new field in some image-related structure, it also needs to be freed
>>>>> upon unload (?), what if the memory allocation fails during depex eval
>>>>> (just consider the depex to eval to FALSE?), etc. Doable, but
>>>>> hairy; not
>>>>> sure if the perf is worth that effort.
>>>>>
>>>>
>>>> Thanks so much, Laszlo for your valuable insights.
>>>>
>>>> The approach #1 works for me. I will do further check for your concerns
>>>> above.
>>>>
>>>> I'm trying your suggested patch and investigating the performance being
>>>> discussed here.
>>>>
>>>
>>> Not sure what approach #1 means,
>>
>> (copying the depex to the heap, and maintaining it there, so that it can
>> be patched)
> 
> Thanks!
> 
>>
>>> but I'd prefer to just remove this
>>> optimization from standalone MM, given that not only a) it shouldn't
>>> have to deal with a large number of protocol GUIDs, but also b) the
>>> driver dispatch is much more straight-forward. (Typically, StMM
>>> drivers can be dispatched in the order they appear in the firmware
>>> volume, in which case each DEPEX is evaluated only once anyway)
>>
>> Sounds like a promising basis for removing the optimization indeed!
>>
> 
> Your patch suggested earlier works for me. And I don't see significant
> performance reduction compared with keeping optimization.
> 
> I don't have strong reason on removing the optimization, but I think it
> would be simply good for now. Could you post your patch to edk2-devel
> for review and merge?

That wouldn't be correct; I don't have any platform for testing StMM. I
proposed the patch purely based on code analysis. I prefer not to post
untested patches, if I can avoid it.

You can however post my patch; simply add your S-o-b at the bottom. You
can also preserve my authorship on the patch with --author=... on
git-commit; but even that is unnecessary for such a simple patch (you
don't even have to pick the patch up from the email, it's trivial to
reimplement from scratch, just reading the email).

Thanks
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#113994): https://edk2.groups.io/g/devel/message/113994
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-18 14:49                 ` Laszlo Ersek
@ 2024-01-19  4:43                   ` Nhi Pham via groups.io
  2024-01-19 13:54                     ` Laszlo Ersek
  0 siblings, 1 reply; 24+ messages in thread
From: Nhi Pham via groups.io @ 2024-01-19  4:43 UTC (permalink / raw)
  To: Laszlo Ersek, Ard Biesheuvel
  Cc: Andrew (EFI) Fish, edk2-devel-groups-io,
	ardb+tianocore@kernel.org

On 1/18/2024 9:49 PM, Laszlo Ersek wrote:
>>>> but I'd prefer to just remove this
>>>> optimization from standalone MM, given that not only a) it shouldn't
>>>> have to deal with a large number of protocol GUIDs, but also b) the
>>>> driver dispatch is much more straight-forward. (Typically, StMM
>>>> drivers can be dispatched in the order they appear in the firmware
>>>> volume, in which case each DEPEX is evaluated only once anyway)
>>>
>>> Sounds like a promising basis for removing the optimization indeed!
>>>
>>
>> Your patch suggested earlier works for me. And I don't see significant
>> performance reduction compared with keeping optimization.
>>
>> I don't have strong reason on removing the optimization, but I think it
>> would be simply good for now. Could you post your patch to edk2-devel
>> for review and merge?
> 
> That wouldn't be correct; I don't have any platform for testing StMM. I
> proposed the patch purely based on code analysis. I prefer not to post
> untested patches, if I can avoid it.

I got it, thanks!

I thought I could give Tested-by tag when you post the patch since I 
already verified the patch on a StMM platform

> 
> You can however post my patch; simply add your S-o-b at the bottom. You
> can also preserve my authorship on the patch with --author=... on
> git-commit; but even that is unnecessary for such a simple patch (you
> don't even have to pick the patch up from the email, it's trivial to
> reimplement from scratch, just reading the email).

I'm going to send the patch to edk2-devel and keep your authorship on 
the patch because there is no change compared with your suggestion in 
the email.

Thanks,
Nhi


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#114024): https://edk2.groups.io/g/devel/message/114024
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [edk2-devel] Memory Attribute for depex section
  2024-01-19  4:43                   ` Nhi Pham via groups.io
@ 2024-01-19 13:54                     ` Laszlo Ersek
  0 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2024-01-19 13:54 UTC (permalink / raw)
  To: Nhi Pham, Ard Biesheuvel
  Cc: Andrew (EFI) Fish, edk2-devel-groups-io,
	ardb+tianocore@kernel.org

On 1/19/24 05:43, Nhi Pham wrote:
> On 1/18/2024 9:49 PM, Laszlo Ersek wrote:
>>>>> but I'd prefer to just remove this
>>>>> optimization from standalone MM, given that not only a) it shouldn't
>>>>> have to deal with a large number of protocol GUIDs, but also b) the
>>>>> driver dispatch is much more straight-forward. (Typically, StMM
>>>>> drivers can be dispatched in the order they appear in the firmware
>>>>> volume, in which case each DEPEX is evaluated only once anyway)
>>>>
>>>> Sounds like a promising basis for removing the optimization indeed!
>>>>
>>>
>>> Your patch suggested earlier works for me. And I don't see significant
>>> performance reduction compared with keeping optimization.
>>>
>>> I don't have strong reason on removing the optimization, but I think it
>>> would be simply good for now. Could you post your patch to edk2-devel
>>> for review and merge?
>>
>> That wouldn't be correct; I don't have any platform for testing StMM. I
>> proposed the patch purely based on code analysis. I prefer not to post
>> untested patches, if I can avoid it.
> 
> I got it, thanks!
> 
> I thought I could give Tested-by tag when you post the patch since I
> already verified the patch on a StMM platform

Oh, that seems to have eluded me, sorry -- it's not uncommon that a
patch submitter needs help with testing (external testing) on *some*
platforms, beside his or her main platform(s) of interest. What's quite
uncommon though is total untestability for the submitter; I admit I drew
a blank there. What you mention would have been viable.

Thank you for posting the patch, ultimately!
Laszlo

> 
>>
>> You can however post my patch; simply add your S-o-b at the bottom. You
>> can also preserve my authorship on the patch with --author=... on
>> git-commit; but even that is unnecessary for such a simple patch (you
>> don't even have to pick the patch up from the email, it's trivial to
>> reimplement from scratch, just reading the email).
> 
> I'm going to send the patch to edk2-devel and keep your authorship on
> the patch because there is no change compared with your suggestion in
> the email.
> 
> Thanks,
> Nhi
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#114048): https://edk2.groups.io/g/devel/message/114048
Mute This Topic: https://groups.io/mt/103594587/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2024-01-19 13:55 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-08 10:11 [edk2-devel] Memory Attribute for depex section Nhi Pham via groups.io
2024-01-10  0:48 ` Nhi Pham via groups.io
2024-01-10 13:09 ` Laszlo Ersek
2024-01-10 13:45   ` Laszlo Ersek
2024-01-10 21:50     ` Pedro Falcato
2024-01-11  8:46       ` Laszlo Ersek
2024-01-11 11:03         ` Laszlo Ersek
2024-01-12  2:10         ` Pedro Falcato
2024-01-12  9:35           ` Laszlo Ersek
2024-01-12 14:46             ` Pedro Falcato
2024-01-12 16:37               ` Michael D Kinney
2024-01-12 18:56                 ` Andrew Fish via groups.io
2024-01-12 19:04                   ` Michael D Kinney
2024-01-12 19:26                     ` Andrew Fish via groups.io
2024-01-12 18:50               ` Andrew Fish via groups.io
2024-01-12  2:44     ` Andrew Fish via groups.io
2024-01-12  9:45       ` Laszlo Ersek
2024-01-15 13:07         ` Nhi Pham via groups.io
2024-01-15 14:04           ` Ard Biesheuvel
2024-01-15 19:00             ` Laszlo Ersek
2024-01-18  6:00               ` Nhi Pham via groups.io
2024-01-18 14:49                 ` Laszlo Ersek
2024-01-19  4:43                   ` Nhi Pham via groups.io
2024-01-19 13:54                     ` Laszlo Ersek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox