On Jan 12, 2024, at 8:37 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
Hi Pedro,
Thank you for evaluating this idea change from linked list to improve
performance of the handle database.
The concept of using integers for an EFI_HANDLE has been considered before.
One advantage over pointers is that a guarantee can be made that an EFI_HANDLE
value can be guaranteed to be unique. In the current implementation with
EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value could
potentially be reused if an EFI_HANDLE is freed and later allocated for a
different EFI_HANDLE.
If you continue to explore this path I agree that EFI_HANDLE value of 0 should
be reserved and never used. I also recommend that new EFI_HANDLE values are
always assigned a new unique value that freed EFI_HANDLE values are never reused.
The overall linked list performance of the handle database has also been noted
before, but has rarely raised to the level where it impacts the overall boot
performance relative to other optimization opportunities. I look forward to
the performance data. The PERF_X() macros are the right service to use. On
x86 it is based on the time stamp counter (TSC) which has very small granularity.
Mike,
We tracked this a while back with the PERF macros when we had some performance issues running diagnostics on a system with 3,000+ handles. The hot path was CoreValidateHandle(). I think the number of calls to CoreValidateHandle() explodes if you have more handles so it is not just the raw performance of CoreValidateHandle(), but also how many times it gets called.
Thanks,
Andrew Fish
Mike-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Pedro Falcato
Sent: Friday, January 12, 2024 6:47 AM
To: Laszlo Ersek <lersek@redhat.com>
Cc: devel@edk2.groups.io; nhi@os.amperecomputing.com;
ardb+tianocore@kernel.org; Andrew Fish <afish@apple.com>
Subject: Re: [edk2-devel] Memory Attribute for depex section
On Fri, Jan 12, 2024 at 9:35 AM Laszlo Ersek <lersek@redhat.com> wrote:
On 1/12/24 03:10, Pedro Falcato wrote:
My idea was to make each handle an index - like a file descriptor -
AFAIK there's no reason why it *needs* to be an actual pointer.
We'd allocate indices when creating a handle, and return that (casted to
VOID*).
Huh, sneaky.
I see two potential problems with this.
First, per C std, these "pointers" would be invalid (they wouldn't
actually point to valid memory), so even just evaluating them (not
dereferencing them!) would invoke undefined behavior. May or may not
matter in practice. With compilers getting smarter about optimization
(and stricter about std conformance), there could be issues, perhaps.
This is true. Stashing random integers in pointers is
implementation-defined. But it's also super common. Win32 HANDLEs are
exactly this, just integers (stashed in VOID*). The Linux kernel world
also has a bunch of fun tricks with stashing flags in a pointer's
bottom bits, magic pointer values, etc. I severely doubt we can run
into issues with this. EDK2 will not exactly run on the C standard's
abstract machine anyway ;)
The other concern is a bit contrived, but I *guess* there could be code
out there that actually dereferences EFI_HANDLEs. That code would crash.
May or may not matter, because such code is arguably already
semantically invalid. (It would not necessarily be invalid at the
language level -- cf. my previous paragraph --, because passing an
otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of
the underlying opaque data structure, would not violate the language.)
This is a feature, not a bug! :P
Seriously though, IHANDLE is not even exposed in semi-public headers,
so any code that's derefing an EFI_HANDLE will need to do something
like
typedef struct {
/* ... */
} IHANDLE;
EFI_HANDLE Handle = /* ... */;
IHANDLE *HandleImpl = (IHANDLE *) Handle;
and I'm a strong believer in "play stupid games, win stupid prizes".
You can definitely make an argument for "this should definitely crash"
instead of just "maybe crashing" (for instance, platforms that still
map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to
think that if we indeed go this route, we should set one or two upper
bits (on 64-bit platforms!) to make handles non-canonical addresses
and therefore necessarily crash on dereference.
I should note that I find it super hard to get a concrete idea on
performance for EFI firmware without adequate tooling - I meant to
write a standalone flamegraph tool a few weeks back (even posted in
edk2-devel), but, as far as I could tell, the architectural timer
protocol couldn't get me the interrupt frame[1]. Until then, whether
any of this radix tree vs RB tree vs flat array stuff really
matters... I find it hard to say.
[1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
monitoring interrupts, and I couldn't freely use any of them :^)
Edk2 has some form of profiling already (see
"MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code
peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there
is something like a "display performance" (dp) shell application too,
that can show the collected stats. But I've never used these facilities.
The wiki seems to have two related articles:
https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance-
Infrastructure
https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg
The former looks quite comprehensive, at a quick skim.
/me nods
I've seen those macros around, but I've never used them.
In any case, this problem has piqued my interest, I'll see if I can
find some free time this weekend to hack on a test benchmark and a PoC
:)
--
Pedro