From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail02.groups.io (mail02.groups.io [66.175.222.108]) by spool.mail.gandi.net (Postfix) with ESMTPS id 87E7A7803CE for ; Fri, 12 Jan 2024 18:56:57 +0000 (UTC) DKIM-Signature: a=rsa-sha256; bh=f1zSO7orOT6fnCcXB04S+Dxpc1EdaimFc/yo+BwAL1g=; c=relaxed/simple; d=groups.io; h=From:Message-id:MIME-version:Subject:Date:In-reply-to:Cc:To:References:Precedence:List-Subscribe:List-Help:Sender:List-Id:Mailing-List:Delivered-To:Reply-To:List-Unsubscribe-Post:List-Unsubscribe:Content-type; s=20140610; t=1705085816; v=1; b=flm18LO7wq/gBxCX9iAQVNNFVSLJZ1HyxFAOw6t+Ap/Z43cdCT1bYRNdzvis34MOj43fwlFt 5Ifpdz65f+9DRdDj+TL1hItnHvQul2/2HGTFB0eKhm/r+kQh0UEtaWLPAts3wGu3Kuz6fjbMY5I Hq75z1+DpduUQz8rJKvDsFVY= X-Received: by 127.0.0.2 with SMTP id 1BacYY7687511xqx7n4eWIuQ; Fri, 12 Jan 2024 10:56:56 -0800 X-Received: from rn-mailsvcp-mx-lapp01.apple.com (rn-mailsvcp-mx-lapp01.apple.com [17.179.253.22]) by mx.groups.io with SMTP id smtpd.web11.407.1705085815659452835 for ; Fri, 12 Jan 2024 10:56:55 -0800 X-Received: from rn-mailsvcp-mta-lapp02.rno.apple.com (rn-mailsvcp-mta-lapp02.rno.apple.com [10.225.203.150]) by rn-mailsvcp-mx-lapp01.rno.apple.com (Oracle Communications Messaging Server 8.1.0.23.20230328 64bit (built Mar 28 2023)) with ESMTPS id <0S7500DAEWMVV930@rn-mailsvcp-mx-lapp01.rno.apple.com> for devel@edk2.groups.io; Fri, 12 Jan 2024 10:56:55 -0800 (PST) X-Proofpoint-GUID: thTkWo0BFuLjojR-p4ElsBAOOxRU8fo0 X-Proofpoint-ORIG-GUID: thTkWo0BFuLjojR-p4ElsBAOOxRU8fo0 X-Received: from rn-mailsvcp-policy-lapp01.rno.apple.com (rn-mailsvcp-policy-lapp01.rno.apple.com [17.179.253.18]) by rn-mailsvcp-mta-lapp02.rno.apple.com (Oracle Communications Messaging Server 8.1.0.23.20230328 64bit (built Mar 28 2023)) with ESMTPS id <0S75010WFWMQF9Q0@rn-mailsvcp-mta-lapp02.rno.apple.com>; Fri, 12 Jan 2024 10:56:50 -0800 (PST) X-Received: from process_milters-daemon.rn-mailsvcp-policy-lapp01.rno.apple.com by rn-mailsvcp-policy-lapp01.rno.apple.com (Oracle Communications Messaging Server 8.1.0.22.20230228 64bit (built Feb 28 2023)) id <0S7500M00WE83X00@rn-mailsvcp-policy-lapp01.rno.apple.com>; Fri, 12 Jan 2024 10:56:50 -0800 (PST) X-Va-A: X-Va-T-CD: 70a38c3f5b1d46c4b8dccb3b011be358 X-Va-E-CD: d38c7c8a8a09a1683f7c6688ae4723e1 X-Va-R-CD: 610f65e57ba77b9f82334b7102a66226 X-Va-ID: 1996d383-9444-43de-aec0-a86910986064 X-Va-CD: 0 X-V-A: X-V-T-CD: 70a38c3f5b1d46c4b8dccb3b011be358 X-V-E-CD: d38c7c8a8a09a1683f7c6688ae4723e1 X-V-R-CD: 610f65e57ba77b9f82334b7102a66226 X-V-ID: d7a2fe59-9a25-435e-a79a-1ae27e50f5f5 X-V-CD: 0 X-Received: from smtpclient.apple (unknown [17.234.110.227]) by rn-mailsvcp-policy-lapp01.rno.apple.com (Oracle Communications Messaging Server 8.1.0.22.20230228 64bit (built Feb 28 2023)) with ESMTPSA id <0S75003FJWMPUE00@rn-mailsvcp-policy-lapp01.rno.apple.com>; Fri, 12 Jan 2024 10:56:50 -0800 (PST) From: "Andrew Fish via groups.io" Message-id: <35BA9784-D5EC-4287-BEDA-34FC3286F977@apple.com> MIME-version: 1.0 (Mac OS X Mail 16.0 \(3774.300.61.1.2\)) Subject: Re: [edk2-devel] Memory Attribute for depex section Date: Fri, 12 Jan 2024 10:56:39 -0800 In-reply-to: Cc: "pedro.falcato@gmail.com" , Laszlo Ersek , "nhi@os.amperecomputing.com" , "ardb+tianocore@kernel.org" To: edk2-devel-groups-io , Mike Kinney References: <44ca139f-4d78-4322-b5b6-8e9788bb7486@os.amperecomputing.com> <2ad16043-754e-3bb9-3a4a-702d9a50bf63@redhat.com> <45b95719-f1fc-dbc6-a4cc-a022d691844c@redhat.com> <30901646-905c-798c-d088-255498028fff@redhat.com> <7659165f-a73b-b628-59fe-c29e67beb850@redhat.com> Precedence: Bulk List-Subscribe: List-Help: Sender: devel@edk2.groups.io List-Id: Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io Reply-To: devel@edk2.groups.io,afish@apple.com List-Unsubscribe-Post: List-Unsubscribe=One-Click List-Unsubscribe: X-Gm-Message-State: 1WocsLm98xDvJBW7hBbswP6ax7686176AA= Content-type: multipart/alternative; boundary="Apple-Mail=_15D30F19-45A1-477D-A389-D6B9C243A58D" X-GND-Status: LEGIT Authentication-Results: spool.mail.gandi.net; dkim=pass header.d=groups.io header.s=20140610 header.b=flm18LO7; dmarc=none; spf=pass (spool.mail.gandi.net: domain of bounce@groups.io designates 66.175.222.108 as permitted sender) smtp.mailfrom=bounce@groups.io --Apple-Mail=_15D30F19-45A1-477D-A389-D6B9C243A58D Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Jan 12, 2024, at 8:37=E2=80=AFAM, Michael D Kinney wrote: >=20 > Hi Pedro, >=20 > Thank you for evaluating this idea change from linked list to improve > performance of the handle database. >=20 > The concept of using integers for an EFI_HANDLE has been considered befor= e. > One advantage over pointers is that a guarantee can be made that an EFI_H= ANDLE > value can be guaranteed to be unique. In the current implementation with > EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value co= uld > potentially be reused if an EFI_HANDLE is freed and later allocated for a= =20 > different EFI_HANDLE. >=20 > If you continue to explore this path I agree that EFI_HANDLE value of 0 s= hould > be reserved and never used. I also recommend that new EFI_HANDLE values = are > always assigned a new unique value that freed EFI_HANDLE values are never= reused. >=20 > The overall linked list performance of the handle database has also been = noted > before, but has rarely raised to the level where it impacts the overall b= oot > performance relative to other optimization opportunities. I look forward= to > the performance data. The PERF_X() macros are the right service to use. = On > x86 it is based on the time stamp counter (TSC) which has very small gran= ularity. >=20 Mike, We tracked this a while back with the PERF macros when we had some performa= nce issues running diagnostics on a system with 3,000+ handles. The hot pat= h was CoreValidateHandle(). I think the number of calls to CoreValidateHand= le() explodes if you have more handles so it is not just the raw performanc= e of CoreValidateHandle(), but also how many times it gets called.=20 Thanks, Andrew Fish > Mike >=20 >=20 >> -----Original Message----- >> From: devel@edk2.groups.io > On Behalf Of Pedro Falcato >> Sent: Friday, January 12, 2024 6:47 AM >> To: Laszlo Ersek > >> Cc: devel@edk2.groups.io ; nhi@os.amperecom= puting.com ; >> ardb+tianocore@kernel.org ; Andrew Fis= h > >> Subject: Re: [edk2-devel] Memory Attribute for depex section >>=20 >> On Fri, Jan 12, 2024 at 9:35=E2=80=AFAM Laszlo Ersek = wrote: >>>=20 >>> On 1/12/24 03:10, Pedro Falcato wrote: >>>> My idea was to make each handle an index - like a file descriptor - >>>> AFAIK there's no reason why it *needs* to be an actual pointer. >>>> We'd allocate indices when creating a handle, and return that (casted = to >> VOID*). >>>=20 >>> Huh, sneaky. >>>=20 >>> I see two potential problems with this. >>>=20 >>> First, per C std, these "pointers" would be invalid (they wouldn't >>> actually point to valid memory), so even just evaluating them (not >>> dereferencing them!) would invoke undefined behavior. May or may not >>> matter in practice. With compilers getting smarter about optimization >>> (and stricter about std conformance), there could be issues, perhaps. >>=20 >> This is true. Stashing random integers in pointers is >> implementation-defined. But it's also super common. Win32 HANDLEs are >> exactly this, just integers (stashed in VOID*). The Linux kernel world >> also has a bunch of fun tricks with stashing flags in a pointer's >> bottom bits, magic pointer values, etc. I severely doubt we can run >> into issues with this. EDK2 will not exactly run on the C standard's >> abstract machine anyway ;) >>=20 >>>=20 >>> The other concern is a bit contrived, but I *guess* there could be code >>> out there that actually dereferences EFI_HANDLEs. That code would crash= . >>> May or may not matter, because such code is arguably already >>> semantically invalid. (It would not necessarily be invalid at the >>> language level -- cf. my previous paragraph --, because passing an >>> otherwise valid EFI_HANDLE to CopyMem, for copying just 1 byte out of >>> the underlying opaque data structure, would not violate the language.) >>=20 >> This is a feature, not a bug! :P >>=20 >> Seriously though, IHANDLE is not even exposed in semi-public headers, >> so any code that's derefing an EFI_HANDLE will need to do something >> like >>=20 >> typedef struct { >> /* ... */ >> } IHANDLE; >>=20 >> EFI_HANDLE Handle =3D /* ... */; >>=20 >> IHANDLE *HandleImpl =3D (IHANDLE *) Handle; >>=20 >> and I'm a strong believer in "play stupid games, win stupid prizes". >> You can definitely make an argument for "this should definitely crash" >> instead of just "maybe crashing" (for instance, platforms that still >> map the NULL page (like OVMF!), or handles > 4096), so I'm inclined to >> think that if we indeed go this route, we should set one or two upper >> bits (on 64-bit platforms!) to make handles non-canonical addresses >> and therefore necessarily crash on dereference. >>=20 >>>=20 >>>> I should note that I find it super hard to get a concrete idea on >>>> performance for EFI firmware without adequate tooling - I meant to >>>> write a standalone flamegraph tool a few weeks back (even posted in >>>> edk2-devel), but, as far as I could tell, the architectural timer >>>> protocol couldn't get me the interrupt frame[1]. Until then, whether >>>> any of this radix tree vs RB tree vs flat array stuff really >>>> matters... I find it hard to say. >>>>=20 >>>> [1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance >>>> monitoring interrupts, and I couldn't freely use any of them :^) >>>=20 >>> Edk2 has some form of profiling already (see >>> "MdePkg/Include/Library/PerformanceLib.h"). Usually one sees core code >>> peppered with PERF_CODE_BEGIN and PERF_CODE_END macros. I *think* there >>> is something like a "display performance" (dp) shell application too, >>> that can show the collected stats. But I've never used these facilities= . >>>=20 >>> The wiki seems to have two related articles: >>>=20 >>> https://github.com/tianocore/tianocore.github.io/wiki/Edk2-Performance- >> Infrastructure >>>=20 >>> https://github.com/tianocore/tianocore.github.io/wiki/PerformancePkg >>>=20 >>> The former looks quite comprehensive, at a quick skim. >>=20 >> /me nods >> I've seen those macros around, but I've never used them. >> In any case, this problem has piqued my interest, I'll see if I can >> find some free time this weekend to hack on a test benchmark and a PoC >> :) >>=20 >> -- >> Pedro >>=20 >>=20 >>=20 >>=20 >=20 >=20 >=20 >=20 -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#113754): https://edk2.groups.io/g/devel/message/113754 Mute This Topic: https://groups.io/mt/103594587/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/19134562= 12/xyzzy [rebecca@openfw.io] -=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D-=3D- --Apple-Mail=_15D30F19-45A1-477D-A389-D6B9C243A58D Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=utf-8

On Jan 12, 2024, at 8:37=E2= =80=AFAM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
<= br class=3D"Apple-interchange-newline">
Hi Pedro,

Thank you for evaluating this = idea change from linked list to improve
performance= of the handle database.


T= he concept of using integers for an EFI_HANDLE has been considered before.<= /span>
One advantage over pointers is that a guarantee can= be made that an EFI_HANDLE
value can be guaranteed= to be unique.  In the current implementation with
EFI_HANDLE being a pointer to an allocated buffer, an EFI_HANDLE value= could
potentially be reused if an EFI_HANDLE is fr= eed and later allocated for a 
different EFI_HANDLE.

If you continue to explore this path I agree that= EFI_HANDLE value of 0 should
be reserved and never= used.  I also recommend that new EFI_HANDLE values are
always assigned a new unique value that freed EFI_HANDLE values a= re never reused.

The overa= ll linked list performance of the handle database has also been noted
before, but has rarely raised to the level where it impa= cts the overall boot
performance relative to other = optimization opportunities.  I look forward to
the performance data.  The PERF_X() macros are the right service to u= se.  On
x86 it is based on the time stamp coun= ter (TSC) which has very small granularity.
=

Mike,
=

We tracked this a while back with the PERF macros when = we had some performance issues running diagnostics on a system with 3,000+ = handles. The hot path was CoreValidateHandle(). I think the number of calls= to CoreValidateHandle() explodes if you have more handles so it is not jus= t the raw performance of CoreValidateHandle(), but also how many times it g= ets called. 

Thanks,

Andrew Fish

Mike


-----Original Message-----
From: devel@edk2.groups.io&nbs= p;<devel@edk2.groups.io> On Behalf Of Pedro Falcato
Sent: Friday, January 12, 2024 6:47 AM=
To: Laszlo Ersek <
lersek@redhat= .com>
Cc: devel@edk2.groups.io; nhi@os.amperecomputing.com;
ardb+tianocore@kernel.org; Andrew Fish <afish@apple.com>
Subject: Re: [edk2-devel] Mem= ory Attribute for depex section

On Fri, Jan 12, 2024 at 9:35=E2=80= =AFAM Laszlo Ersek <lersek@redhat.com> wrote:

On 1/12/24 03:10, Pedro Falcato wrote:
My idea was to make each handle an index - like a file descriptor -
A= FAIK there's no reason why it *needs* to be an actual pointer.
We'd allo= cate indices when creating a handle, and return that (casted to
VOID*).

Huh, sneaky.
=
I see two potential problems with this.

First, per C std, these = "pointers" would be invalid (they wouldn't
actually point to valid memor= y), so even just evaluating them (not
dereferencing them!) would invoke = undefined behavior. May or may not
matter in practice. With compilers ge= tting smarter about optimization
(and stricter about std conformance), t= here could be issues, perhaps.

This is true. Stashing r= andom integers in pointers is
implementation-defined. But it's also supe= r common. Win32 HANDLEs are
exactly this, just integers (stashed in VOID= *). The Linux kernel world
also has a bunch of fun tricks with stashing = flags in a pointer's
bottom bits, magic pointer values, etc. I severely = doubt we can run
into issues with this. EDK2 will not exactly run on the= C standard's
abstract machine anyway ;)


The other concern is a bit contrived, but I *guess* there could be co= de
out there that actually dereferences EFI_HANDLEs. That code would cra= sh.
May or may not matter, because such code is arguably already
sema= ntically invalid. (It would not necessarily be invalid at the
language l= evel -- cf. my previous paragraph --, because passing an
otherwise valid= EFI_HANDLE to CopyMem, for copying just 1 byte out of
the underlying op= aque data structure, would not violate the language.)

T= his is a feature, not a bug! :P

Seriously though, IHANDLE is not eve= n exposed in semi-public headers,
so any code that's derefing an EFI_HAN= DLE will need to do something
like

typedef struct {
 /* .= .. */
} IHANDLE;

EFI_HANDLE Handle =3D /* ... */;

IHANDLE = *HandleImpl =3D (IHANDLE *) Handle;

and I'm a strong believer in "pl= ay stupid games, win stupid prizes".
You can definitely make an argument= for "this should definitely crash"
instead of just "maybe crashing" (fo= r instance, platforms that still
map the NULL page (like OVMF!), or hand= les > 4096), so I'm inclined to
think that if we indeed go this route= , we should set one or two upper
bits (on 64-bit platforms!) to make han= dles non-canonical addresses
and therefore necessarily crash on derefere= nce.


I shoul= d note that I find it super hard to get a concrete idea on
performance f= or EFI firmware without adequate tooling - I meant to
write a standalone= flamegraph tool a few weeks back (even posted in
edk2-devel), but, as f= ar as I could tell, the architectural timer
protocol couldn't get me the= interrupt frame[1]. Until then, whether
any of this radix tree vs RB tr= ee vs flat array stuff really
matters... I find it hard to say.

[= 1] x86 has 3 timers (PIT, LAPIC timer, HPET) and performance
monitoring = interrupts, and I couldn't freely use any of them :^)

E= dk2 has some form of profiling already (see
"MdePkg/Include/Library/Perf= ormanceLib.h"). Usually one sees core code
peppered with PERF_CODE_BEGIN= and PERF_CODE_END macros. I *think* there
is something like a "display = performance" (dp) shell application too,
that can show the collected sta= ts. But I've never used these facilities.

The wiki seems to have two= related articles:

https://github.com/tianocore/tianocore.github.io/= wiki/Edk2-Performance-
Infrastructure

https://github.com/tianocore/tianocore.github.io/wiki/Perform= ancePkg

The former looks quite comprehensive, at a quick skim.

/me nods
I've seen those macros around, but I've never us= ed them.
In any case, this problem has piqued my interest, I'll see if I= can
find some free time this weekend to hack on a test benchmark and a = PoC
:)

--
Pedro








_._,_._,_

Groups.io Links:

=20 You receive all messages sent to this group. =20 =20

View/Reply Online (#113754) | =20 | Mute= This Topic | New Topic
Your Subscriptio= n | Contact Group Owner | Unsubscribe [rebecca@openfw.io]

_._,_._,_
--Apple-Mail=_15D30F19-45A1-477D-A389-D6B9C243A58D--