From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Andrew Fish"
Message-id: <0EECE39F-0B65-4BFE-8668-CB59448C578D@apple.com>
MIME-version: 1.0 (Mac OS X Mail 16.0 \(3731.600.7\))
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
Date: Thu, 18 May 2023 10:24:01 -0700
In-reply-to:
Cc: "Ni, Ray", Rebecca Cran
To: edk2-devel-groups-io, Mike Kinney
References: <7C9FD4BA-328C-4CFE-AF5A-3A795BB147E4@apple.com>
X-Mailer: Apple Mail (2.3731.600.7)
Content-type: multipart/alternative; boundary="Apple-Mail=_1EFF5097-C380-49E4-B80B-0CE55E7D170D"

--Apple-Mail=_1EFF5097-C380-49E4-B80B-0CE55E7D170D
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=utf-8

Mike,

I guess my other question is: if this turns out to be a compiler bug, should we scope the change to the broken toolchain? I’m not sure what the right answer is, but I want to ask the question.

Thanks,

Andrew Fish

> On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
>
> Andrew,
>
> This might work for XIP. Set a non-const global to the initial value that keeps the code in the dead loop.
>
> UINTN  mDeadLoopCount = 0;
>
> VOID
> CpuDeadLoop (
>   VOID
>   )
> {
>   while (mDeadLoopCount == 0) {
>     CpuPause ();
>   }
> }
>
> When the dead loop is entered, a developer cannot change the value of mDeadLoopCount, but they can use a debugger to force an exit from the loop and return from the function.
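
As a hosted illustration of the loop above (not EDK II code), here is a minimal sketch. It marks the global `volatile`, which the proposal does not; that is my assumption, on the theory that whole-program optimization might otherwise see that nothing writes the variable. `CpuPauseStub` and `DeadLoopSketch` are hypothetical names. The stub writes the variable on the third spin only so the sketch terminates, standing in for a debugger on a non-XIP build; under XIP the variable could not be written and the developer would move the instruction pointer past the loop instead.

```c
#include <stddef.h>

/* Stand-in for the proposed mDeadLoopCount; volatile is this sketch's
   assumption, the proposal above uses a plain non-const global. */
static volatile size_t  mDeadLoopCount = 0;

/* Counts how many times the loop body ran. */
static size_t  mSpinCount = 0;

/* Hypothetical replacement for CpuPause(): on the third spin it writes the
   variable, playing the role of a debugger that releases the loop. */
static void
CpuPauseStub (void)
{
  mSpinCount++;
  if (mSpinCount == 3) {
    mDeadLoopCount = 1;
  }
}

/* Same shape as the proposed CpuDeadLoop(); returns the spin count so the
   exit path is observable. */
size_t
DeadLoopSketch (void)
{
  while (mDeadLoopCount == 0) {
    CpuPauseStub ();
  }

  return mSpinCount;
}
```

The point of the sketch is only that the compare and the exit branch have to exist in the generated code for the loop to be resumable.
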
>
> Mike
>
> From: Andrew (EFI) Fish <afish@apple.com>
> Sent: Thursday, May 18, 2023 10:09 AM
> To: Kinney, Michael D <michael.d.kinney@intel.com>
> Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>
> Mike,
>
> Good point, that is why we are using the stack….
>
> The only other thing I can think of is to pass the address of Index to some inline assembler, or to an asm no-op function, to give it a side effect the compiler can’t resolve.
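
A minimal sketch of that idea, assuming a GCC or Clang style toolchain (MSVC does not accept inline assembly when targeting x64, so that toolchain would need the separate no-op asm function instead). The empty asm statement takes the address of `Index` as an input and declares a memory clobber, so the compiler should have to keep the compare and the exit branch. `SPIN_HOOK`, `DeadLoopWithEscape` and `ReleaseOnThirdCall` are hypothetical names; the hook exists only so this hosted sketch can terminate, standing in for a debugger writing `Index`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a debugger that edits Index. */
typedef void (*SPIN_HOOK)(volatile uintptr_t *Index);

uintptr_t
DeadLoopWithEscape (SPIN_HOOK Hook)
{
  volatile uintptr_t  Index;

  for (Index = 0; Index == 0;) {
    /* Empty asm template: the address of Index is passed as an input and
       memory is marked clobbered, so the compiler cannot assume Index is
       unchanged across this statement. */
    __asm__ __volatile__ ("" : : "r" (&Index) : "memory");
    if (Hook != NULL) {
      Hook (&Index);
    }
  }

  return Index;
}

/* Example hook: releases the loop on its third call. */
void
ReleaseOnThirdCall (volatile uintptr_t *Index)
{
  static int  Calls = 0;

  if (++Calls == 3) {
    *Index = 1;
  }
}
```

Called with a NULL hook, the function spins until something outside the program changes `Index`, which is the dead-loop case.
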
 
> Thanks,
>
> Andrew Fish
>
> On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:
>
> Static global will not work for XIP
>
> Mike
>
> From: Andrew (EFI) Fish <afish@apple.com>
> Sent: Thursday, May 18, 2023 9:49 AM
> To: edk2-devel-groups-io <devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com>
> Cc: Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>
> Mike,
>
> I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. It seems to be trending toward a compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet.
>
> If we move Index to a static global, that would likely work around the compiler issue.
>
> Thanks,
>
> Andrew Fish
>


> On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
>
> Hi Ray,
>
> So the generated code does dead-loop, but it is just not as easy to resume from as it has been in the past.
>
> We use CpuDeadLoop() for two purposes. One is a terminal condition with no reason to ever continue.
>
> The second is a debug aid that lets developers halt the system at a specific location and then continue from that point, usually with a debugger, stepping through code to an area where they can evaluate unexpected behavior.
>
> We may have to do a NASM implementation of CpuDeadLoop() to make sure it meets both use cases.
>
> Mike
>
> From: Ni, Ray <ray.ni@intel.com>
> Sent: Thursday, May 18, 2023 3:00 AM
> To: devel@edk2.groups.io
> Cc: Kinney, Michael D <michael.d.kinney@intel.com>; Rebecca Cran <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
> Subject: CpuDeadLoop() is optimized by compiler
>
> Hi,
>
> Starting from a certain version of the Visual Studio C compiler (I don’t have the exact version; I am using VS2019), CpuDeadLoop is now optimized quite well by the compiler.
>
> The optimization is so “good” that it becomes harder for developers to break out of the dead loop.
>
> I copied the assembly instructions below for your reference.
> The compiler does not generate instructions that jump out of the loop when Index is not zero.
> So in order to break out of the loop, developers need to:
>   1. Manually adjust rsp by adding 40
>   2. Manually “ret”
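
The arithmetic behind those two steps, as a small sketch read off the listing that follows in this mail: the prologue is `sub rsp, 40`, so adding 40 (0x28) puts rsp back on the return address, and a 64-bit `ret` then pops 8 more bytes. The helper names are hypothetical and the rsp value in the usage note is made up.

```c
#include <stdint.h>

/* Bytes reserved by the prologue "sub rsp, 40" in the listing. */
#define DEAD_LOOP_FRAME_BYTES  40u

/* Size of the return address popped by a 64-bit "ret". */
#define RETURN_ADDRESS_BYTES   8u

/* Step 1: rsp after manually undoing the prologue; it now points at the
   return address pushed by the caller's "call CpuDeadLoop". */
uint64_t
RspAfterManualAdjust (uint64_t RspInsideLoop)
{
  return RspInsideLoop + DEAD_LOOP_FRAME_BYTES;
}

/* Step 2: rsp after the manual "ret" has popped the return address. */
uint64_t
RspAfterManualRet (uint64_t RspInsideLoop)
{
  return RspAfterManualAdjust (RspInsideLoop) + RETURN_ADDRESS_BYTES;
}
```

For example, with rsp at 0x1000 inside the loop, step 1 gives 0x1028 and step 2 leaves 0x1030.
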
 
> I am not sure if anyone is interested in rewriting this function so that the compiler can be “fooled” again.
>
> Thanks,
> Ray
>
 
> =======================
> ; Function compile flags: /Ogspy
> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
> ;              COMDAT CpuDeadLoop
> _TEXT    SEGMENT
> Index$ = 48
> CpuDeadLoop PROC                                    ; COMDAT
>
> ; 26   : {
>
> $LN12:
>   00000  48 83 ec 28        sub   rsp, 40           ; 00000028H
>
> ; 27   :   volatile UINTN  Index;
> ; 28   :
> ; 29   :   for (Index = 0; Index == 0;) {
>
>   00004  48 c7 44 24 30
>          00 00 00 00        mov   QWORD PTR Index$[rsp], 0
> $LN10@CpuDeadLoo:
>
> ; 30   :     CpuPause ();
>
>   0000d  48 8b 44 24 30     mov   rax, QWORD PTR Index$[rsp]
>   00012  e8 00 00 00 00     call  CpuPause
>   00017  eb f4              jmp   SHORT $LN10@CpuDeadLoo
> CpuDeadLoop ENDP
> _TEXT    ENDS
> END
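
To double check that the loop never tests Index, the branch at offset 00017 can be decoded by hand: `eb f4` is a short jump whose 8-bit signed displacement is relative to the next instruction. A small sketch of that calculation (the function name is hypothetical):

```c
#include <stdint.h>

/* Target of a two-byte x86 short jump (opcode EB, rel8): the displacement
   is sign-extended and added to the offset of the following instruction. */
uint32_t
ShortJumpTarget (uint32_t JumpOffset, uint8_t Rel8)
{
  int32_t  Displacement;

  Displacement = (Rel8 >= 0x80) ? (int32_t)Rel8 - 256 : (int32_t)Rel8;
  return (uint32_t)((int32_t)JumpOffset + 2 + Displacement);
}
```

With the bytes from the listing, `ShortJumpTarget (0x17, 0xf4)` gives 0x0d, which is the `mov rax, QWORD PTR Index$[rsp]` at `$LN10@CpuDeadLoo`. So the loop reloads Index on every pass, but there is no compare and no conditional branch, which matches the observation that nothing jumps out when Index is not zero.
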
 
 

--Apple-Mail=_1EFF5097-C380-49E4-B80B-0CE55E7D170D--