From: "Andrew Fish"
To: devel@edk2.groups.io, rebecca@bsdio.com
Cc: Mike Kinney, Ray Ni
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
Date: Thu, 18 May 2023 11:21:42 -0700

Rebecca,

It looks like VC++ is trying to honor the volatile by reading the variable, in case that has side effects. But the loop is not checking the value of the variable; it is just doing an unconditional jump. This is why I think it is likely a compiler bug. Since the compiler emitted a hard-coded jmp in the loop, it optimized out the return instruction…

$LN10@CpuDeadLoo:
    mov     rax, QWORD PTR Index$[rsp]
    call    CpuPause
    jmp     SHORT $LN10@CpuDeadLoo
…

So changing the variable does not break you out of the loop. If you do pc += 2 when you are at the jmp instruction, that will not return you from CpuDeadLoop(); it will just fall into the next function. That might work if CpuDeadLoop() was inlined, but if it was a call you would start running the next function in the binary.
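To make the offsets concrete, here is a small C sketch of the arithmetic, using the bytes from Ray's listing (the jmp sits at offset 0x17 and is encoded as eb f4). The function names are made up for illustration; nothing here is part of BaseLib.

```c
#include <stdint.h>

/* Offset of the byte that follows an instruction, i.e. where "pc += length" lands. */
uint32_t NextOffset(uint32_t offset, uint32_t length) {
  return offset + length;
}

/* Target of an x86 short jump (opcode EB followed by one signed displacement byte). */
uint32_t ShortJmpTarget(uint32_t offset, uint8_t disp8) {
  /* Sign-extend the displacement byte by hand. */
  int32_t disp = (disp8 >= 0x80) ? (int32_t)disp8 - 0x100 : (int32_t)disp8;

  /* The displacement is relative to the end of the 2-byte instruction. */
  return (uint32_t)((int32_t)NextOffset(offset, 2) + disp);
}
```

With the listing's values, NextOffset(0x17, 2) is 0x19, the first byte after CpuDeadLoop(), and ShortJmpTarget(0x17, 0xF4) is 0x0d, the mov at $LN10. There is no ret anywhere in between.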

Thanks,

Andrew Fish

On May 18, 2023, at 10:36 AM, Rebecca Cran <rebecca@bsdio.com> wrote:

When I use CpuDeadLoop for debugging on AArch64 I have symbols loaded, so I can just do ‘set Index=1’ and resume, but it sounds like the issue is that people sometimes want to debug without symbols/source, and the generated assembly is making that difficult.

Rebecca

On Thu, May 18, 2023, at 9:36 AM, Michael D Kinney wrote:
Hi Ray,

So the code generated does deadloop, but is just not easy to resume
from as we have been able to do in the past.

We use CpuDeadLoop() for 2 purposes. One is a terminal condition with
no reason to ever continue.

The 2nd is a debug aid for developers to halt the system at a specific
location and then continue from that point, usually with a debugger, to
step through code to an area to evaluate unexpected behavior.

We may have to do a NASM implementation of CpuDeadLoop() to make sure
it meets both use cases.
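Short of NASM, one C-level idea that may be worth checking is to let the address of Index escape to the pause routine, which gives the compiler an additional reason to re-read and compare the value on every pass. This is only a sketch under that assumption: DeadLoopSketch, PAUSE_HOOK and ReleaseOnThirdCall are hypothetical names, the hook stands in for CpuPause() plus a debugger writing Index, and whether a particular VS2019 build keeps the compare would still have to be confirmed in the generated listing.

```c
#include <stdint.h>

/* Hypothetical hook type: stands in for CpuPause() plus a debugger's write. */
typedef void (*PAUSE_HOOK)(volatile uintptr_t *Index);

/* Spin until something makes Index non-zero; returns the number of passes. */
unsigned DeadLoopSketch(PAUSE_HOOK Pause) {
  volatile uintptr_t Index = 0;
  unsigned Passes = 0;

  /* The exit test reads Index on every pass. */
  while (Index == 0) {
    Pause(&Index);  /* the address of Index escapes here */
    Passes++;
  }
  return Passes;
}

/* Test hook (assumption): releases the loop on its third call. */
static unsigned gCalls = 0;

void ReleaseOnThirdCall(volatile uintptr_t *Index) {
  if (++gCalls == 3) {
    *Index = 1;
  }
}
```

Calling DeadLoopSketch(ReleaseOnThirdCall) returns after three passes. In a real dead loop no hook would ever write Index, so only a debugger could release it, which covers both the terminal and the debug use case.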

Mike

*From:* Ni, Ray <ray.ni@intel.com>
*Sent:* Thursday, May 18, 2023 3:00 AM
*To:* devel@edk2.groups.io
*Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Rebecca Cran <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
*Subject:* CpuDeadLoop() is optimized by compiler

Hi,
Starting from a certain version of the Visual Studio C compiler (I don’t have
the exact version; I am using VS2019), CpuDeadLoop() is now optimized
quite well by the compiler.

The optimization is so “good” that it becomes harder for developers to
break out of the deadloop.

I copied the assembly instructions below for your reference.
The compiler does not generate instructions that jump out of the loop
when Index is not zero.
So in order to break out of the loop, developers need to:
1. Manually adjust rsp by adding 40 (undoing the sub rsp, 40 in the prologue)
2. Manually “ret”

I am not sure if anyone is interested in rewriting this function so that
the compiler can be “fooled” again.
Thanks,
Ray

=======================
; Function compile flags: /Ogspy
; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
;              COMDAT CpuDeadLoop
_TEXT    SEGMENT
Index$ = 48
CpuDeadLoop PROC                                        ; COMDAT

; 26   : {

$LN12:
  00000  48 83 ec 28              sub     rsp, 40         ; 00000028H

; 27   :   volatile UINTN  Index;
; 28   :
; 29   :   for (Index = 0; Index == 0;) {

  00004  48 c7 44 24 30
         00 00 00 00              mov     QWORD PTR Index$[rsp], 0
$LN10@CpuDeadLoo:

; 30   :     CpuPause ();

  0000d  48 8b 44 24 30           mov     rax, QWORD PTR Index$[rsp]
  00012  e8 00 00 00 00           call    CpuPause
  00017  eb f4                    jmp     SHORT $LN10@CpuDeadLoo
CpuDeadLoop ENDP
_TEXT    ENDS
END





