public inbox for devel@edk2.groups.io
* CpuDeadLoop() is optimized by compiler
@ 2023-05-18  9:59 Ni, Ray
  2023-05-18 13:19 ` [edk2-devel] " Pedro Falcato
  2023-05-18 15:36 ` Michael D Kinney
  0 siblings, 2 replies; 27+ messages in thread
From: Ni, Ray @ 2023-05-18  9:59 UTC (permalink / raw)
  To: devel@edk2.groups.io; +Cc: Kinney, Michael D, Rebecca Cran, Ni, Ray

Hi,
Starting with some version of the Visual Studio C compiler (I don't know the exact version; I am using VS2019), CpuDeadLoop() is now optimized quite well by the compiler.

The optimization is so "good" that it becomes harder for developers to break out of the dead loop.

I have copied the generated assembly below for reference.
The compiler still reads Index on each iteration, but it does not generate any instructions that jump out of the loop when Index is non-zero.
So in order to break out of the loop, developers need to:

  1.  Manually increase rsp by 40
  2.  Manually execute "ret"

I am not sure whether anyone is interested in rewriting this function so that the compiler can be "fooled" again.
Thanks,
Ray

=======================
; Function compile flags: /Ogspy
; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
;       COMDAT CpuDeadLoop
_TEXT   SEGMENT
Index$ = 48
CpuDeadLoop PROC                                ; COMDAT

; 26   : {

$LN12:
  00000  48 83 ec 28                 sub   rsp, 40                     ; 00000028H

; 27   :   volatile UINTN  Index;
; 28   :
; 29   :   for (Index = 0; Index == 0;) {

  00004  48 c7 44 24 30 00 00 00 00  mov   QWORD PTR Index$[rsp], 0
$LN10@CpuDeadLoo:

; 30   :     CpuPause ();

  0000d  48 8b 44 24 30              mov   rax, QWORD PTR Index$[rsp]
  00012  e8 00 00 00 00              call  CpuPause
  00017  eb f4                       jmp   SHORT $LN10@CpuDeadLoo
CpuDeadLoop ENDP
_TEXT   ENDS
END
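
The two manual steps are plain stack arithmetic on the numbers in the listing. Below is a minimal sketch of that arithmetic, assuming the debugger has stopped inside CpuDeadLoop itself (not inside CpuPause) and assuming the usual Microsoft x64 frame layout. The helper names are made up for illustration and are not part of BaseLib.

```c
#include <stdint.h>

enum {
  FRAME_BYTES  = 40,  /* "sub rsp, 40" in the prologue                  */
  RETURN_SLOT  = 8,   /* size of the return address popped by "ret"     */
  INDEX_OFFSET = 48   /* "Index$ = 48": Index lives at rsp + 48         */
};

/* Step 1: undo the prologue, so that rsp points at the return address. */
uint64_t
RspAfterStep1 (uint64_t RspInLoop)
{
  return RspInLoop + FRAME_BYTES;
}

/* Step 2: "ret" pops the return address and resumes in the caller. */
uint64_t
RspAfterStep2 (uint64_t RspInLoop)
{
  return RspAfterStep1 (RspInLoop) + RETURN_SLOT;
}

/* Address of the local variable Index while the loop is spinning. */
uint64_t
IndexAddress (uint64_t RspInLoop)
{
  return RspInLoop + INDEX_OFFSET;
}
```

With these numbers Index sits directly above the return address. Writing a non-zero value there from the debugger does not help, because the generated loop loads the value into rax but never tests it.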



* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18  9:59 CpuDeadLoop() is optimized by compiler Ni, Ray
@ 2023-05-18 13:19 ` Pedro Falcato
  2023-05-18 15:36 ` Michael D Kinney
  1 sibling, 0 replies; 27+ messages in thread
From: Pedro Falcato @ 2023-05-18 13:19 UTC (permalink / raw)
  To: devel, ray.ni; +Cc: Kinney, Michael D, Rebecca Cran

Hi Ray,

Can you try something like this? https://godbolt.org/z/x7P1PqY59

Seems to work, but godbolt does not support MSVC LTO :/

-- 
Pedro

* Re: CpuDeadLoop() is optimized by compiler
  2023-05-18  9:59 CpuDeadLoop() is optimized by compiler Ni, Ray
  2023-05-18 13:19 ` [edk2-devel] " Pedro Falcato
@ 2023-05-18 15:36 ` Michael D Kinney
  2023-05-18 16:49   ` [edk2-devel] " Andrew Fish
  2023-05-18 17:36   ` Rebecca Cran
  1 sibling, 2 replies; 27+ messages in thread
From: Michael D Kinney @ 2023-05-18 15:36 UTC (permalink / raw)
  To: Ni, Ray, devel@edk2.groups.io; +Cc: Rebecca Cran, Kinney, Michael D

Hi Ray,

So the generated code does dead-loop; it is just not as easy to resume from as it has been in the past.

We use CpuDeadLoop() for two purposes.  The first is a terminal condition with no reason to ever continue.

The second is as a debug aid that lets developers halt the system at a specific location and then continue from that point, usually with a debugger, stepping through code to an area where unexpected behavior needs to be evaluated.

We may need a NASM implementation of CpuDeadLoop() to make sure it meets both use cases.

Mike


* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 15:36 ` Michael D Kinney
@ 2023-05-18 16:49   ` Andrew Fish
  2023-05-18 17:05     ` Michael D Kinney
  2023-05-18 17:36   ` Rebecca Cran
  1 sibling, 1 reply; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 16:49 UTC (permalink / raw)
  To: edk2-devel-groups-io, Mike Kinney; +Cc: Ni, Ray, Rebecca Cran

Mike,

I pinged some compiler experts to find out whether our code is correct or whether the compiler has an issue. It seems to be trending toward a compiler issue right now, but I've NOT gotten feedback from anyone on the spec committee yet.

If we move Index to a static global, that would likely work around the compiler issue.

Thanks,

Andrew Fish


* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 16:49   ` [edk2-devel] " Andrew Fish
@ 2023-05-18 17:05     ` Michael D Kinney
  2023-05-18 17:08       ` Andrew Fish
  0 siblings, 1 reply; 27+ messages in thread
From: Michael D Kinney @ 2023-05-18 17:05 UTC (permalink / raw)
  To: Andrew (EFI) Fish, edk2-devel-groups-io
  Cc: Ni, Ray, Rebecca Cran, Kinney, Michael D

A static global will not work for XIP (execute-in-place) modules, since their data cannot be written while they run from flash.

Mike


* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 17:05     ` Michael D Kinney
@ 2023-05-18 17:08       ` Andrew Fish
  2023-05-18 17:19         ` Michael D Kinney
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 17:08 UTC (permalink / raw)
  To: Mike Kinney; +Cc: edk2-devel-groups-io, Ni, Ray, Rebecca Cran

Mike,

Good point, that is why we are using the stack...

The only other thing I can think of is to pass the address of Index to some inline assembly, or to an assembly no-op function, to give it a side effect the compiler can't resolve.
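
A rough sketch of the address-escape idea, assuming a GCC or Clang toolchain (MSVC has no inline assembly on x64, so there the helper would have to be a separate assembly function). EscapeAddress(), mEscaped and the CpuPause() stub are hypothetical names for illustration. The stub stands in for a debugger by writing to Index on its third call so the example terminates.

```c
#include <stdint.h>

typedef uintptr_t UINTN;           /* stand-in for the EDK2 base type */

static volatile UINTN  *mEscaped;  /* hypothetical: lets the stub below reach Index */
static unsigned        mPauseCalls;

/* Hypothetical helper: hands the address to an empty asm statement, so the
   compiler must assume something outside its view may read or write it. */
static void
EscapeAddress (volatile UINTN *Address)
{
  mEscaped = Address;
  __asm__ __volatile__ ("" : : "r" (Address) : "memory");
}

/* Stand-in for BaseLib's CpuPause(); the third call plays the role of a
   developer changing Index from the debugger. */
static void
CpuPause (void)
{
  if (++mPauseCalls == 3) {
    *mEscaped = 1;
  }
}

void
CpuDeadLoop (void)
{
  volatile UINTN  Index;

  EscapeAddress (&Index);
  for (Index = 0; Index == 0;) {
    CpuPause ();
  }
}
```

Index stays on the stack, so this keeps the property that made the original work for XIP. Whether it is enough to change what the affected MSVC versions generate is untested here.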

Thanks,

Andrew Fish


* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 17:08       ` Andrew Fish
@ 2023-05-18 17:19         ` Michael D Kinney
  2023-05-18 17:22           ` Andrew Fish
  2023-05-18 17:24           ` Andrew Fish
  0 siblings, 2 replies; 27+ messages in thread
From: Michael D Kinney @ 2023-05-18 17:19 UTC (permalink / raw)
  To: Andrew (EFI) Fish
  Cc: edk2-devel-groups-io, Ni, Ray, Rebecca Cran, Kinney, Michael D

Andrew,

This might work for XIP.  Set a non-const global to the initial value that keeps execution in the dead loop.

UINTN  mDeadLoopCount = 0;

VOID
CpuDeadLoop (
  VOID
  )
{
  while (mDeadLoopCount == 0) {
    CpuPause ();
  }
}

Once the dead loop is entered, a developer running XIP cannot change the value of mDeadLoopCount, but they can use a debugger to force an exit from the loop and return from the function.

Mike



* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 17:19         ` Michael D Kinney
@ 2023-05-18 17:22           ` Andrew Fish
  2023-05-18 17:24           ` Andrew Fish
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 17:22 UTC (permalink / raw)
  To: Mike Kinney; +Cc: edk2-devel-groups-io, Ni, Ray, Rebecca Cran

Yes, but I think you want static and volatile on the global. Still, it is a good idea, since in the non-XIP case you can just modify the global. It might also be a good idea to document the debugging flow in the function header for CpuDeadLoop().
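
Something along these lines, as a sketch only: Mike's mDeadLoopCount with static and volatile added, plus the debugging flow written into the function header. The UINTN typedef and the CpuPause() stub are stand-ins so the fragment compiles on its own. The stub sets the global on its third call to imitate a developer doing so from a debugger.

```c
#include <stdint.h>

typedef uintptr_t UINTN;   /* stand-in for the EDK2 base type */

/* static keeps the symbol private to this file; volatile forces a fresh
   load of the value on every loop test. */
static volatile UINTN  mDeadLoopCount = 0;

static unsigned  mPauseCalls = 0;

/* Stand-in for BaseLib's CpuPause(); the third call imitates the debugger. */
static void
CpuPause (void)
{
  if (++mPauseCalls == 3) {
    mDeadLoopCount = 1;
  }
}

/**
  Spins until mDeadLoopCount becomes non-zero.

  Debugging flow: when the image runs from RAM, set mDeadLoopCount to a
  non-zero value from the debugger and resume.  When the image executes in
  place from flash the variable cannot be written, so force the exit from
  the loop in the debugger instead.
**/
void
CpuDeadLoop (void)
{
  while (mDeadLoopCount == 0) {
    CpuPause ();
  }
}
```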

Thanks,

Andrew Fish


* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 17:19         ` Michael D Kinney
  2023-05-18 17:22           ` Andrew Fish
@ 2023-05-18 17:24           ` Andrew Fish
  2023-05-18 18:45             ` Andrew Fish
       [not found]             ` <17605136DCF3E084.26337@groups.io>
  1 sibling, 2 replies; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 17:24 UTC (permalink / raw)
  To: edk2-devel-groups-io, Mike Kinney; +Cc: Ni, Ray, Rebecca Cran

Mike,

I guess my other question is: if this turns out to be a compiler bug, should we scope the change to the broken toolchain? I'm not sure what the right answer is, but I want to ask the question.
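
Purely as an illustration of what scoping could look like, the workaround could be keyed on the compiler's predefined macro. _MSC_VER is defined by the Microsoft compiler; DEAD_LOOP_WORKAROUND and DeadLoopWorkaroundEnabled() are hypothetical names. No version threshold is used, because the first affected compiler version has not been identified in this thread.

```c
/* _MSC_VER is predefined by MSVC (and by MSVC-compatible front ends). */
#if defined (_MSC_VER)
  #define DEAD_LOOP_WORKAROUND  1
#else
  #define DEAD_LOOP_WORKAROUND  0
#endif

/* Reports whether the toolchain-specific code path was compiled in. */
int
DeadLoopWorkaroundEnabled (
  void
  )
{
  return DEAD_LOOP_WORKAROUND;
}
```

The trade-off is that other toolchains keep the existing code path untouched, at the cost of two variants to maintain and test.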

Thanks,

Andrew Fish

> So in order to break out of the loop, developers need to:
> Manually adjust rsp by increasing 40
> Manually “ret”
>  
> I am not sure if anyone has interest to re-write this function so that compiler can be “fooled” again.
> Thanks,
> Ray
>  
> =======================
> ; Function compile flags: /Ogspy
> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
> ;              COMDAT CpuDeadLoop
> _TEXT    SEGMENT
> Index$ = 48
> CpuDeadLoop PROC                                                                    ; COMDAT
>  
> ; 26   : {
>  
> $LN12:
>   00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H
>  
> ; 27   :   volatile UINTN  Index;
> ; 28   : 
> ; 29   :   for (Index = 0; Index == 0;) {
>  
>   00004  48 c7 44 24 30
>                00 00 00 00        mov      QWORD PTR Index$[rsp], 0
> $LN10@CpuDeadLoo:
>  
> ; 30   :     CpuPause ();
>  
>   0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
>   00012  e8 00 00 00 00   call        CpuPause
>   00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
> CpuDeadLoop ENDP
> _TEXT    ENDS
> END
>  
>  
>  
> 


[-- Attachment #2: Type: text/html, Size: 23188 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 15:36 ` Michael D Kinney
  2023-05-18 16:49   ` [edk2-devel] " Andrew Fish
@ 2023-05-18 17:36   ` Rebecca Cran
  2023-05-18 18:21     ` Andrew Fish
  1 sibling, 1 reply; 27+ messages in thread
From: Rebecca Cran @ 2023-05-18 17:36 UTC (permalink / raw)
  To: devel, Kinney, Michael D, Ni, Ray

When I use CpuDeadLoop for debugging on AArch64, I have symbols loaded, so I can just do ‘set Index=1’ and resume. It sounds like the issue is that people sometimes want to debug without symbols/source, and the generated assembly makes that difficult.

Rebecca

On Thu, May 18, 2023, at 9:36 AM, Michael D Kinney wrote:
> Hi Ray,
> 
> So the code generated does deadloop, but is just not easy to resume 
> from as we have been able to do in the past.
> 
> We use CpuDeadloop() for 2 purposes.  One is a terminal condition with 
> no reason to ever continue.
> 
> The 2nd is a debug aide for developers to halt the system at a specific 
> location and then continue from that point, usually with a debugger, to 
> step through code to an area to evaluate unexpected behavior.
> 
> We may have to do a NASM implementation of CpuDeadloop() to make sure 
> it meets both use cases.
> 
> Mike
> 
> *From:* Ni, Ray <ray.ni@intel.com> 
> *Sent:* Thursday, May 18, 2023 3:00 AM
> *To:* devel@edk2.groups.io
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Rebecca Cran 
> <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
> *Subject:* CpuDeadLoop() is optimized by compiler
> 
> Hi,
> Starting from certain version of Visual Studio C compiler (I don’t have 
> the exact version. I am using VS2019), CpuDeadLoop is now optimized 
> quite well by compiler.
> 
> The optimization is so “good” that it becomes harder for developers to 
> break out of the deadloop.
> 
> I copied the assembly instructions as below for your reference.
> The compiler does not generate instructions that jump out of the loop 
> when the Index is not zero.
> So in order to break out of the loop, developers need to:
>  1. Manually adjust rsp by increasing 40
>  2. Manually “ret”
> 
> I am not sure if anyone has interest to re-write this function so that 
> compiler can be “fooled” again.
> Thanks,
> Ray
> 
> =======================
> ; Function compile flags: /Ogspy
> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
> ;              COMDAT CpuDeadLoop
> _TEXT    SEGMENT
> Index$ = 48
> CpuDeadLoop PROC                                                        
>             ; COMDAT
> 
> ; 26   : {
> 
> $LN12:
>   00000  48 83 ec 28         sub        rsp, 40                         
>        ; 00000028H
> 
> ; 27   :   volatile UINTN  Index;
> ; 28   : 
> ; 29   :   for (Index = 0; Index == 0;) {
> 
>   00004  48 c7 44 24 30
>                00 00 00 00        mov      QWORD PTR Index$[rsp], 0
> $LN10@CpuDeadLoo:
> 
> ; 30   :     CpuPause ();
> 
>   0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
>   00012  e8 00 00 00 00   call        CpuPause
>   00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
> CpuDeadLoop ENDP
> _TEXT    ENDS
> END
> 
> 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 17:36   ` Rebecca Cran
@ 2023-05-18 18:21     ` Andrew Fish
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 18:21 UTC (permalink / raw)
  To: devel, rebecca; +Cc: Mike Kinney, Ni, Ray

[-- Attachment #1: Type: text/plain, Size: 3893 bytes --]

Rebecca,

It looks like VC++ is trying to honor the volatile by reading the variable, in case the read has side effects. But the loop never checks the value it read; it just does an unconditional jump. This is why I think it is likely a compiler bug. Since the compiler emitted a hard-coded jmp for the loop, it also optimized out the return instruction….

$LN10@CpuDeadLoo:
mov       rax, QWORD PTR Index$[rsp]
call      CpuPause
jmp       SHORT $LN10@CpuDeadLoo
….

So changing the variable does not break you out of the loop. If you do pc += 2 while stopped at the jmp instruction, that will not return you from CpuDeadLoop(); execution just falls through into whatever follows. That might work if CpuDeadLoop() was inlined, but if it was reached through a call you would start running the next function in the binary.
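
To make the expected behavior concrete, here is a minimal C sketch of a loop that re-reads a volatile object and branches on the value it read, which is the part missing from the VC++ output above. The names SpinWhileZero, PauseStub and MaxSpins are hypothetical and not part of BaseLib; PauseStub only stands in for CpuPause(), and the MaxSpins cap exists solely so the example terminates when run outside a debugger.

```c
#include <stddef.h>

/* Hypothetical stand-in for BaseLib's CpuPause(); does nothing here. */
static void PauseStub(void)
{
}

/*
 * Spin while *Flag reads as zero, giving up after MaxSpins iterations.
 * Flag points to a volatile object, so each evaluation of the loop
 * condition performs a fresh read, and the branch depends on that read.
 * Returns the number of iterations executed.
 */
size_t SpinWhileZero(volatile size_t *Flag, size_t MaxSpins)
{
  size_t Spins = 0;

  while (*Flag == 0 && Spins < MaxSpins) {
    PauseStub();
    Spins++;
  }

  return Spins;
}
```

With the flag left at zero the function runs until the cap; once the flag is non-zero (for example after a debugger write) it returns without spinning.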

Thanks,

Andrew Fish


> On May 18, 2023, at 10:36 AM, Rebecca Cran <rebecca@bsdio.com> wrote:
> 
> When I use CpuDeadLoop for debugging on Aarch64 I have symbols loaded so I can just do ‘set Index=1’ and resume, but it sounds like the issue is that people want to sometimes debug without symbols/source, and the generated assembly is making that difficult.
> 
> Rebecca
> 
> On Thu, May 18, 2023, at 9:36 AM, Michael D Kinney wrote:
>> Hi Ray,
>> 
>> So the code generated does deadloop, but is just not easy to resume 
>> from as we have been able to do in the past.
>> 
>> We use CpuDeadloop() for 2 purposes.  One is a terminal condition with 
>> no reason to ever continue.
>> 
>> The 2nd is a debug aide for developers to halt the system at a specific 
>> location and then continue from that point, usually with a debugger, to 
>> step through code to an area to evaluate unexpected behavior.
>> 
>> We may have to do a NASM implementation of CpuDeadloop() to make sure 
>> it meets both use cases.
>> 
>> Mike
>> 
>> *From:* Ni, Ray <ray.ni@intel.com> 
>> *Sent:* Thursday, May 18, 2023 3:00 AM
>> *To:* devel@edk2.groups.io
>> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Rebecca Cran 
>> <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
>> *Subject:* CpuDeadLoop() is optimized by compiler
>> 
>> Hi,
>> Starting from certain version of Visual Studio C compiler (I don’t have 
>> the exact version. I am using VS2019), CpuDeadLoop is now optimized 
>> quite well by compiler.
>> 
>> The optimization is so “good” that it becomes harder for developers to 
>> break out of the deadloop.
>> 
>> I copied the assembly instructions as below for your reference.
>> The compiler does not generate instructions that jump out of the loop 
>> when the Index is not zero.
>> So in order to break out of the loop, developers need to:
>> 1. Manually adjust rsp by increasing 40
>> 2. Manually “ret”
>> 
>> I am not sure if anyone has interest to re-write this function so that 
>> compiler can be “fooled” again.
>> Thanks,
>> Ray
>> 
>> =======================
>> ; Function compile flags: /Ogspy
>> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>> ;              COMDAT CpuDeadLoop
>> _TEXT    SEGMENT
>> Index$ = 48
>> CpuDeadLoop PROC                                                        
>>            ; COMDAT
>> 
>> ; 26   : {
>> 
>> $LN12:
>>  00000  48 83 ec 28         sub        rsp, 40                         
>>       ; 00000028H
>> 
>> ; 27   :   volatile UINTN  Index;
>> ; 28   : 
>> ; 29   :   for (Index = 0; Index == 0;) {
>> 
>>  00004  48 c7 44 24 30
>>               00 00 00 00        mov      QWORD PTR Index$[rsp], 0
>> $LN10@CpuDeadLoo:
>> 
>> ; 30   :     CpuPause ();
>> 
>>  0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
>>  00012  e8 00 00 00 00   call        CpuPause
>>  00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
>> CpuDeadLoop ENDP
>> _TEXT    ENDS
>> END
>> 
>> 
>> 
> 
> 
> 


[-- Attachment #2: Type: text/html, Size: 10243 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 17:24           ` Andrew Fish
@ 2023-05-18 18:45             ` Andrew Fish
       [not found]             ` <17605136DCF3E084.26337@groups.io>
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 18:45 UTC (permalink / raw)
  To: edk2-devel-groups-io, Mike Kinney; +Cc: Ni, Ray, Rebecca Cran

[-- Attachment #1: Type: text/plain, Size: 7232 bytes --]

Mike,

This is a good way to play around with fixes and to report bugs. You can see the assembly generated by different compilers with different flags.

https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA

Sorry, I’m traveling and in Cupertino with lots of meetings, so I did not have time to adjust the compiler flags….

Thanks,

Andrew Fish

> On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com> wrote:
> 
> Mike,
> 
> I guess my other question… If this turns out to be a compiler bug should we scope the change to the broken toolchain. I’m not sure what the right answer is for that, but I want to ask the question? 
> 
> Thanks,
> 
> Andrew Fish
> 
>> On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
>> 
>> Andrew,
>>  
>> This might work for XIP.  Set non const global to initial value that is expected value to stay in dead loop.
>>  
>> UINTN  mDeadLoopCount = 0;
>>  
>> VOID
>> CpuDeadLoop(
>>   VOID
>>   ) 
>> {
>>   while (mDeadLoopCount == 0) {
>>       CpuPause();
>>   }
>> }
>>  
>> When deadloop is entered, developer can not change value of mDeadLoopCount, but they can use debugger to force exit loop and return from function.
>>  
>> Mike
>>  
>>  
>> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
>> Sent: Thursday, May 18, 2023 10:09 AM
>> To: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
>> Cc: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>>  
>> Mike,
>>  
>> Good point, that is why we are using the stack ….
>>  
>> The only other thing I can think of is to pass the address of Index to some inline assembler, or an asm no op function, to give it a side effect the compiler can’t resolve. 
>>  
>> Thanks,
>>  
>> Andrew Fish
>> 
>> 
>> On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>>  
>> Static global will not work for XIP
>>  
>> Mike
>>  
>> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
>> Sent: Thursday, May 18, 2023 9:49 AM
>> To: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
>> Cc: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>>  
>> Mike,
>>  
>> I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. Seems to be trending compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet. 
>>  
>> If we move Index to a static global that would likely work around the compiler issue.
>>  
>> Thanks,
>>  
>> Andrew Fish
>> 
>> 
>> 
>> On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>>  
>> Hi Ray,
>>  
>> So the code generated does deadloop, but is just not easy to resume from as we have been able to do in the past.
>>  
>> We use CpuDeadloop() for 2 purposes.  One is a terminal condition with no reason to ever continue.
>>  
>> The 2nd is a debug aide for developers to halt the system at a specific location and then continue from that point, usually with a debugger, to step through code to an area to evaluate unexpected behavior.
>>  
>> We may have to do a NASM implementation of CpuDeadloop() to make sure it meets both use cases.
>>  
>> Mike
>>  
>> From: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>> 
>> Sent: Thursday, May 18, 2023 3:00 AM
>> To: devel@edk2.groups.io <mailto:devel@edk2.groups.io>
>> Cc: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>
>> Subject: CpuDeadLoop() is optimized by compiler
>>  
>> Hi,
>> Starting from certain version of Visual Studio C compiler (I don’t have the exact version. I am using VS2019), CpuDeadLoop is now optimized quite well by compiler.
>>  
>> The optimization is so “good” that it becomes harder for developers to break out of the deadloop.
>>  
>> I copied the assembly instructions as below for your reference.
>> The compiler does not generate instructions that jump out of the loop when the Index is not zero.
>> So in order to break out of the loop, developers need to:
>> Manually adjust rsp by increasing 40
>> Manually “ret”
>>  
>> I am not sure if anyone has interest to re-write this function so that compiler can be “fooled” again.
>> Thanks,
>> Ray
>>  
>> =======================
>> ; Function compile flags: /Ogspy
>> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>> ;              COMDAT CpuDeadLoop
>> _TEXT    SEGMENT
>> Index$ = 48
>> CpuDeadLoop PROC                                                                    ; COMDAT
>>  
>> ; 26   : {
>>  
>> $LN12:
>>   00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H
>>  
>> ; 27   :   volatile UINTN  Index;
>> ; 28   : 
>> ; 29   :   for (Index = 0; Index == 0;) {
>>  
>>   00004  48 c7 44 24 30
>>                00 00 00 00        mov      QWORD PTR Index$[rsp], 0
>> $LN10@CpuDeadLoo:
>>  
>> ; 30   :     CpuPause ();
>>  
>>   0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
>>   00012  e8 00 00 00 00   call        CpuPause
>>   00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
>> CpuDeadLoop ENDP
>> _TEXT    ENDS
>> END
>>  
>>  
>>  
>> 
> 


[-- Attachment #2.1: Type: text/html, Size: 33049 bytes --]


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
       [not found]             ` <17605136DCF3E084.26337@groups.io>
@ 2023-05-18 20:45               ` Andrew Fish
  2023-05-18 21:42                 ` Michael D Kinney
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Fish @ 2023-05-18 20:45 UTC (permalink / raw)
  To: edk2-devel-groups-io, Andrew Fish; +Cc: Mike Kinney, Ni, Ray, Rebecca Cran

[-- Attachment #1: Type: text/plain, Size: 8961 bytes --]

Whoops, wrong compiler. Here is an update; I added the flags so this one reproduces the issue.

https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA

Thanks,

Andrew Fish

> On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io <afish=apple.com@groups.io> wrote:
> 
> Mike,
> 
> This is a good way to play around with fixes, and to report bugs. You can see the assembler for different compilers with different flag. 
> 
> https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA
> 
> Sorry I’m traveling and in Cupertino with lots of meetings so I did not have time to adjust the compiler flags….
> 
> Thanks,
> 
> Andrew Fish
> 
>> On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com> wrote:
>> 
>> Mike,
>> 
>> I guess my other question… If this turns out to be a compiler bug should we scope the change to the broken toolchain. I’m not sure what the right answer is for that, but I want to ask the question? 
>> 
>> Thanks,
>> 
>> Andrew Fish
>> 
>>> On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
>>> 
>>> Andrew,
>>>  
>>> This might work for XIP.  Set non const global to initial value that is expected value to stay in dead loop.
>>>  
>>> UINTN  mDeadLoopCount = 0;
>>>  
>>> VOID
>>> CpuDeadLoop(
>>>   VOID
>>>   ) 
>>> {
>>>   while (mDeadLoopCount == 0) {
>>>       CpuPause();
>>>   }
>>> }
>>>  
>>> When deadloop is entered, developer can not change value of mDeadLoopCount, but they can use debugger to force exit loop and return from function.
>>>  
>>> Mike
>>>  
>>>  
>>> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
>>> Sent: Thursday, May 18, 2023 10:09 AM
>>> To: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
>>> Cc: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>>> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>>>  
>>> Mike,
>>>  
>>> Good point, that is why we are using the stack ….
>>>  
>>> The only other thing I can think of is to pass the address of Index to some inline assembler, or an asm no op function, to give it a side effect the compiler can’t resolve. 
>>>  
>>> Thanks,
>>>  
>>> Andrew Fish
>>> 
>>> 
>>> On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>>>  
>>> Static global will not work for XIP
>>>  
>>> Mike
>>>  
>>> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
>>> Sent: Thursday, May 18, 2023 9:49 AM
>>> To: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
>>> Cc: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>>> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>>>  
>>> Mike,
>>>  
>>> I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. Seems to be trending compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet. 
>>>  
>>> If we move Index to a static global that would likely work around the compiler issue.
>>>  
>>> Thanks,
>>>  
>>> Andrew Fish
>>> 
>>> 
>>> 
>>> On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>>>  
>>> Hi Ray,
>>>  
>>> So the code generated does deadloop, but is just not easy to resume from as we have been able to do in the past.
>>>  
>>> We use CpuDeadloop() for 2 purposes.  One is a terminal condition with no reason to ever continue.
>>>  
>>> The 2nd is a debug aide for developers to halt the system at a specific location and then continue from that point, usually with a debugger, to step through code to an area to evaluate unexpected behavior.
>>>  
>>> We may have to do a NASM implementation of CpuDeadloop() to make sure it meets both use cases.
>>>  
>>> Mike
>>>  
>>> From: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>> 
>>> Sent: Thursday, May 18, 2023 3:00 AM
>>> To: devel@edk2.groups.io <mailto:devel@edk2.groups.io>
>>> Cc: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>
>>> Subject: CpuDeadLoop() is optimized by compiler
>>>  
>>> Hi,
>>> Starting from certain version of Visual Studio C compiler (I don’t have the exact version. I am using VS2019), CpuDeadLoop is now optimized quite well by compiler.
>>>  
>>> The optimization is so “good” that it becomes harder for developers to break out of the deadloop.
>>>  
>>> I copied the assembly instructions as below for your reference.
>>> The compiler does not generate instructions that jump out of the loop when the Index is not zero.
>>> So in order to break out of the loop, developers need to:
>>> Manually adjust rsp by increasing 40
>>> Manually “ret”
>>>  
>>> I am not sure if anyone has interest to re-write this function so that compiler can be “fooled” again.
>>> Thanks,
>>> Ray
>>>  
>>> =======================
>>> ; Function compile flags: /Ogspy
>>> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>>> ;              COMDAT CpuDeadLoop
>>> _TEXT    SEGMENT
>>> Index$ = 48
>>> CpuDeadLoop PROC                                                                    ; COMDAT
>>>  
>>> ; 26   : {
>>>  
>>> $LN12:
>>>   00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H
>>>  
>>> ; 27   :   volatile UINTN  Index;
>>> ; 28   : 
>>> ; 29   :   for (Index = 0; Index == 0;) {
>>>  
>>>   00004  48 c7 44 24 30
>>>                00 00 00 00        mov      QWORD PTR Index$[rsp], 0
>>> $LN10@CpuDeadLoo:
>>>  
>>> ; 30   :     CpuPause ();
>>>  
>>>   0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
>>>   00012  e8 00 00 00 00   call        CpuPause
>>>   00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
>>> CpuDeadLoop ENDP
>>> _TEXT    ENDS
>>> END
>>>  
>>>  
>>>  
>> 
> 
> 


[-- Attachment #2.1: Type: text/html, Size: 42488 bytes --]


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 20:45               ` Andrew Fish
@ 2023-05-18 21:42                 ` Michael D Kinney
  2023-05-19  0:42                   ` Andrew Fish
  0 siblings, 1 reply; 27+ messages in thread
From: Michael D Kinney @ 2023-05-18 21:42 UTC (permalink / raw)
  To: Andrew (EFI) Fish, edk2-devel-groups-io
  Cc: Ni, Ray, Rebecca Cran, Kinney, Michael D


[-- Attachment #1.1: Type: text/plain, Size: 14871 bytes --]

Using that tool, the following fragment seems to generate the right code.  Volatile is required.  Static is optional.

static volatile int  mDeadLoopCount = 0;

void
CpuDeadLoop(
  void
  )
{
  while (mDeadLoopCount == 0);
}


GCC
===
CpuDeadLoop():
.L2:
        mov     eax, DWORD PTR mDeadLoopCount[rip]
        test    eax, eax
        je      .L2
        ret


CLANG
=====
CpuDeadLoop():                       # @CpuDeadLoop()
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
        je      .LBB0_1
        ret
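
For anyone who wants to try the idea outside firmware, a rough sketch follows. It keeps the same volatile file-scope flag as the fragment above and adds two hypothetical helpers, DeadLoopWouldSpin and ReleaseDeadLoop, which are not part of EDK II and exist only so the behavior can be exercised in a hosted C program. ReleaseDeadLoop stands in for a debugger writing to mDeadLoopCount; as noted earlier in the thread, that write is not available when the module runs XIP.

```c
/* Loop-control flag; volatile forces a fresh read on every test. */
static volatile int mDeadLoopCount = 0;

/* Hypothetical helper: nonzero while the loop condition still holds. */
int DeadLoopWouldSpin(void)
{
  return mDeadLoopCount == 0;
}

/* Hypothetical helper: stands in for a debugger writing the flag. */
void ReleaseDeadLoop(void)
{
  mDeadLoopCount = 1;
}

/* Same shape as the fragment above; returns once the flag is nonzero. */
void CpuDeadLoopSketch(void)
{
  while (mDeadLoopCount == 0) {
  }
}
```

Calling CpuDeadLoopSketch() before ReleaseDeadLoop() spins forever by design, so any hosted test has to release the flag first.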


Mike


From: Andrew (EFI) Fish <afish@apple.com>
Sent: Thursday, May 18, 2023 1:45 PM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish <afish@apple.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Whoops wrong compiler. Here is an update. I added the flags so this one reproduces the issue.

Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>

Thanks,

Andrew Fish


On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io <afish=apple.com@groups.io<mailto:afish=apple.com@groups.io>> wrote:

Mike,

This is a good way to play around with fixes, and to report bugs. You can see the assembly output from different compilers with different flags.

Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>

Sorry I’m traveling and in Cupertino with lots of meetings so I did not have time to adjust the compiler flags….

Thanks,

Andrew Fish


On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>> wrote:

Mike,

I guess my other question is: if this turns out to be a compiler bug, should we scope the change to the broken toolchain? I’m not sure what the right answer is, but I want to ask the question.

Thanks,

Andrew Fish


On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Andrew,

This might work for XIP.  Set a non-const global to the initial value that keeps execution in the dead loop.

UINTN  mDeadLoopCount = 0;

VOID
CpuDeadLoop(
  VOID
  )
{
  while (mDeadLoopCount == 0) {
      CpuPause();
  }
}

When the dead loop is entered, the developer cannot change the value of mDeadLoopCount, but they can use a debugger to force an exit from the loop and return from the function.

Mike


From: Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
Sent: Thursday, May 18, 2023 10:09 AM
To: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Cc: edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Good point, that is why we are using the stack ….

The only other thing I can think of is to pass the address of Index to some inline assembler, or an asm no op function, to give it a side effect the compiler can’t resolve.
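
A minimal sketch of that idea, assuming a GCC- or Clang-style toolchain (extended asm is not available in MSVC x64 builds, so it would not apply directly to the VS2019 case). The names EscapeToAsm and DeadLoopSketch, the Initial parameter, and the returned spin count are hypothetical and exist only so the sketch can be exercised without hanging; none of them are part of BaseLib.

```c
#include <stdint.h>

typedef uintptr_t UINTN;  /* stand-in for the EDK II UINTN type */

/*
 * Hypothetical helper: an empty asm statement that receives the address
 * of the loop variable and declares a "memory" clobber. The compiler must
 * then assume the asm may have modified the pointed-to object.
 */
static inline void EscapeToAsm(volatile UINTN *Address) {
  __asm__ __volatile__("" : : "r"(Address) : "memory");
}

/*
 * Spins while Index is zero and returns the number of iterations.
 * Initial is an assumption added for testing; a real dead loop would
 * always start from 0 and rely on a debugger to change Index.
 */
UINTN DeadLoopSketch(UINTN Initial) {
  volatile UINTN Index = Initial;
  UINTN Spins = 0;

  EscapeToAsm(&Index);
  while (Index == 0) {
    EscapeToAsm(&Index);  /* side effect the compiler cannot see through */
    Spins++;
  }
  return Spins;
}
```

Calling DeadLoopSketch(0) never returns unless something outside the program (such as a debugger) writes a nonzero value to Index; calling it with a nonzero value returns immediately.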

Thanks,

Andrew Fish



On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Static global will not work for XIP

Mike

From: Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
Sent: Thursday, May 18, 2023 9:49 AM
To: edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Cc: Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. It seems to be trending toward a compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet.

If we move Index to a static global that would likely work around the compiler issue.

Thanks,

Andrew Fish




On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Hi Ray,

So the code generated does deadloop, but is just not easy to resume from as we have been able to do in the past.

We use CpuDeadloop() for 2 purposes.  One is a terminal condition with no reason to ever continue.

The 2nd is a debug aid for developers to halt the system at a specific location and then continue from that point, usually with a debugger, to step through code to an area to evaluate unexpected behavior.

We may have to do a NASM implementation of CpuDeadloop() to make sure it meets both use cases.

Mike

From: Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
Sent: Thursday, May 18, 2023 3:00 AM
To: devel@edk2.groups.io<mailto:devel@edk2.groups.io>
Cc: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
Subject: CpuDeadLoop() is optimized by compiler

Hi,
Starting from a certain version of the Visual Studio C compiler (I don’t have the exact version; I am using VS2019), CpuDeadLoop is now optimized quite well by the compiler.

The optimization is so “good” that it becomes harder for developers to break out of the deadloop.

I copied the assembly instructions as below for your reference.
The compiler does not generate instructions that jump out of the loop when the Index is not zero.
So in order to break out of the loop, developers need to:

  1.  Manually adjust rsp by increasing 40
  2.  Manually “ret”

I am not sure if anyone is interested in rewriting this function so that the compiler can be “fooled” again.
Thanks,
Ray

=======================
; Function compile flags: /Ogspy
; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
;              COMDAT CpuDeadLoop
_TEXT    SEGMENT
Index$ = 48
CpuDeadLoop PROC                                                                    ; COMDAT

; 26   : {

$LN12:
  00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H

; 27   :   volatile UINTN  Index;
; 28   :
; 29   :   for (Index = 0; Index == 0;) {

  00004  48 c7 44 24 30
               00 00 00 00        mov      QWORD PTR Index$[rsp], 0
$LN10@CpuDeadLoo:

; 30   :     CpuPause ();

  0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
  00012  e8 00 00 00 00   call        CpuPause
  00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
CpuDeadLoop ENDP
_TEXT    ENDS
END








[-- Attachment #1.2: Type: text/html, Size: 43933 bytes --]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-18 21:42                 ` Michael D Kinney
@ 2023-05-19  0:42                   ` Andrew Fish
  2023-05-19  2:53                     ` Ni, Ray
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Fish @ 2023-05-19  0:42 UTC (permalink / raw)
  To: devel, Mike Kinney; +Cc: Ni, Ray, Rebecca Cran

[-- Attachment #1: Type: text/plain, Size: 15948 bytes --]

Mike,

Sorry, static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it while complaining about stuff to the compiler team on our internal clang Slack channel, as they use it to answer my questions.

Thanks,

Andrew Fish

> On May 18, 2023, at 2:42 PM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
> 
> Using that tool, the following fragment seems to generate the right code.  Volatile is required.  Static is optional.
>  
> static volatile int  mDeadLoopCount = 0;
>  
> void
> CpuDeadLoop(
>   void
>   )
> {
>   while (mDeadLoopCount == 0);
> }
>  
>  
> GCC
> ===
> CpuDeadLoop():
> .L2:
>         mov     eax, DWORD PTR mDeadLoopCount[rip]
>         test    eax, eax
>         je      .L2
>         ret
>  
>  
> CLANG
> =====
> CpuDeadLoop():                       # @CpuDeadLoop()
> .LBB0_1:                                # =>This Inner Loop Header: Depth=1
>         cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
>         je      .LBB0_1
>         ret
>  
>  
> Mike
>  
>  
> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
> Sent: Thursday, May 18, 2023 1:45 PM
> To: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Andrew Fish <afish@apple.com <mailto:afish@apple.com>>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>  
> Whoops wrong compiler. Here is an update. I added the flags so this one reproduces the issue.
>  
> Compiler Explorer <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>  
> Thanks,
>  
> Andrew Fish
> 
> 
> On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io <http://groups.io/><afish=apple.com@groups.io <mailto:afish=apple.com@groups.io>> wrote:
>  
> Mike,
>  
> This is a good way to play around with fixes, and to report bugs. You can see the assembly output from different compilers with different flags.
>  
> Compiler Explorer <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>  
> Sorry I’m traveling and in Cupertino with lots of meetings so I did not have time to adjust the compiler flags….
>  
> Thanks,
>  
> Andrew Fish
> 
> 
> On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> wrote:
>  
> Mike,
>  
> I guess my other question is: if this turns out to be a compiler bug, should we scope the change to the broken toolchain? I’m not sure what the right answer is, but I want to ask the question.
>  
> Thanks,
>  
> Andrew Fish
> 
> 
> On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>  
> Andrew,
>  
> This might work for XIP.  Set a non-const global to the initial value that keeps execution in the dead loop.
>  
> UINTN  mDeadLoopCount = 0;
>  
> VOID
> CpuDeadLoop(
>   VOID
>   ) 
> {
>   while (mDeadLoopCount == 0) {
>       CpuPause();
>   }
> }
>  
> When the dead loop is entered, the developer cannot change the value of mDeadLoopCount, but they can use a debugger to force an exit from the loop and return from the function.
>  
> Mike
>  
>  
> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
> Sent: Thursday, May 18, 2023 10:09 AM
> To: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
> Cc: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>  
> Mike,
>  
> Good point, that is why we are using the stack ….
>  
> The only other thing I can think of is to pass the address of Index to some inline assembler, or an asm no op function, to give it a side effect the compiler can’t resolve. 
>  
> Thanks,
>  
> Andrew Fish
> 
> 
> 
> On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>  
> Static global will not work for XIP
>  
> Mike
>  
> From: Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>> 
> Sent: Thursday, May 18, 2023 9:49 AM
> To: edk2-devel-groups-io <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
> Cc: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
> Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>  
> Mike,
>  
> I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. It seems to be trending toward a compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet.
>  
> If we move Index to a static global that would likely work around the compiler issue.
>  
> Thanks,
>  
> Andrew Fish
> 
> 
> 
> 
> On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>> wrote:
>  
> Hi Ray,
>  
> So the code generated does deadloop, but is just not easy to resume from as we have been able to do in the past.
>  
> We use CpuDeadloop() for 2 purposes.  One is a terminal condition with no reason to ever continue.
>  
> The 2nd is a debug aid for developers to halt the system at a specific location and then continue from that point, usually with a debugger, to step through code to an area to evaluate unexpected behavior.
>  
> We may have to do a NASM implementation of CpuDeadloop() to make sure it meets both use cases.
>  
> Mike
>  
> From: Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>> 
> Sent: Thursday, May 18, 2023 3:00 AM
> To: devel@edk2.groups.io <mailto:devel@edk2.groups.io>
> Cc: Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>; Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>
> Subject: CpuDeadLoop() is optimized by compiler
>  
> Hi,
> Starting from a certain version of the Visual Studio C compiler (I don’t have the exact version; I am using VS2019), CpuDeadLoop is now optimized quite well by the compiler.
>  
> The optimization is so “good” that it becomes harder for developers to break out of the deadloop.
>  
> I copied the assembly instructions as below for your reference.
> The compiler does not generate instructions that jump out of the loop when the Index is not zero.
> So in order to break out of the loop, developers need to:
>   1.  Manually adjust rsp by increasing 40
>   2.  Manually “ret”
>  
> I am not sure if anyone is interested in rewriting this function so that the compiler can be “fooled” again.
> Thanks,
> Ray
>  
> =======================
> ; Function compile flags: /Ogspy
> ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
> ;              COMDAT CpuDeadLoop
> _TEXT    SEGMENT
> Index$ = 48
> CpuDeadLoop PROC                                                                    ; COMDAT
>  
> ; 26   : {
>  
> $LN12:
>   00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H
>  
> ; 27   :   volatile UINTN  Index;
> ; 28   : 
> ; 29   :   for (Index = 0; Index == 0;) {
>  
>   00004  48 c7 44 24 30
>                00 00 00 00        mov      QWORD PTR Index$[rsp], 0
> $LN10@CpuDeadLoo:
>  
> ; 30   :     CpuPause ();
>  
>   0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
>   00012  e8 00 00 00 00   call        CpuPause
>   00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
> CpuDeadLoop ENDP
> _TEXT    ENDS
> END
>  
>  
>  
>  
>  
>  
> 


[-- Attachment #2: Type: text/html, Size: 56153 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-19  0:42                   ` Andrew Fish
@ 2023-05-19  2:53                     ` Ni, Ray
  2023-05-19  3:03                       ` Jeff Fan
                                         ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Ni, Ray @ 2023-05-19  2:53 UTC (permalink / raw)
  To: Andrew (EFI) Fish, devel@edk2.groups.io, Kinney, Michael D; +Cc: Rebecca Cran

[-- Attachment #1: Type: text/plain, Size: 16130 bytes --]

I think all the options we considered are workarounds. They might break again if the compiler gets “cleverer” in the future, unless some Cxx spec clearly guarantees the behavior.

I like Mike’s idea of using an assembly implementation for CpuDeadLoop. The assembly can simply be “jmp $” followed by “ret”.

I didn’t find a dead-loop intrinsic function in MSVC.
Any better ideas?
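
A rough sketch of what such a routine might look like, written here as file-scope assembly inside a C file so it can be tried with GCC or Clang on x86-64 ELF targets. This is an assumption for illustration only: an actual EDK II change would more likely be a separate NASM file, and the names AsmDeadLoopSketch and SketchStartsWithJmpSelf are hypothetical. The local label "1:" with "jmp 1b" is the GNU assembler spelling of NASM's "jmp $".

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__x86_64__) && defined(__ELF__)
/* File-scope assembly: the C optimizer never sees the loop body. */
__asm__ (
  ".text\n"
  ".globl AsmDeadLoopSketch\n"
  ".type AsmDeadLoopSketch, @function\n"
  "AsmDeadLoopSketch:\n"
  "1:  jmp 1b\n"   /* jump to self: the two-byte encoding EB FE */
  "    ret\n"      /* reached only if a debugger moves RIP past the jmp */
);
#define ASM_SKETCH_AVAILABLE 1
#else
#define ASM_SKETCH_AVAILABLE 0
#endif

void AsmDeadLoopSketch(void);

#if !ASM_SKETCH_AVAILABLE
/* Fallback for other targets: a plain C infinite loop (no resume path). */
void AsmDeadLoopSketch(void) { for (;;) { } }
#endif

/*
 * Hypothetical check: inspects the first bytes of the routine instead of
 * calling it (calling it would hang). Returns 1 when the code starts with
 * "jmp $" followed by "ret", or when the assembly variant is not built.
 */
int SketchStartsWithJmpSelf(void) {
#if ASM_SKETCH_AVAILABLE
  const unsigned char *Code =
      (const unsigned char *)(uintptr_t)AsmDeadLoopSketch;
  return Code[0] == 0xEB && Code[1] == 0xFE && Code[2] == 0xC3;
#else
  return 1;
#endif
}
```

With this layout, resuming from a debugger would amount to advancing the instruction pointer by two bytes so that execution lands on the ret; no stack adjustment is needed because the routine never touches rsp.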

Thanks,
Ray

From: Andrew (EFI) Fish <afish@apple.com>
Sent: Friday, May 19, 2023 8:42 AM
To: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
Cc: Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Sorry, static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it while complaining about stuff to the compiler team on our internal clang Slack channel, as they use it to answer my questions.

Thanks,

Andrew Fish


On May 18, 2023, at 2:42 PM, Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Using that tool, the following fragment seems to generate the right code.  Volatile is required.  Static is optional.

static volatile int  mDeadLoopCount = 0;

void
CpuDeadLoop(
  void
  )
{
  while (mDeadLoopCount == 0);
}


GCC
===
CpuDeadLoop():
.L2:
        mov     eax, DWORD PTR mDeadLoopCount[rip]
        test    eax, eax
        je      .L2
        ret


CLANG
=====
CpuDeadLoop():                       # @CpuDeadLoop()
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
        je      .LBB0_1
        ret


Mike


From: Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
Sent: Thursday, May 18, 2023 1:45 PM
To: edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Andrew Fish <afish@apple.com<mailto:afish@apple.com>>
Cc: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Whoops wrong compiler. Here is an update. I added the flags so this one reproduces the issue.

Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>

Thanks,

Andrew Fish



On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io<http://groups.io/><afish=apple.com@groups.io<mailto:afish=apple.com@groups.io>> wrote:

Mike,

This is a good way to play around with fixes, and to report bugs. You can see the assembler output for different compilers with different flags.

Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>

Sorry I’m traveling and in Cupertino with lots of meetings so I did not have time to adjust the compiler flags….

Thanks,

Andrew Fish



On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>> wrote:

Mike,

I guess my other question… If this turns out to be a compiler bug, should we scope the change to the broken toolchain? I’m not sure what the right answer is, but I want to ask the question.

Thanks,

Andrew Fish



On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Andrew,

This might work for XIP.  Set a non-const global to the initial value that keeps execution in the dead loop.

UINTN  mDeadLoopCount = 0;

VOID
CpuDeadLoop(
  VOID
  )
{
  while (mDeadLoopCount == 0) {
      CpuPause();
  }
}

Once the dead loop is entered, the developer cannot change the value of mDeadLoopCount, but they can use a debugger to force an exit from the loop and return from the function.

Mike


From: Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
Sent: Thursday, May 18, 2023 10:09 AM
To: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Cc: edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Good point, that is why we are using the stack ….

The only other thing I can think of is to pass the address of Index to some inline assembler, or an asm no op function, to give it a side effect the compiler can’t resolve.
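
For reference, a minimal sketch of that idea, assuming a GCC- or Clang-style toolchain with extended inline assembly. MSVC x64 does not accept this syntax, so it does not directly cover the VS2019 case reported here, and nobody in this thread has confirmed that it changes the generated code. The helper name OpaqueTouch and the Initial parameter are hypothetical; they exist only so the sketch can be exercised without hanging.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of BaseLib): an empty asm statement that
 * receives the address of the loop variable. The "memory" clobber tells the
 * compiler that the asm may read or write memory, so it cannot assume the
 * variable is unchanged afterwards.
 */
static inline void
OpaqueTouch (
  volatile uintptr_t  *Address
  )
{
  __asm__ __volatile__ ("" : : "r" (Address) : "memory");
}

/*
 * Sketch of the loop. Initial is an assumption added for illustration;
 * the real CpuDeadLoop() takes no arguments and always starts at 0.
 */
void
CpuDeadLoopSketch (
  uintptr_t  Initial
  )
{
  volatile uintptr_t  Index;

  for (Index = Initial; Index == 0;) {
    OpaqueTouch (&Index);
  }
}
```

Calling CpuDeadLoopSketch (1) returns immediately, which stands in for a debugger writing a nonzero value to Index; calling it with 0 spins forever.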

Thanks,

Andrew Fish




On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Static global will not work for XIP

Mike

From: Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
Sent: Thursday, May 18, 2023 9:49 AM
To: edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Cc: Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. It seems to be trending toward a compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet.

If we move Index to a static global that would likely work around the compiler issue.

Thanks,

Andrew Fish





On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

Hi Ray,

So the code generated does deadloop, but is just not easy to resume from as we have been able to do in the past.

We use CpuDeadLoop() for 2 purposes.  One is a terminal condition with no reason to ever continue.

The 2nd is a debug aid for developers to halt the system at a specific location and then continue from that point, usually with a debugger, to step through code to an area to evaluate unexpected behavior.

We may have to do a NASM implementation of CpuDeadLoop() to make sure it meets both use cases.

Mike

From: Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
Sent: Thursday, May 18, 2023 3:00 AM
To: devel@edk2.groups.io<mailto:devel@edk2.groups.io>
Cc: Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
Subject: CpuDeadLoop() is optimized by compiler

Hi,
Starting from certain version of Visual Studio C compiler (I don’t have the exact version. I am using VS2019), CpuDeadLoop is now optimized quite well by compiler.

The optimization is so “good” that it becomes harder for developers to break out of the deadloop.

I copied the assembly instructions as below for your reference.
The compiler does not generate instructions that jump out of the loop when the Index is not zero.
So in order to break out of the loop, developers need to:

  1.  Manually adjust rsp by increasing 40
  2.  Manually “ret”

I am not sure if anyone has interest to re-write this function so that compiler can be “fooled” again.
Thanks,
Ray

=======================
; Function compile flags: /Ogspy
; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
;              COMDAT CpuDeadLoop
_TEXT    SEGMENT
Index$ = 48
CpuDeadLoop PROC                                                                    ; COMDAT

; 26   : {

$LN12:
  00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H

; 27   :   volatile UINTN  Index;
; 28   :
; 29   :   for (Index = 0; Index == 0;) {

  00004  48 c7 44 24 30
               00 00 00 00        mov      QWORD PTR Index$[rsp], 0
$LN10@CpuDeadLoo:

; 30   :     CpuPause ();

  0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
  00012  e8 00 00 00 00   call        CpuPause
  00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
CpuDeadLoop ENDP
_TEXT    ENDS
END









[-- Attachment #2: Type: text/html, Size: 51219 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-19  2:53                     ` Ni, Ray
@ 2023-05-19  3:03                       ` Jeff Fan
  2023-05-19 15:31                       ` Rebecca Cran
       [not found]                       ` <1760952DCE55DF8D.29365@groups.io>
  2 siblings, 0 replies; 27+ messages in thread
From: Jeff Fan @ 2023-05-19  3:03 UTC (permalink / raw)
  To: devel@edk2.groups.io, Ni, Ray, 'Andrew Fish',
	Kinney, Michael D
  Cc: Rebecca Cran

[-- Attachment #1: Type: text/plain, Size: 7796 bytes --]

Ray,

If you choose the assembly solution, you should make sure the stack adjustment follows the calling convention. Otherwise, it may break call stack tracing in some debugger tools.

Jeff


fanjianfeng@byosoft.com.cn
 
From: Ni, Ray
Date: 2023-05-19 10:53
To: Andrew (EFI) Fish; devel@edk2.groups.io; Kinney, Michael D
CC: Rebecca Cran
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
I think all the options we considered are workarounds. They might break again if a future compiler is even “cleverer”, unless some Cxx spec clearly guarantees the behavior.
 
I like Mike’s idea of using an assembly implementation for CpuDeadLoop. The assembly can simply be “jmp $” followed by “ret”.
 
I didn’t find a dead-loop intrinsic function in MSVC.
Any better ideas?
 
Thanks,
Ray
 
From: Andrew (EFI) Fish <afish@apple.com> 
Sent: Friday, May 19, 2023 8:42 AM
To: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
Cc: Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
 
Mike,
 
Sorry static was just to scope the name to the file since it is a lib, not to make it work.
 
That is a cool site. I learned about it complaining about stuff to the compiler team on our internal clang Slack channel as they use it to answer my questions.
 
Thanks,
 
Andrew Fish


On May 18, 2023, at 2:42 PM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
 
Using that tool, the following fragment seems to generate the right code.  Volatile is required.  Static is optional.
 
static volatile int  mDeadLoopCount = 0;
 
void
CpuDeadLoop(
  void
  )
{
  while (mDeadLoopCount == 0);
}
 
 
GCC
===
CpuDeadLoop():
.L2:
        mov     eax, DWORD PTR mDeadLoopCount[rip]
        test    eax, eax
        je      .L2
        ret
 
 
CLANG
=====
CpuDeadLoop():                       # @CpuDeadLoop()
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
        je      .LBB0_1
        ret
 
 
Mike
 
 
From: Andrew (EFI) Fish <afish@apple.com> 
Sent: Thursday, May 18, 2023 1:45 PM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish <afish@apple.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
 
Whoops wrong compiler. Here is an update. I added the flags so this one reproduces the issue.
 
Compiler Explorer
godbolt.org
 
Thanks,
 
Andrew Fish



On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io<afish=apple.com@groups.io> wrote:
 
Mike,
 
This is a good way to play around with fixes, and to report bugs. You can see the assembler output for different compilers with different flags.
 
Compiler Explorer
godbolt.org
 
Sorry I’m traveling and in Cupertino with lots of meetings so I did not have time to adjust the compiler flags….
 
Thanks,
 
Andrew Fish



On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com> wrote:
 
Mike,
 
I guess my other question… If this turns out to be a compiler bug, should we scope the change to the broken toolchain? I’m not sure what the right answer is, but I want to ask the question.
 
Thanks,
 
Andrew Fish



On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
 
Andrew,
 
This might work for XIP.  Set a non-const global to the initial value that keeps execution in the dead loop.
 
UINTN  mDeadLoopCount = 0;
 
VOID
CpuDeadLoop(
  VOID
  ) 
{
  while (mDeadLoopCount == 0) {
      CpuPause();
  }
}
 
Once the dead loop is entered, the developer cannot change the value of mDeadLoopCount, but they can use a debugger to force an exit from the loop and return from the function.
 
Mike
 
 
From: Andrew (EFI) Fish <afish@apple.com> 
Sent: Thursday, May 18, 2023 10:09 AM
To: Kinney, Michael D <michael.d.kinney@intel.com>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
 
Mike,
 
Good point, that is why we are using the stack ….
 
The only other thing I can think of is to pass the address of Index to some inline assembler, or an asm no op function, to give it a side effect the compiler can’t resolve. 
 
Thanks,
 
Andrew Fish




On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:
 
Static global will not work for XIP
 
Mike
 
From: Andrew (EFI) Fish <afish@apple.com> 
Sent: Thursday, May 18, 2023 9:49 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com>
Cc: Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
 
Mike,
 
I pinged some compiler experts to see if our code is correct, or if the compiler has an issue. It seems to be trending toward a compiler issue right now, but I’ve NOT gotten feedback from anyone on the spec committee yet.
 
If we move Index to a static global that would likely work around the compiler issue.
 
Thanks,
 
Andrew Fish





On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:
 
Hi Ray,
 
So the code generated does deadloop, but is just not easy to resume from as we have been able to do in the past.
 
We use CpuDeadLoop() for 2 purposes.  One is a terminal condition with no reason to ever continue.
 
The 2nd is a debug aid for developers to halt the system at a specific location and then continue from that point, usually with a debugger, to step through code to an area to evaluate unexpected behavior.
 
We may have to do a NASM implementation of CpuDeadLoop() to make sure it meets both use cases.
 
Mike
 
From: Ni, Ray <ray.ni@intel.com> 
Sent: Thursday, May 18, 2023 3:00 AM
To: devel@edk2.groups.io
Cc: Kinney, Michael D <michael.d.kinney@intel.com>; Rebecca Cran <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
Subject: CpuDeadLoop() is optimized by compiler
 
Hi,
Starting from certain version of Visual Studio C compiler (I don’t have the exact version. I am using VS2019), CpuDeadLoop is now optimized quite well by compiler.
 
The optimization is so “good” that it becomes harder for developers to break out of the deadloop.
 
I copied the assembly instructions as below for your reference.
The compiler does not generate instructions that jump out of the loop when the Index is not zero.
So in order to break out of the loop, developers need to:
  1.  Manually adjust rsp by increasing 40
  2.  Manually “ret”
 
I am not sure if anyone has interest to re-write this function so that compiler can be “fooled” again.
Thanks,
Ray
 
=======================
; Function compile flags: /Ogspy
; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
;              COMDAT CpuDeadLoop
_TEXT    SEGMENT
Index$ = 48
CpuDeadLoop PROC                                                                    ; COMDAT
 
; 26   : {
 
$LN12:
  00000  48 83 ec 28         sub        rsp, 40                                ; 00000028H
 
; 27   :   volatile UINTN  Index;
; 28   : 
; 29   :   for (Index = 0; Index == 0;) {
 
  00004  48 c7 44 24 30
               00 00 00 00        mov      QWORD PTR Index$[rsp], 0
$LN10@CpuDeadLoo:
 
; 30   :     CpuPause ();
 
  0000d  48 8b 44 24 30   mov      rax, QWORD PTR Index$[rsp]
  00012  e8 00 00 00 00   call        CpuPause
  00017  eb f4                     jmp       SHORT $LN10@CpuDeadLoo
CpuDeadLoop ENDP
_TEXT    ENDS
END
 
 
 
 
 
 
 


[-- Attachment #2: Type: text/html, Size: 68906 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-19  2:53                     ` Ni, Ray
  2023-05-19  3:03                       ` Jeff Fan
@ 2023-05-19 15:31                       ` Rebecca Cran
  2023-05-19 16:31                         ` Andrew Fish
       [not found]                       ` <1760952DCE55DF8D.29365@groups.io>
  2 siblings, 1 reply; 27+ messages in thread
From: Rebecca Cran @ 2023-05-19 15:31 UTC (permalink / raw)
  To: Ni, Ray, Andrew (EFI) Fish, devel@edk2.groups.io,
	Kinney, Michael D

Just to add more data, I also tried with "volatile sig_atomic_t" as 
someone suggested and both "/volatile:iso" and "/volatile:ms" with no 
change in results.
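
The exact source that was compiled for this test is not shown in the thread. A minimal sketch of what such a variant might look like, assuming it is the global-flag loop Mike posted with only the variable's type changed:

```c
#include <signal.h>

/*
 * Assumed variant: same loop as the global-flag version quoted below, but
 * the flag is declared as a volatile sig_atomic_t.
 */
static volatile sig_atomic_t  mDeadLoopCount = 0;

void
CpuDeadLoop (
  void
  )
{
  while (mDeadLoopCount == 0) {
    /* Spin until something (for example a debugger) stores a nonzero value. */
  }
}
```

"/volatile:iso" and "/volatile:ms" are MSVC command-line options that select how volatile accesses are treated, so they are passed to the compiler and do not appear in the source.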


-- 

Rebecca Cran


On 5/18/23 20:53, Ni, Ray wrote:
>
> I think all the options we considered are workarounds. They might
> break again if a future compiler is even “cleverer”, unless some Cxx
> spec clearly guarantees the behavior.
>
> I like Mike’s idea of using an assembly implementation for CpuDeadLoop.
> The assembly can simply be “jmp $” followed by “ret”.
>
> I didn’t find a dead-loop intrinsic function in MSVC.
>
> Any better idea?
>
> Thanks,
>
> Ray
>
> *From:* Andrew (EFI) Fish <afish@apple.com>
> *Sent:* Friday, May 19, 2023 8:42 AM
> *To:* devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
> *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>
> Mike,
>
> Sorry static was just to scope the name to the file since it is a lib, 
> not to make it work.
>
> That is a cool site. I learned about it complaining about stuff to the 
> compiler team on our internal clang Slack channel as they use it to 
> answer my questions.
>
> Thanks,
>
> Andrew Fish
>
>
>
>     On May 18, 2023, at 2:42 PM, Michael D Kinney
>     <michael.d.kinney@intel.com> wrote:
>
>     Using that tool, the following fragment seems to generate the
>     right code. Volatile is required.  Static is optional.
>
>     static volatile int  mDeadLoopCount = 0;
>
>     void
>     CpuDeadLoop(
>       void
>       )
>     {
>       while (mDeadLoopCount == 0);
>     }
>
>     GCC
>     ===
>     CpuDeadLoop():
>     .L2:
>             mov     eax, DWORD PTR mDeadLoopCount[rip]
>             test    eax, eax
>             je      .L2
>             ret
>
>     CLANG
>     =====
>     CpuDeadLoop():                       # @CpuDeadLoop()
>     .LBB0_1:                                # =>This Inner Loop Header: Depth=1
>             cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
>             je      .LBB0_1
>             ret
>
>     Mike
>
>     *From:*Andrew (EFI) Fish <afish@apple.com>
>     *Sent:*Thursday, May 18, 2023 1:45 PM
>     *To:*edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish
>     <afish@apple.com>
>     *Cc:*Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray
>     <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>     *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>
>     Whoops wrong compiler. Here is an update. I added the flags so
>     this one reproduces the issue.
>
>     Compiler Explorer
>     <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>
>     Thanks,
>
>     Andrew Fish
>
>
>
>
>         On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
>         <http://groups.io/><afish=apple.com@groups.io> wrote:
>
>         Mike,
>
>         This is a good way to play around with fixes, and to report
>         bugs. You can see the assembler for different compilers with
>         different flags.
>
>         Compiler Explorer
>         <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>
>         Sorry I’m traveling and in Cupertino with lots of meetings so
>         I did not have time to adjust the compiler flags….
>
>         Thanks,
>
>         Andrew Fish
>
>
>
>
>             On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
>             <afish@apple.com> wrote:
>
>             Mike,
>
>             I guess my other question… If this turns out to be a
>             compiler bug, should we scope the change to the broken
>             toolchain? I’m not sure what the right answer is, but I
>             want to ask the question.
>
>             Thanks,
>
>             Andrew Fish
>
>
>
>
>                 On May 18, 2023, at 10:19 AM, Michael D Kinney
>                 <michael.d.kinney@intel.com> wrote:
>
>                 Andrew,
>
>                 This might work for XIP.  Set a non-const global to the
>                 initial value that keeps execution in the dead loop.
>
>                 UINTN  mDeadLoopCount = 0;
>
>                 VOID
>
>                 CpuDeadLoop(
>
>                 VOID
>
>                 )
>
>                 {
>
>                 while (mDeadLoopCount == 0) {
>
>                     CpuPause();
>
>                 }
>
>                 }
>
>                 When deadloop is entered, developer can not change
>                 value of mDeadLoopCount, but they can use debugger to
>                 force exit loop and return from function.
>
>                 Mike
>
>                 *From:*Andrew (EFI) Fish <afish@apple.com>
>                 *Sent:*Thursday, May 18, 2023 10:09 AM
>                 *To:*Kinney, Michael D <michael.d.kinney@intel.com>
>                 *Cc:*edk2-devel-groups-io <devel@edk2.groups.io>; Ni,
>                 Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>                 *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized
>                 by compiler
>
>                 Mike,
>
>                 Good point, that is why we are using the stack ….
>
>                 The only other thing I can think of is to pass the
>                 address of Index to some inline assembler, or an asm
>                 no op function, to give it a side effect the compiler
>                 can’t resolve.
>
>                 Thanks,
>
>                 Andrew Fish
>
>
>
>
>
>                     On May 18, 2023, at 10:05 AM, Kinney, Michael D
>                     <michael.d.kinney@intel.com> wrote:
>
>                     Static global will not work for XIP
>
>                     Mike
>
>                     *From:*Andrew (EFI) Fish <afish@apple.com>
>                     *Sent:*Thursday, May 18, 2023 9:49 AM
>                     *To:*edk2-devel-groups-io <devel@edk2.groups.io>;
>                     Kinney, Michael D <michael.d.kinney@intel.com>
>                     *Cc:*Ni, Ray <ray.ni@intel.com>; Rebecca Cran
>                     <rebecca@bsdio.com>
>                     *Subject:*Re: [edk2-devel] CpuDeadLoop() is
>                     optimized by compiler
>
>                     Mike,
>
>                     I pinged some compiler experts to see if our code
>                     is correct, or if the compiler has an issue. Seems
>                     to be trending compiler issue right now, but I’ve
>                     NOT gotten feedback from anyone on the spec
>                     committee yet.
>
>                     If we move Index to a static global that would
>                     likely work around the compiler issue.
>
>                     Thanks,
>
>                     Andrew Fish
>
>
>
>
>
>
>                         On May 18, 2023, at 8:36 AM, Michael D Kinney
>                         <michael.d.kinney@intel.com> wrote:
>
>                         Hi Ray,
>
>                         So the code generated does deadloop, but is
>                         just not easy to resume from as we have been
>                         able to do in the past.
>
>                         We use CpuDeadloop() for 2 purposes.  One is a
>                         terminal condition with no reason to ever
>                         continue.
>
>                         The 2nd is a debug aid for developers to
>                         halt the system at a specific location and
>                         then continue from that point, usually with a
>                         debugger, to step through code to an area to
>                         evaluate unexpected behavior.
>
>                         We may have to do a NASM implementation of
>                         CpuDeadloop() to make sure it meets both use
>                         cases.
>
>                         Mike
>
>                         *From:*Ni, Ray <ray.ni@intel.com>
>                         *Sent:*Thursday, May 18, 2023 3:00 AM
>                         *To:*devel@edk2.groups.io
>                         *Cc:*Kinney, Michael D
>                         <michael.d.kinney@intel.com>; Rebecca Cran
>                         <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
>                         *Subject:*CpuDeadLoop() is optimized by compiler
>
>                         Hi,
>
>                         Starting from certain version of Visual Studio
>                         C compiler (I don’t have the exact version. I
>                         am using VS2019), CpuDeadLoop is now optimized
>                         quite well by compiler.
>
>                         The optimization is so “good” that it becomes
>                         harder for developers to break out of the
>                         deadloop.
>
>                         I copied the assembly instructions as below
>                         for your reference.
>
>                         The compiler does not generate instructions
>                         that jump out of the loop when the Index is
>                         not zero.
>
>                         So in order to break out of the loop,
>                         developers need to:
>
>                          1. Manually adjust rsp by increasing 40
>                          2. Manually “ret”
>
>                         I am not sure if anyone has interest to
>                         re-write this function so that compiler can be
>                         “fooled” again.
>
>                         Thanks,
>                         Ray
>
>                         =======================
>
>                         ; Function compile flags: /Ogspy
>
>                         ; File
>                         e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>
>                         ; COMDAT CpuDeadLoop
>
>                         _TEXT SEGMENT
>
>                         Index$ = 48
>
>                         CpuDeadLoop PROC ; COMDAT
>
>                         ; 26   : {
>
>                         $LN12:
>
>                         00000  48 83 ec 28 sub rsp, 40 ; 00000028H
>
>                         ; 27   : volatile UINTN  Index;
>
>                         ; 28   :
>
>                         ; 29   :   for (Index = 0; Index == 0;) {
>
>                         00004  48 c7 44 24 30
>
>                         00 00 00 00 mov      QWORD PTR Index$[rsp], 0
>
>                         $LN10@CpuDeadLoo:
>
>                         ; 30   : CpuPause ();
>
>                         0000d  48 8b 44 24 30 mov      rax, QWORD PTR
>                         Index$[rsp]
>
>                         00012  e8 00 00 00 00 call CpuPause
>
>                         00017  eb f4 jmp SHORT $LN10@CpuDeadLoo
>
>                         CpuDeadLoop ENDP
>
>                         _TEXT ENDS
>
>                         END
>
>     
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
       [not found]                       ` <1760952DCE55DF8D.29365@groups.io>
@ 2023-05-19 16:09                         ` Rebecca Cran
  0 siblings, 0 replies; 27+ messages in thread
From: Rebecca Cran @ 2023-05-19 16:09 UTC (permalink / raw)
  To: Ni, Ray, Andrew (EFI) Fish, devel@edk2.groups.io,
	Kinney, Michael D

I've submitted a bug report at 
https://developercommunity.visualstudio.com/t/x64-codegen-with-optimizations-wrong-for/10369557 
.


-- 

Rebecca Cran


On 5/19/23 09:31, Rebecca Cran wrote:
> Just to add more data, I also tried with "volatile sig_atomic_t" as 
> someone suggested and both "/volatile:iso" and "/volatile:ms" with no 
> change in results.
>
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-19 15:31                       ` Rebecca Cran
@ 2023-05-19 16:31                         ` Andrew Fish
  2023-10-31  2:51                           ` Ni, Ray
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Fish @ 2023-05-19 16:31 UTC (permalink / raw)
  To: edk2-devel-groups-io, Rebecca Cran; +Cc: Ni, Ray, Mike Kinney

[-- Attachment #1: Type: text/plain, Size: 23402 bytes --]

I don’t think the atomic is going to help. The compiler honored the volatile by doing a read, but assumed the value would never change because of its scoping (the address of the local never escapes the function). As you can see in my example, if the compiler thinks DeadLoopCount can be changed it will put the check back in and assume the function can return. So a do-nothing assembly function called IncreaseScope() would fix this issue too.

void IncreaseScope(volatile int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;
  while(DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;
  while(DeadLoopCount == 0);
}

Gives us:

voltbl  SEGMENT
voltbl  ENDS
voltbl  SEGMENT
voltbl  ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC                           ; COMDAT
$LN12:
        sub     rsp, 40                             ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40                             ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC                                        ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP
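
A minimal sketch of why the escape hatch works, written in plain C so it can run on a host. The counter g_polls_before_release and the body of IncreaseScope() are assumptions made up for illustration: in firmware the routine would be an empty assembly function, and a debugger (not the routine) would write DeadLoopCount. The parameter is declared volatile int * here so that the qualifier is not discarded.

```c
/* Demo-only knob (hypothetical): how many polls happen before the wait is released. */
static int g_polls_before_release = 3;

/* Stand-in for the do-nothing assembly routine. Because the address of the
   loop variable is handed to it, the loop variable may legitimately change,
   so the exit check has to stay. Here it also plays the role of the debugger
   and releases the loop once the counter reaches zero. */
void IncreaseScope(volatile int *ptr) {
  if (--g_polls_before_release == 0) {
    *ptr = 1;
  }
}

/* Same shape as CpuDeadLoopFix above, but returns the poll count so the
   behavior can be observed. */
int CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;
  int polls = 0;

  while (DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
    polls++;
  }
  return polls;
}
```

With the knob at 3, CpuDeadLoopFix() polls three times and then returns, which is the behavior a developer gets by patching DeadLoopCount from a debugger.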


https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA

Thanks,

Andrew Fish

PS I’m still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies its optimizations, and changing the order can change the behavior.


> On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com> wrote:
> 
> Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.
> 
> 
> -- 
> 
> Rebecca Cran
> 
> 
> On 5/18/23 20:53, Ni, Ray wrote:
>> 
>> I think all the options we considered are workarounds. These might break again if compiler is “cleverer” in future. Unless some Cxx spec clearly guarantees that.
>> 
>> I like Mike’s idea to use assembly implementation for CpuDeadLoop. The assembly can simply “jmp $” then “ret”.
>> 
>> I didn’t find a dead-loop intrinsic function in MSVC.
>> 
>> Any better idea?
>> 
>> Thanks,
>> 
>> Ray
>> 
>> *From:* Andrew (EFI) Fish <afish@apple.com>
>> *Sent:* Friday, May 19, 2023 8:42 AM
>> *To:* devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
>> *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>> 
>> Mike,
>> 
>> Sorry static was just to scope the name to the file since it is a lib, not to make it work.
>> 
>> That is a cool site. I learned about it complaining about stuff to the compiler team on our internal clang Slack channel as they use it to answer my questions.
>> 
>> Thanks,
>> 
>> Andrew Fish
>> 
>> 
>> 
>>    On May 18, 2023, at 2:42 PM, Michael D Kinney
>>    <michael.d.kinney@intel.com> wrote:
>> 
>>    Using that tool, the following fragment seems to generate the
>>    right code. Volatile is required.  Static is optional.
>> 
>>    static volatile int mDeadLoopCount = 0;
>> 
>>    void
>> 
>>    CpuDeadLoop(
>> 
>>    void
>> 
>>      )
>> 
>>    {
>> 
>>    while (mDeadLoopCount == 0);
>> 
>>    }
>> 
>>    GCC
>> 
>>    ===
>> 
>>    CpuDeadLoop():
>> 
>>    .L2:
>> 
>>    mov eax, DWORD PTR mDeadLoopCount[rip]
>> 
>>    test eax, eax
>> 
>>    je .L2
>> 
>>    ret
>> 
>>    CLANG
>> 
>>    =====
>> 
>>    CpuDeadLoop():                    # @CpuDeadLoop()
>> 
>>    .LBB0_1:                          # =>This Inner Loop Header: Depth=1
>> 
>>    cmp dword ptr [rip + _ZL14mDeadLoopCount], 0
>> 
>>    je .LBB0_1
>> 
>>    ret
>> 
>>    Mike
>> 
>>    *From:*Andrew (EFI) Fish <afish@apple.com>
>>    *Sent:*Thursday, May 18, 2023 1:45 PM
>>    *To:*edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish
>>    <afish@apple.com>
>>    *Cc:*Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray
>>    <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>>    *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>> 
>>    Whoops wrong compiler. Here is an update. I added the flags so
>>    this one reproduces the issue.
>> 
>>    Compiler Explorer
>>    <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>> 
>>    Thanks,
>> 
>>    Andrew Fish
>> 
>> 
>> 
>> 
>>        On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
>>        <afish=apple.com@groups.io> wrote:
>> 
>>        Mike,
>> 
>>        This is a good way to play around with fixes, and to report
>>        bugs. You can see the assembler for different compilers with
>>        different flags.
>> 
>>        Compiler Explorer
>>        <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>> 
>>        Sorry I’m traveling and in Cupertino with lots of meetings so
>>        I did not have time to adjust the compiler flags….
>> 
>>        Thanks,
>> 
>>        Andrew Fish
>> 
>> 
>> 
>> 
>>            On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
>>            <afish@apple.com> wrote:
>> 
>>            Mike,
>> 
>>            I guess my other question… If this turns out to be a
>>            compiler bug should we scope the change to the broken
>>            toolchain. I’m not sure what the right answer is for that,
>>            but I want to ask the question?
>> 
>>            Thanks,
>> 
>>            Andrew Fish
>> 
>> 
>> 
>> 
>>                On May 18, 2023, at 10:19 AM, Michael D Kinney
>>                <michael.d.kinney@intel.com> wrote:
>> 
>>                Andrew,
>> 
>>                This might work for XIP.  Set non const global to
>>                initial value that is expected value to stay in dead loop.
>> 
>>                UINTN  mDeadLoopCount = 0;
>> 
>>                VOID
>> 
>>                CpuDeadLoop(
>> 
>>                VOID
>> 
>>                )
>> 
>>                {
>> 
>>                while (mDeadLoopCount == 0) {
>> 
>>                    CpuPause();
>> 
>>                }
>> 
>>                }
>> 
>>                When deadloop is entered, developer can not change
>>                value of mDeadLoopCount, but they can use debugger to
>>                force exit loop and return from function.
>> 
>>                Mike
>> 
>>                *From:*Andrew (EFI) Fish <afish@apple.com>
>>                *Sent:*Thursday, May 18, 2023 10:09 AM
>>                *To:*Kinney, Michael D <michael.d.kinney@intel.com>
>>                *Cc:*edk2-devel-groups-io <devel@edk2.groups.io>; Ni,
>>                Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>>                *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized
>>                by compiler
>> 
>>                Mike,
>> 
>>                Good point, that is why we are using the stack ….
>> 
>>                The only other thing I can think of is to pass the
>>                address of Index to some inline assembler, or an asm
>>                no op function, to give it a side effect the compiler
>>                can’t resolve.
>> 
>>                Thanks,
>> 
>>                Andrew Fish
>> 
>> 
>> 
>> 
>> 
>>                    On May 18, 2023, at 10:05 AM, Kinney, Michael D
>>                    <michael.d.kinney@intel.com> wrote:
>> 
>>                    Static global will not work for XIP
>> 
>>                    Mike
>> 
>>                    *From:*Andrew (EFI) Fish <afish@apple.com>
>>                    *Sent:*Thursday, May 18, 2023 9:49 AM
>>                    *To:*edk2-devel-groups-io <devel@edk2.groups.io>;
>>                    Kinney, Michael D <michael.d.kinney@intel.com>
>>                    *Cc:*Ni, Ray <ray.ni@intel.com>; Rebecca Cran
>>                    <rebecca@bsdio.com>
>>                    *Subject:*Re: [edk2-devel] CpuDeadLoop() is
>>                    optimized by compiler
>> 
>>                    Mike,
>> 
>>                    I pinged some compiler experts to see if our code
>>                    is correct, or if the compiler has an issue. Seems
>>                    to be trending compiler issue right now, but I’ve
>>                    NOT gotten feedback from anyone on the spec
>>                    committee yet.
>> 
>>                    If we move Index to a static global that would
>>                    likely work around the compiler issue.
>> 
>>                    Thanks,
>> 
>>                    Andrew Fish
>> 
>> 
>> 
>> 
>> 
>> 
>>                        On May 18, 2023, at 8:36 AM, Michael D Kinney
>>                        <michael.d.kinney@intel.com> wrote:
>> 
>>                        Hi Ray,
>> 
>>                        So the code generated does deadloop, but is
>>                        just not easy to resume from as we have been
>>                        able to do in the past.
>> 
>>                        We use CpuDeadloop() for 2 purposes.  One is a
>>                        terminal condition with no reason to ever
>>                        continue.
>> 
>>                        The 2nd is a debug aid for developers to
>>                        halt the system at a specific location and
>>                        then continue from that point, usually with a
>>                        debugger, to step through code to an area to
>>                        evaluate unexpected behavior.
>> 
>>                        We may have to do a NASM implementation of
>>                        CpuDeadloop() to make sure it meets both use
>>                        cases.
>> 
>>                        Mike
>> 
>>                        *From:*Ni, Ray <ray.ni@intel.com>
>>                        *Sent:*Thursday, May 18, 2023 3:00 AM
>>                        *To:*devel@edk2.groups.io
>>                        *Cc:*Kinney, Michael D
>>                        <michael.d.kinney@intel.com>; Rebecca Cran
>>                        <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
>>                        *Subject:*CpuDeadLoop() is optimized by compiler
>> 
>>                        Hi,
>> 
>>                        Starting from certain version of Visual Studio
>>                        C compiler (I don’t have the exact version. I
>>                        am using VS2019), CpuDeadLoop is now optimized
>>                        quite well by compiler.
>> 
>>                        The optimization is so “good” that it becomes
>>                        harder for developers to break out of the
>>                        deadloop.
>> 
>>                        I copied the assembly instructions as below
>>                        for your reference.
>> 
>>                        The compiler does not generate instructions
>>                        that jump out of the loop when the Index is
>>                        not zero.
>> 
>>                        So in order to break out of the loop,
>>                        developers need to:
>> 
>>                         1. Manually adjust rsp by increasing 40
>>                         2. Manually “ret”
>> 
>>                        I am not sure if anyone has interest to
>>                        re-write this function so that compiler can be
>>                        “fooled” again.
>> 
>>                        Thanks,
>>                        Ray
>> 
>>                        =======================
>> 
>>                        ; Function compile flags: /Ogspy
>> 
>>                        ; File
>>                        e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>> 
>>                        ; COMDAT CpuDeadLoop
>> 
>>                        _TEXT SEGMENT
>> 
>>                        Index$ = 48
>> 
>>                        CpuDeadLoop PROC ; COMDAT
>> 
>>                        ; 26   : {
>> 
>>                        $LN12:
>> 
>>                        00000  48 83 ec 28 sub rsp, 40 ; 00000028H
>> 
>>                        ; 27   : volatile UINTN  Index;
>> 
>>                        ; 28   :
>> 
>>                        ; 29   :   for (Index = 0; Index == 0;) {
>> 
>>                        00004  48 c7 44 24 30
>> 
>>                        00 00 00 00 mov      QWORD PTR Index$[rsp], 0
>> 
>>                        $LN10@CpuDeadLoo:
>> 
>>                        ; 30   : CpuPause ();
>> 
>>                        0000d  48 8b 44 24 30 mov      rax, QWORD PTR
>>                        Index$[rsp]
>> 
>>                        00012  e8 00 00 00 00 call CpuPause
>> 
>>                        00017  eb f4 jmp SHORT $LN10@CpuDeadLoo
>> 
>>                        CpuDeadLoop ENDP
>> 
>>                        _TEXT ENDS
>> 
>>                        END
>> 
>>    
> 
> 
> 


[-- Attachment #2.1: Type: text/html, Size: 66309 bytes --]

[-- Attachment #2.2: favicon.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-05-19 16:31                         ` Andrew Fish
@ 2023-10-31  2:51                           ` Ni, Ray
  2023-10-31  3:37                             ` Michael D Kinney
  0 siblings, 1 reply; 27+ messages in thread
From: Ni, Ray @ 2023-10-31  2:51 UTC (permalink / raw)
  To: Andrew (EFI) Fish, edk2-devel-groups-io, Rebecca Cran,
	Hernandez Miramontes, Jose Miguel
  Cc: Kinney, Michael D


[-- Attachment #1.1: Type: text/plain, Size: 26790 bytes --]

It's been a while.

Is there any better solution? If not, can we go with the assembly solution?

Thanks,
Ray
________________________________
From: Andrew (EFI) Fish <afish@apple.com>
Sent: Saturday, May 20, 2023 12:31 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>
Cc: Ni, Ray <ray.ni@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

I don’t think the atomic is going to help. The compiler honored the volatile by doing a read, but assumed it would never change due to scoping. As you can see in my example if the compiler thinks DeadLoopCount can be changed it will put the check back in and assume the function can return. So an assembly function that does nothing called IncreaseScope()  would fix this issue too.

void IncreaseScope(int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;
  while (DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;
  while (DeadLoopCount == 0);
}

Gives us:

voltbl SEGMENT
voltbl ENDS
voltbl SEGMENT
voltbl ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC ; COMDAT
$LN12:
        sub     rsp, 40 ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40 ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP


Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>

Thanks,

Andrew Fish

PS I’m still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies its optimizations, and changing the order can change the behavior.
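
To make the IncreaseScope() idea concrete, here is a minimal C sketch. It is an illustration under stated assumptions, not the BaseLib implementation: IncreaseScopeDemo(), CpuDeadLoopDemo() and gCalls are hypothetical names, and the stub is written in C and clears the loop condition after three calls, standing in for a developer changing DeadLoopCount from a debugger. In the proposal above the stub would instead be an assembly no-op that the compiler cannot see into.

```c
/* Number of times the stub has been called (hypothetical bookkeeping). */
static unsigned gCalls = 0;

/*
 * Hypothetical stand-in for the assembly no-op IncreaseScope().
 * Because the address of the loop variable is passed out of the
 * function, the compiler has to assume the value may change.
 * For demonstration only, the third call releases the loop, which
 * emulates a debugger writing a nonzero value.
 */
void IncreaseScopeDemo(volatile int *Ptr)
{
  if (++gCalls == 3) {
    *Ptr = 1;
  }
}

/* Same shape as CpuDeadLoopFix(), but returns how often the stub ran. */
unsigned CpuDeadLoopDemo(void)
{
  volatile int DeadLoopCount = 0;

  gCalls = 0;
  while (DeadLoopCount == 0) {
    IncreaseScopeDemo(&DeadLoopCount);
  }
  return gCalls;
}
```

Calling CpuDeadLoopDemo() returns once the stub has run three times. With a real do-nothing stub the loop would spin until DeadLoopCount is changed externally.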


On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com> wrote:

Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.


--

Rebecca Cran


On 5/18/23 20:53, Ni, Ray wrote:

I think all the options we considered are workarounds. These might break again if the compiler gets “cleverer” in the future, unless some Cxx spec clearly guarantees the behavior.

I like Mike’s idea of using an assembly implementation for CpuDeadLoop. The assembly can simply be “jmp $” followed by “ret”.

I didn’t find a dead-loop intrinsic function in MSVC.

Any better idea?

Thanks,

Ray

*From:* Andrew (EFI) Fish <afish@apple.com>
*Sent:* Friday, May 19, 2023 8:42 AM
*To:* devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
*Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
*Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Sorry, static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it while complaining about stuff to the compiler team on our internal clang Slack channel, as they use it to answer my questions.

Thanks,

Andrew Fish



   On May 18, 2023, at 2:42 PM, Michael D Kinney
   <michael.d.kinney@intel.com> wrote:

   Using that tool, the following fragment seems to generate the
   right code. Volatile is required.  Static is optional.

   static volatile int mDeadLoopCount = 0;

   void
   CpuDeadLoop (
     void
     )
   {
     while (mDeadLoopCount == 0);
   }

   GCC
   ===

   CpuDeadLoop():
   .L2:
           mov     eax, DWORD PTR mDeadLoopCount[rip]
           test    eax, eax
           je      .L2
           ret

   CLANG
   =====

   CpuDeadLoop():                    # @CpuDeadLoop()
   .LBB0_1:                          # =>This Inner Loop Header: Depth=1
           cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
           je      .LBB0_1
           ret

   Mike

   *From:* Andrew (EFI) Fish <afish@apple.com>
   *Sent:* Thursday, May 18, 2023 1:45 PM
   *To:* edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish
   <afish@apple.com>
   *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray
   <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
   *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

   Whoops wrong compiler. Here is an update. I added the flags so
   this one reproduces the issue.

   Compiler Explorer
   <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


   Thanks,

   Andrew Fish




       On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
       <afish=apple.com@groups.io> wrote:

       Mike,

       This is a good way to play around with fixes, and to report
       bugs. You can see the assembler for different compilers with
       different flags.

       Compiler Explorer
       <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


       Sorry I’m traveling and in Cupertino with lots of meetings so
       I did not have time to adjust the compiler flags….

       Thanks,

       Andrew Fish




           On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
           <afish@apple.com> wrote:

           Mike,

           I guess my other question is: if this turns out to be a
           compiler bug, should we scope the change to the broken
           toolchain? I’m not sure what the right answer is, but I
           want to ask the question.

           Thanks,

           Andrew Fish




               On May 18, 2023, at 10:19 AM, Michael D Kinney
               <michael.d.kinney@intel.com> wrote:

               Andrew,

               This might work for XIP.  Set a non-const global to the
               initial value that keeps execution in the dead loop.

               UINTN  mDeadLoopCount = 0;

               VOID
               CpuDeadLoop (
                 VOID
                 )
               {
                 while (mDeadLoopCount == 0) {
                   CpuPause ();
                 }
               }

               When the dead loop is entered, the developer cannot change
               the value of mDeadLoopCount, but they can use a debugger to
               force an exit from the loop and return from the function.

               Mike

               *From:* Andrew (EFI) Fish <afish@apple.com>
               *Sent:* Thursday, May 18, 2023 10:09 AM
               *To:* Kinney, Michael D <michael.d.kinney@intel.com>
               *Cc:* edk2-devel-groups-io <devel@edk2.groups.io>; Ni, Ray
               <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
               *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized
               by compiler

               Mike,

               Good point, that is why we are using the stack ….

               The only other thing I can think of is to pass the
               address of Index to some inline assembler, or an asm
               no op function, to give it a side effect the compiler
               can’t resolve.

               Thanks,

               Andrew Fish





                   On May 18, 2023, at 10:05 AM, Kinney, Michael D
                   <michael.d.kinney@intel.com> wrote:

                   Static global will not work for XIP

                   Mike

                   *From:* Andrew (EFI) Fish <afish@apple.com>
                   *Sent:* Thursday, May 18, 2023 9:49 AM
                   *To:* edk2-devel-groups-io <devel@edk2.groups.io>;
                   Kinney, Michael D <michael.d.kinney@intel.com>
                   *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran
                   <rebecca@bsdio.com>
                   *Subject:* Re: [edk2-devel] CpuDeadLoop() is
                   optimized by compiler

                   Mike,

                   I pinged some compiler experts to see if our code
                   is correct, or if the compiler has an issue. It seems
                   to be trending toward a compiler issue right now, but
                   I’ve NOT gotten feedback from anyone on the spec
                   committee yet.

                   If we move Index to a static global that would
                   likely work around the compiler issue.

                   Thanks,

                   Andrew Fish






                       On May 18, 2023, at 8:36 AM, Michael D Kinney
                       <michael.d.kinney@intel.com> wrote:

                       Hi Ray,

                       So the code generated does deadloop, but is
                       just not easy to resume from as we have been
                       able to do in the past.

                       We use CpuDeadLoop() for 2 purposes.  One is a
                       terminal condition with no reason to ever
                       continue.

                       The 2nd is a debug aid for developers to
                       halt the system at a specific location and
                       then continue from that point, usually with a
                       debugger, to step through code to an area to
                       evaluate unexpected behavior.

                       We may have to do a NASM implementation of
                       CpuDeadLoop() to make sure it meets both use
                       cases.

                       Mike

                       *From:* Ni, Ray <ray.ni@intel.com>
                       *Sent:* Thursday, May 18, 2023 3:00 AM
                       *To:* devel@edk2.groups.io
                       *Cc:* Kinney, Michael D
                       <michael.d.kinney@intel.com>; Rebecca Cran
                       <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
                       *Subject:* CpuDeadLoop() is optimized by compiler

                       Hi,

                       Starting from a certain version of the Visual
                       Studio C compiler (I don’t have the exact version;
                       I am using VS2019), CpuDeadLoop is now optimized
                       quite well by the compiler.

                       The optimization is so “good” that it becomes
                       harder for developers to break out of the
                       deadloop.

                       I copied the assembly instructions as below
                       for your reference.

                       The compiler does not generate instructions
                       that jump out of the loop when the Index is
                       not zero.

                       So in order to break out of the loop,
                       developers need to:

                        1. Manually adjust rsp by increasing 40
                        2. Manually “ret”

                       I am not sure if anyone has interest to
                       re-write this function so that compiler can be
                       “fooled” again.

                       Thanks,
                       Ray

                       =======================

                       ; Function compile flags: /Ogspy

                       ; File
                       e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c

                       ; COMDAT CpuDeadLoop

                       _TEXT SEGMENT

                       Index$ = 48

                       CpuDeadLoop PROC ; COMDAT

                       ; 26   : {

                       $LN12:

                       00000  48 83 ec 28 sub rsp, 40 ; 00000028H

                       ; 27   : volatile UINTN  Index;

                       ; 28   :

                       ; 29   :   for (Index = 0; Index == 0;) {

                       00004  48 c7 44 24 30

                       00 00 00 00 mov      QWORD PTR Index$[rsp], 0

                       $LN10@CpuDeadLoo:

                       ; 30   : CpuPause ();

                       0000d  48 8b 44 24 30 mov      rax, QWORD PTR Index$[rsp]

                       00012  e8 00 00 00 00 call CpuPause

                       00017  eb f4 jmp SHORT $LN10@CpuDeadLoo

                       CpuDeadLoop ENDP

                       _TEXT ENDS

                       END








-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110356): https://edk2.groups.io/g/devel/message/110356
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #1.2: Type: text/html, Size: 67633 bytes --]

[-- Attachment #2: favicon.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-10-31  2:51                           ` Ni, Ray
@ 2023-10-31  3:37                             ` Michael D Kinney
  2023-10-31  8:30                               ` Ni, Ray
  0 siblings, 1 reply; 27+ messages in thread
From: Michael D Kinney @ 2023-10-31  3:37 UTC (permalink / raw)
  To: Ni, Ray, Andrew (EFI) Fish, edk2-devel-groups-io, Rebecca Cran,
	Hernandez Miramontes, Jose Miguel
  Cc: Kinney, Michael D


[-- Attachment #1.1: Type: text/plain, Size: 26586 bytes --]

Does using a static volatile global instead of a volatile local work?
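
For reference, a minimal sketch of what the static volatile global variant might look like. This is an illustration only: DemoPause(), DemoDeadLoop(), RunDemo() and mPauseCalls are hypothetical names, DemoPause() stands in for CpuPause(), and it sets the flag after five calls to emulate a developer writing the variable from a debugger. Whether a given MSVC version keeps the exit check for this form is not verified here. As noted earlier in the thread, a static global will not work for XIP, so this only applies where globals are writable.

```c
/* File-scope flag: the loop spins while this stays 0. */
static volatile int mDeadLoopCount = 0;

/* Hypothetical bookkeeping for the demonstration. */
static unsigned mPauseCalls = 0;

/*
 * Hypothetical stand-in for CpuPause(). The fifth call sets the flag,
 * which emulates a debugger write; a real pause would not touch it.
 */
static void DemoPause(void)
{
  if (++mPauseCalls == 5) {
    mDeadLoopCount = 1;
  }
}

/* Dead loop whose exit condition reads the volatile global each pass. */
void DemoDeadLoop(void)
{
  while (mDeadLoopCount == 0) {
    DemoPause();
  }
}

/* Resets the state, runs the loop, and reports how many pauses ran. */
unsigned RunDemo(void)
{
  mDeadLoopCount = 0;
  mPauseCalls    = 0;
  DemoDeadLoop();
  return mPauseCalls;
}
```

RunDemo() returns after the emulated write on the fifth pause.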

Mike

From: Ni, Ray <ray.ni@intel.com>
Sent: Monday, October 30, 2023 7:52 PM
To: Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

It's been a while.

Is there a better solution? Can we go with the assembly solution?

Thanks,
Ray
________________________________
From: Andrew (EFI) Fish <afish@apple.com>
Sent: Saturday, May 20, 2023 12:31 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>
Cc: Ni, Ray <ray.ni@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

I don't think the atomic is going to help. The compiler honored the volatile by doing a read, but assumed the value would never change due to scoping. As you can see in my example, if the compiler thinks DeadLoopCount can be changed, it puts the check back in and assumes the function can return. So an assembly function called IncreaseScope() that does nothing would fix this issue too.


void IncreaseScope(int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;
  while (DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;
  while (DeadLoopCount == 0);
}



Gives us:


voltbl SEGMENT
voltbl ENDS
voltbl SEGMENT
voltbl ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC ; COMDAT
$LN12:
        sub     rsp, 40 ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40 ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP



Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>

Thanks,

Andrew Fish

PS I'm still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies its optimizations, and changing the order can change the behavior.



On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>> wrote:

Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.


--

Rebecca Cran


On 5/18/23 20:53, Ni, Ray wrote:


I think all the options we considered are workarounds. These might break again if the compiler gets "cleverer" in the future, unless some Cxx spec clearly guarantees the behavior.

I like Mike's idea of using an assembly implementation for CpuDeadLoop. The assembly can simply be "jmp $" followed by "ret".

I didn't find a dead-loop intrinsic function in MSVC.

Any better idea?

Thanks,

Ray

*From:* Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
*Sent:* Friday, May 19, 2023 8:42 AM
*To:* devel@edk2.groups.io<mailto:devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
*Cc:* Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
*Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Sorry static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it complaining about stuff to the compiler team on our internal clang Slack channel as they use it to answer my questions.

Thanks,

Andrew Fish



   On May 18, 2023, at 2:42 PM, Michael D Kinney
   <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

   Using that tool, the following fragment seems to generate the
   right code. Volatile is required.  Static is optional.

   static volatile int mDeadLoopCount = 0;

   void
   CpuDeadLoop (
     void
     )
   {
     while (mDeadLoopCount == 0);
   }

   GCC
   ===

   CpuDeadLoop():
   .L2:
           mov     eax, DWORD PTR mDeadLoopCount[rip]
           test    eax, eax
           je      .L2
           ret

   CLANG
   =====

   CpuDeadLoop():                    # @CpuDeadLoop()
   .LBB0_1:                          # =>This Inner Loop Header: Depth=1
           cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
           je      .LBB0_1
           ret
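
   The exit check in the listings above is what lets something outside
   the loop release it by writing the variable. The sketch below is a
   hosted, POSIX-only illustration of that idea and is not edk2 code:
   ArmRelease, OnTimer and SpinUntilReleased are names made up for the
   example, and a timer signal merely stands in for a developer changing
   the variable from a debugger.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/time.h>

/* File-scope volatile flag, playing the role of mDeadLoopCount. */
static volatile sig_atomic_t mDeadLoopCount = 0;

/* Hypothetical stand-in for a debugger write to the flag. */
static void OnTimer(int Signal)
{
  (void)Signal;
  mDeadLoopCount = 1;
}

/* Arms a one-shot 10 ms timer whose handler releases the loop.
   Returns 0 on success, -1 on failure. */
int ArmRelease(void)
{
  struct itimerval Timer;

  Timer.it_interval.tv_sec  = 0;
  Timer.it_interval.tv_usec = 0;
  Timer.it_value.tv_sec     = 0;
  Timer.it_value.tv_usec    = 10000;

  if (signal(SIGALRM, OnTimer) == SIG_ERR) {
    return -1;
  }
  return setitimer(ITIMER_REAL, &Timer, NULL);
}

/* Same shape as the fragment above: spin until the flag is nonzero. */
void SpinUntilReleased(void)
{
  while (mDeadLoopCount == 0) {
  }
}
```

   Calling ArmRelease() and then SpinUntilReleased() should return after
   roughly 10 ms, because the compiled loop re-reads the flag and tests
   it on every iteration.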

   Mike

   *From:*Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
   *Sent:*Thursday, May 18, 2023 1:45 PM
   *To:*edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Andrew Fish
   <afish@apple.com<mailto:afish@apple.com>>
   *Cc:*Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Ni, Ray
   <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
   *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

   Whoops wrong compiler. Here is an update. I added the flags so
   this one reproduces the issue.

   Compiler Explorer
   <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


   Thanks,

   Andrew Fish




       On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
       <afish=apple.com@groups.io<mailto:afish=apple.com@groups.io>> wrote:

       Mike,

       This is a good way to play around with fixes, and to report
       bugs. You can see the assembler for different compilers with
       different flags.

       Compiler Explorer
       <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


       Sorry I'm traveling and in Cupertino with lots of meetings so
       I did not have time to adjust the compiler flags....

       Thanks,

       Andrew Fish




           On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
           <afish@apple.com<mailto:afish@apple.com>> wrote:

           Mike,

           I guess my other question is: if this turns out to be a
           compiler bug, should we scope the change to the broken
           toolchain? I'm not sure what the right answer is for that,
           but I want to ask the question.

           Thanks,

           Andrew Fish




               On May 18, 2023, at 10:19 AM, Michael D Kinney
               <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

               Andrew,

               This might work for XIP.  Set a non-const global to the
               initial value that keeps execution in the dead loop.

               UINTN  mDeadLoopCount = 0;

               VOID
               CpuDeadLoop (
                 VOID
                 )
               {
                 while (mDeadLoopCount == 0) {
                   CpuPause ();
                 }
               }

               When the deadloop is entered, the developer cannot change
               the value of mDeadLoopCount, but can use the debugger to
               force an exit from the loop and return from the function.

               Mike

               *From:*Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
               *Sent:*Thursday, May 18, 2023 10:09 AM
               *To:*Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
               *Cc:*edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Ni,
               Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
               *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized
               by compiler

               Mike,

               Good point, that is why we are using the stack ....

               The only other thing I can think of is to pass the
               address of Index to some inline assembler, or an asm
               no op function, to give it a side effect the compiler
               can't resolve.

               Thanks,

               Andrew Fish





                   On May 18, 2023, at 10:05 AM, Kinney, Michael D
                   <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

                   Static global will not work for XIP

                   Mike

                   *From:*Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
                   *Sent:*Thursday, May 18, 2023 9:49 AM
                   *To:*edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>;
                   Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
                   *Cc:*Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran
                   <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
                   *Subject:*Re: [edk2-devel] CpuDeadLoop() is
                   optimized by compiler

                   Mike,

                   I pinged some compiler experts to see if our code
                   is correct, or if the compiler has an issue. It seems
                   to be trending toward a compiler issue right now, but
                   I've NOT gotten feedback from anyone on the spec
                   committee yet.

                   If we move Index to a static global that would
                   likely work around the compiler issue.

                   Thanks,

                   Andrew Fish






                       On May 18, 2023, at 8:36 AM, Michael D Kinney
                       <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

                       Hi Ray,

                       So the code generated does deadloop, but it is
                       just not as easy to resume from as it has been
                       in the past.

                       We use CpuDeadLoop() for two purposes.  One is a
                       terminal condition with no reason to ever
                       continue.

                       The second is a debug aid for developers to
                       halt the system at a specific location and
                       then continue from that point, usually with a
                       debugger, to step through code to an area to
                       evaluate unexpected behavior.

                       We may have to do a NASM implementation of
                       CpuDeadLoop() to make sure it meets both use
                       cases.

                       Mike

                       *From:*Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
                       *Sent:*Thursday, May 18, 2023 3:00 AM
                       *To:*devel@edk2.groups.io<mailto:devel@edk2.groups.io>
                       *Cc:*Kinney, Michael D
                       <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Rebecca Cran
                       <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
                       *Subject:*CpuDeadLoop() is optimized by compiler

                       Hi,

                       Starting from certain version of Visual Studio
                       C compiler (I don't have the exact version. I
                       am using VS2019), CpuDeadLoop is now optimized
                       quite well by compiler.

                       The optimization is so "good" that it becomes
                       harder for developers to break out of the
                       deadloop.

                       I copied the assembly instructions as below
                       for your reference.

                       The compiler does not generate instructions
                       that jump out of the loop when the Index is
                       not zero.

                       So in order to break out of the loop,
                       developers need to:

                        1. Manually adjust rsp by increasing 40
                        2. Manually "ret"

                       I am not sure if anyone has interest to
                       re-write this function so that compiler can be
                       "fooled" again.

                       Thanks,
                       Ray

                       =======================
                       ; Function compile flags: /Ogspy
                       ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
                       ; COMDAT CpuDeadLoop
                       _TEXT SEGMENT
                       Index$ = 48
                       CpuDeadLoop PROC ; COMDAT

                       ; 26   : {

                       $LN12:
                       00000  48 83 ec 28                 sub   rsp, 40 ; 00000028H

                       ; 27   :   volatile UINTN  Index;
                       ; 28   :
                       ; 29   :   for (Index = 0; Index == 0;) {

                       00004  48 c7 44 24 30 00 00 00 00  mov   QWORD PTR Index$[rsp], 0
                       $LN10@CpuDeadLoo:

                       ; 30   :     CpuPause ();

                       0000d  48 8b 44 24 30              mov   rax, QWORD PTR Index$[rsp]
                       00012  e8 00 00 00 00              call  CpuPause
                       00017  eb f4                       jmp   SHORT $LN10@CpuDeadLoo
                       CpuDeadLoop ENDP
                       _TEXT ENDS
                       END








-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110359): https://edk2.groups.io/g/devel/message/110359
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #1.2: Type: text/html, Size: 89527 bytes --]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-10-31  3:37                             ` Michael D Kinney
@ 2023-10-31  8:30                               ` Ni, Ray
  2023-10-31 14:19                                 ` Michael D Kinney
  0 siblings, 1 reply; 27+ messages in thread
From: Ni, Ray @ 2023-10-31  8:30 UTC (permalink / raw)
  To: Kinney, Michael D, Andrew (EFI) Fish, edk2-devel-groups-io,
	Rebecca Cran, Hernandez Miramontes, Jose Miguel


[-- Attachment #1.1: Type: text/plain, Size: 27509 bytes --]

Mike,
This is not friendly to XIP code. With XIP, the global variable cannot be updated because it sits in read-only SPI flash.
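
For XIP code the stack is still writable, which is why the thread keeps coming back to a stack-local loop variable whose address escapes to code the compiler cannot see. A minimal sketch of that shape follows, assuming a hosted C environment rather than edk2. DEAD_LOOP_HOOK, DeadLoopWithEscape and ReleaseAfterThree are hypothetical names; the hook stands in for the opaque assembly helper (or a debugger write) discussed earlier in the thread.

```c
#include <stdint.h>

/* Hypothetical hook type: receives the address of the loop variable,
   so the compiler has to assume the variable may be modified. */
typedef void (*DEAD_LOOP_HOOK)(volatile uintptr_t *Index);

/* Stack-local variant: no writable global is needed, so the loop
   variable does not end up in read-only flash when running XIP. */
void DeadLoopWithEscape(DEAD_LOOP_HOOK Hook)
{
  volatile uintptr_t Index = 0;

  while (Index == 0) {
    Hook(&Index);
  }
}

/* Test stand-in for a debugger: releases the loop on the third call. */
static int HookCalls = 0;

void ReleaseAfterThree(volatile uintptr_t *Index)
{
  HookCalls++;
  if (HookCalls == 3) {
    *Index = 1;
  }
}
```

In firmware the hook would more likely be a separate no-op assembly routine, so that the compiler cannot inline it; the function pointer is used here only to keep the example self-contained.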

Thanks,
Ray
________________________________
From: Kinney, Michael D <michael.d.kinney@intel.com>
Sent: Tuesday, October 31, 2023 11:37 AM
To: Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: RE: [edk2-devel] CpuDeadLoop() is optimized by compiler


Does using a static volatile global instead of a volatile local work?



Mike



From: Ni, Ray <ray.ni@intel.com>
Sent: Monday, October 30, 2023 7:52 PM
To: Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



It's been a while.



Is there any better solution? Can we go with the assembly solution?



Thanks,

Ray

________________________________

From: Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
Sent: Saturday, May 20, 2023 12:31 AM
To: edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
Cc: Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



I don’t think the atomic is going to help. The compiler honored the volatile by doing a read, but assumed it would never change due to scoping. As you can see in my example, if the compiler thinks DeadLoopCount can be changed, it will put the check back in and assume the function can return. So an assembly function that does nothing, called IncreaseScope(), would fix this issue too.





void IncreaseScope(int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;

  while (DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;

  while (DeadLoopCount == 0);
}





Gives us:

voltbl  SEGMENT
voltbl  ENDS
voltbl  SEGMENT
voltbl  ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC                     ; COMDAT
$LN12:
        sub     rsp, 40                 ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40                 ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC                        ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP







Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>




Thanks,



Andrew Fish



PS I’m still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies its optimizations, and changing the order can change the behavior.





On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>> wrote:



Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.


--

Rebecca Cran


On 5/18/23 20:53, Ni, Ray wrote:


I think all the options we considered are workarounds. These might break again if the compiler gets “cleverer” in the future, unless some Cxx spec clearly guarantees the behavior.

I like Mike’s idea of using an assembly implementation for CpuDeadLoop. The assembly can simply be “jmp $” followed by “ret”.

I didn’t find a dead-loop intrinsic function in MSVC.

Any better idea?

Thanks,

Ray

*From:* Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
*Sent:* Friday, May 19, 2023 8:42 AM
*To:* devel@edk2.groups.io<mailto:devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
*Cc:* Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
*Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Sorry static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it complaining about stuff to the compiler team on our internal clang Slack channel as they use it to answer my questions.

Thanks,

Andrew Fish



   On May 18, 2023, at 2:42 PM, Michael D Kinney
   <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

   Using that tool, the following fragment seems to generate the
   right code. Volatile is required.  Static is optional.

   static volatile int mDeadLoopCount = 0;

   void
   CpuDeadLoop (
     void
     )
   {
     while (mDeadLoopCount == 0);
   }

   GCC
   ===

   CpuDeadLoop():
   .L2:
           mov     eax, DWORD PTR mDeadLoopCount[rip]
           test    eax, eax
           je      .L2
           ret

   CLANG
   =====

   CpuDeadLoop():                    # @CpuDeadLoop()
   .LBB0_1:                          # =>This Inner Loop Header: Depth=1
           cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
           je      .LBB0_1
           ret

   Mike

   *From:*Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
   *Sent:*Thursday, May 18, 2023 1:45 PM
   *To:*edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Andrew Fish
   <afish@apple.com<mailto:afish@apple.com>>
   *Cc:*Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Ni, Ray
   <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
   *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

   Whoops wrong compiler. Here is an update. I added the flags so
   this one reproduces the issue.

   Compiler Explorer
   <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


   Thanks,

   Andrew Fish




       On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
       <afish=apple.com@groups.io> wrote:

       Mike,

       This is a good way to play around with fixes, and to report
       bugs. You can see the assembler for different compilers with
       different flags.

       Compiler Explorer
       <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


       Sorry I’m traveling and in Cupertino with lots of meetings so
       I did not have time to adjust the compiler flags….

       Thanks,

       Andrew Fish




           On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
           <afish@apple.com> wrote:

           Mike,

           I guess my other question… If this turns out to be a
           compiler bug, should we scope the change to the broken
           toolchain? I’m not sure what the right answer is for that,
           but I want to ask the question.

           Thanks,

           Andrew Fish




               On May 18, 2023, at 10:19 AM, Michael D Kinney
               <michael.d.kinney@intel.com> wrote:

               Andrew,

               This might work for XIP.  Set a non-const global to the
               initial value that keeps execution in the dead loop.

               UINTN  mDeadLoopCount = 0;

               VOID
               CpuDeadLoop (
                 VOID
                 )
               {
                 while (mDeadLoopCount == 0) {
                   CpuPause ();
                 }
               }

               When the dead loop is entered, the developer cannot change
               the value of mDeadLoopCount, but they can use a debugger to
               force an exit from the loop and return from the function.

               Mike

               *From:* Andrew (EFI) Fish <afish@apple.com>
               *Sent:* Thursday, May 18, 2023 10:09 AM
               *To:* Kinney, Michael D <michael.d.kinney@intel.com>
               *Cc:* edk2-devel-groups-io <devel@edk2.groups.io>; Ni,
               Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
               *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized
               by compiler

               Mike,

               Good point, that is why we are using the stack ….

               The only other thing I can think of is to pass the
               address of Index to some inline assembler, or an asm
               no op function, to give it a side effect the compiler
               can’t resolve.

               Thanks,

               Andrew Fish





                   On May 18, 2023, at 10:05 AM, Kinney, Michael D
                   <michael.d.kinney@intel.com> wrote:

                   Static global will not work for XIP

                   Mike

                   *From:* Andrew (EFI) Fish <afish@apple.com>
                   *Sent:* Thursday, May 18, 2023 9:49 AM
                   *To:* edk2-devel-groups-io <devel@edk2.groups.io>;
                   Kinney, Michael D <michael.d.kinney@intel.com>
                   *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran
                   <rebecca@bsdio.com>
                   *Subject:* Re: [edk2-devel] CpuDeadLoop() is
                   optimized by compiler

                   Mike,

                   I pinged some compiler experts to see if our code
                   is correct, or if the compiler has an issue. It seems
                   to be trending toward a compiler issue right now, but
                   I’ve NOT gotten feedback from anyone on the spec
                   committee yet.

                   If we move Index to a static global that would
                   likely work around the compiler issue.

                   Thanks,

                   Andrew Fish






                       On May 18, 2023, at 8:36 AM, Michael D Kinney
                       <michael.d.kinney@intel.com> wrote:

                       Hi Ray,

                       So the code generated does deadloop, but is
                       just not easy to resume from as we have been
                       able to do in the past.

                       We use CpuDeadloop() for 2 purposes.  One is a
                       terminal condition with no reason to ever
                       continue.

                       The second is a debug aid for developers to
                       halt the system at a specific location and
                       then continue from that point, usually with a
                       debugger, to step through code to an area to
                       evaluate unexpected behavior.

                       We may have to do a NASM implementation of
                       CpuDeadloop() to make sure it meets both use
                       cases.

                       Mike

                       *From:* Ni, Ray <ray.ni@intel.com>
                       *Sent:* Thursday, May 18, 2023 3:00 AM
                       *To:* devel@edk2.groups.io
                       *Cc:* Kinney, Michael D
                       <michael.d.kinney@intel.com>; Rebecca Cran
                       <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
                       *Subject:* CpuDeadLoop() is optimized by compiler

                       Hi,

                       Starting from certain version of Visual Studio
                       C compiler (I don’t have the exact version. I
                       am using VS2019), CpuDeadLoop is now optimized
                       quite well by compiler.

                       The optimization is so “good” that it becomes
                       harder for developers to break out of the
                       deadloop.

                       I copied the assembly instructions as below
                       for your reference.

                       The compiler does not generate instructions
                       that jump out of the loop when the Index is
                       not zero.

                       So in order to break out of the loop,
                       developers need to:

                        1. Manually adjust rsp by increasing 40
                        2. Manually “ret”

                       I am not sure if anyone has interest to
                       re-write this function so that compiler can be
                       “fooled” again.

                       Thanks,
                       Ray

                       =======================
                       ; Function compile flags: /Ogspy
                       ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
                       ; COMDAT CpuDeadLoop
                       _TEXT SEGMENT
                       Index$ = 48
                       CpuDeadLoop PROC ; COMDAT

                       ; 26   : {

                       $LN12:
                         00000  48 83 ec 28                 sub   rsp, 40 ; 00000028H

                       ; 27   :   volatile UINTN  Index;
                       ; 28   :
                       ; 29   :   for (Index = 0; Index == 0;) {

                         00004  48 c7 44 24 30 00 00 00 00  mov   QWORD PTR Index$[rsp], 0
                       $LN10@CpuDeadLoo:

                       ; 30   :     CpuPause ();

                         0000d  48 8b 44 24 30              mov   rax, QWORD PTR Index$[rsp]
                         00012  e8 00 00 00 00              call  CpuPause
                         00017  eb f4                       jmp   SHORT $LN10@CpuDeadLoo
                       CpuDeadLoop ENDP
                       _TEXT ENDS
                       END









-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110386): https://edk2.groups.io/g/devel/message/110386
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #1.2: Type: text/html, Size: 89483 bytes --]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-10-31  8:30                               ` Ni, Ray
@ 2023-10-31 14:19                                 ` Michael D Kinney
  2024-06-05  1:07                                   ` Michael D Kinney
  0 siblings, 1 reply; 27+ messages in thread
From: Michael D Kinney @ 2023-10-31 14:19 UTC (permalink / raw)
  To: Ni, Ray, Andrew (EFI) Fish, edk2-devel-groups-io, Rebecca Cran,
	Hernandez Miramontes, Jose Miguel
  Cc: Kinney, Michael D


[-- Attachment #1.1: Type: text/plain, Size: 28582 bytes --]

Right. But if you break in with a debugger, you can still skip over the jmp instruction and continue.

I agree that XIP does not allow the variable's value to be updated, but we would never want to do that anyway, because every later dead loop in non-XIP code would then fall straight through instead of looping.

Mike

From: Ni, Ray <ray.ni@intel.com>
Sent: Tuesday, October 31, 2023 1:31 AM
To: Kinney, Michael D <michael.d.kinney@intel.com>; Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,
This is not friendly to XIP code. With XIP code, the global variable cannot be updated because it sits in read-only SPI flash.

Thanks,
Ray
________________________________
From: Kinney, Michael D <michael.d.kinney@intel.com>
Sent: Tuesday, October 31, 2023 11:37 AM
To: Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: RE: [edk2-devel] CpuDeadLoop() is optimized by compiler


Does using a static volatile global instead of a volatile local work?
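
To make the question concrete, here is a minimal hosted-C sketch of the static volatile global variant. It is not edk2 code: uintptr_t stands in for UINTN, and PauseStandIn() and RunGlobalDeadLoopDemo() are hypothetical names invented for illustration. The stand-in writes the global after a chosen number of spins, which plays the role of a debugger poking memory, so the loop can be exercised without hanging.

```c
#include <stdint.h>

/* Hypothetical stand-in for the global under discussion; edk2 would use UINTN. */
static volatile uintptr_t mDeadLoopCount = 0;

/* Bookkeeping for the demo only. */
static unsigned mPauseCalls   = 0;
static unsigned mReleaseAfter = 0;

/* Hypothetical replacement for CpuPause(): on the mReleaseAfter-th call it
   writes the global, playing the role of a debugger changing memory. */
static void PauseStandIn(void) {
  if (++mPauseCalls == mReleaseAfter) {
    mDeadLoopCount = 1;
  }
}

/* Same shape as the loop in the thread: the volatile global is re-read on
   every iteration and compared against zero. */
static void DeadLoopOnGlobal(void) {
  while (mDeadLoopCount == 0) {
    PauseStandIn();
  }
}

/* Runs the loop and returns how many pauses happened before release.
   ReleaseAfter must be at least 1. */
unsigned RunGlobalDeadLoopDemo(unsigned ReleaseAfter) {
  mDeadLoopCount = 0;
  mPauseCalls    = 0;
  mReleaseAfter  = ReleaseAfter;
  DeadLoopOnGlobal();
  return mPauseCalls;
}
```

Calling RunGlobalDeadLoopDemo(3) spins three times and then returns. In XIP firmware the write inside PauseStandIn() would not be possible, which is the limitation raised in the reply above.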



Mike



From: Ni, Ray <ray.ni@intel.com>
Sent: Monday, October 30, 2023 7:52 PM
To: Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



It's been a while.



Is there any better solution? Can we go with assembly solution?



Thanks,

Ray

________________________________

From: Andrew (EFI) Fish <afish@apple.com>
Sent: Saturday, May 20, 2023 12:31 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>
Cc: Ni, Ray <ray.ni@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



I don't think the atomic is going to help. The compiler honored the volatile by doing a read, but assumed it would never change due to scoping. As you can see in my example, if the compiler thinks DeadLoopCount can be changed it will put the check back in and assume the function can return. So an assembly function that does nothing called IncreaseScope() would fix this issue too.





void IncreaseScope(int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;

  while(DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;

  while(DeadLoopCount == 0);
}





Gives us:





voltbl  SEGMENT
voltbl  ENDS
voltbl  SEGMENT
voltbl  ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC                     ; COMDAT
$LN12:
        sub     rsp, 40                 ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40                 ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC                        ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP







Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>




Thanks,



Andrew Fish



PS I'm still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies the optimizations, and changing the order can change the behavior.
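
As a rough illustration of the IncreaseScope() idea, the following hosted-C sketch lets the address of the volatile local escape to a helper on every iteration. The helper here is a hypothetical stand-in written in C rather than the empty assembly function proposed in the message, and it takes a volatile int * (instead of int *) so that passing &DeadLoopCount does not discard the qualifier. It writes through the pointer on a chosen call, which is what a developer would do by hand from a debugger.

```c
/* Bookkeeping for the demo only. */
static unsigned mEscapeCalls = 0;
static unsigned mWriteOnCall = 0;

/* Hypothetical stand-in for the do-nothing assembly helper. Because the
   loop hands it the address of the local, the compiler has to assume the
   local may change across the call. Here it really does change on the
   mWriteOnCall-th call. */
static void IncreaseScopeStandIn(volatile int *Ptr) {
  if (++mEscapeCalls == mWriteOnCall) {
    *Ptr = 1;
  }
}

/* Same shape as CpuDeadLoopFix(): spin on a volatile local whose address
   escapes. Returns the number of helper calls made before the loop exited.
   WriteOnCall must be at least 1. */
unsigned RunEscapedLocalDemo(unsigned WriteOnCall) {
  volatile int DeadLoopCount = 0;

  mEscapeCalls = 0;
  mWriteOnCall = WriteOnCall;
  while (DeadLoopCount == 0) {
    IncreaseScopeStandIn(&DeadLoopCount);
  }
  return mEscapeCalls;
}
```

RunEscapedLocalDemo(5) leaves the loop after the fifth helper call. In the real proposal the helper would do nothing at all; only the escaping address matters, since that is what kept the compare and branch in the CpuDeadLoopFix listing.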





On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com> wrote:



Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.
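
For context, sig_atomic_t is the ISO C type intended for flags shared with a signal handler. The sketch below shows that conventional use at file scope, which is a different situation from the volatile local in CpuDeadLoop(). It is an illustration only, assuming a hosted C environment; SpinUntilSignal() and OnSignal() are hypothetical names, and firmware would have no signal() or raise().

```c
#include <signal.h>

/* Flag written by the handler and polled by the loop. */
static volatile sig_atomic_t mRelease = 0;

/* Signal handler: the only thing it does is set the flag. */
static void OnSignal(int Signal) {
  (void)Signal;
  mRelease = 1;
}

/* Spins until the handler sets the flag. To keep the demo finite, the loop
   raises SIGINT itself on the RaiseOnSpin-th iteration; raise() returns
   after the handler has run. Returns the number of spins performed.
   RaiseOnSpin must be at least 1. */
unsigned SpinUntilSignal(unsigned RaiseOnSpin) {
  unsigned Spins = 0;

  mRelease = 0;
  signal(SIGINT, OnSignal);
  while (mRelease == 0) {
    if (++Spins == RaiseOnSpin) {
      raise(SIGINT);
    }
  }
  signal(SIGINT, SIG_DFL);  /* restore the default disposition */
  return Spins;
}
```

SpinUntilSignal(4) returns after four spins. This says nothing about what MSVC does with a local of that type; as reported above, switching the local to volatile sig_atomic_t did not change the generated code.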


--

Rebecca Cran


On 5/18/23 20:53, Ni, Ray wrote:

I think all the options we considered are workarounds. These might break again if the compiler gets "cleverer" in the future, unless some Cxx spec clearly guarantees the behavior.

I like Mike's idea to use assembly implementation for CpuDeadLoop. The assembly can simply "jmp $" then "ret".

I didn't find a dead-loop intrinsic function in MSVC.

Any better idea?

Thanks,

Ray

*From:* Andrew (EFI) Fish <afish@apple.com>
*Sent:* Friday, May 19, 2023 8:42 AM
*To:* devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
*Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
*Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Sorry static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it complaining about stuff to the compiler team on our internal clang Slack channel as they use it to answer my questions.

Thanks,

Andrew Fish



   On May 18, 2023, at 2:42 PM, Michael D Kinney
   <michael.d.kinney@intel.com> wrote:

   Using that tool, the following fragment seems to generate the
   right code. Volatile is required.  Static is optional.

   static volatile int mDeadLoopCount = 0;

   void
   CpuDeadLoop(
     void
     )
   {
     while (mDeadLoopCount == 0);
   }

   GCC
   ===

   CpuDeadLoop():
   .L2:
           mov     eax, DWORD PTR mDeadLoopCount[rip]
           test    eax, eax
           je      .L2
           ret

   CLANG
   =====

   CpuDeadLoop():                          # @CpuDeadLoop()
   .LBB0_1:                                # =>This Inner Loop Header: Depth=1
           cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
           je      .LBB0_1
           ret

   Mike

   *From:* Andrew (EFI) Fish <afish@apple.com>
   *Sent:* Thursday, May 18, 2023 1:45 PM
   *To:* edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish
   <afish@apple.com>
   *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray
   <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
   *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

   Whoops wrong compiler. Here is an update. I added the flags so
   this one reproduces the issue.

   Compiler Explorer
   <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


   Thanks,

   Andrew Fish




       On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
       <afish=apple.com@groups.io> wrote:

       Mike,

       This is a good way to play around with fixes, and to report
       bugs. You can see the assembler for different compilers with
       different flags.

       Compiler Explorer
       <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


       Sorry I'm traveling and in Cupertino with lots of meetings so
       I did not have time to adjust the compiler flags....

       Thanks,

       Andrew Fish




           On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
           <afish@apple.com> wrote:

           Mike,

           I guess my other question... If this turns out to be a
           compiler bug should we scope the change to the broken
           toolchain. I'm not sure what the right answer is for that,
           but I want to ask the question?

           Thanks,

           Andrew Fish




               On May 18, 2023, at 10:19 AM, Michael D Kinney
               <michael.d.kinney@intel.com> wrote:

               Andrew,

               This might work for XIP.  Set non const global to
               initial value that is expected value to stay in dead loop.

               UINTN  mDeadLoopCount = 0;

               VOID
               CpuDeadLoop (
                 VOID
                 )
               {
                 while (mDeadLoopCount == 0) {
                   CpuPause ();
                 }
               }

               When deadloop is entered, developer can not change
               value of mDeadLoopCount, but they can use debugger to
               force exit loop and return from function.

               Mike

               *From:* Andrew (EFI) Fish <afish@apple.com>
               *Sent:* Thursday, May 18, 2023 10:09 AM
               *To:* Kinney, Michael D <michael.d.kinney@intel.com>
               *Cc:* edk2-devel-groups-io <devel@edk2.groups.io>; Ni,
               Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
               *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized
               by compiler

               Mike,

               Good point, that is why we are using the stack ....

               The only other thing I can think of is to pass the
               address of Index to some inline assembler, or an asm
               no op function, to give it a side effect the compiler
               can't resolve.

               Thanks,

               Andrew Fish





                   On May 18, 2023, at 10:05 AM, Kinney, Michael D
                   <michael.d.kinney@intel.com> wrote:

                   Static global will not work for XIP

                   Mike

                   *From:* Andrew (EFI) Fish <afish@apple.com>
                   *Sent:* Thursday, May 18, 2023 9:49 AM
                   *To:* edk2-devel-groups-io <devel@edk2.groups.io>;
                   Kinney, Michael D <michael.d.kinney@intel.com>
                   *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran
                   <rebecca@bsdio.com>
                   *Subject:* Re: [edk2-devel] CpuDeadLoop() is
                   optimized by compiler

                   Mike,

                   I pinged some compiler experts to see if our code
                   is correct, or if the compiler has an issue. Seems
                   to be trending compiler issue right now, but I've
                   NOT gotten feedback from anyone on the spec
                   committee yet.

                   If we move Index to a static global that would
                   likely work around the compiler issue.

                   Thanks,

                   Andrew Fish






                       On May 18, 2023, at 8:36 AM, Michael D Kinney
                       <michael.d.kinney@intel.com> wrote:

                       Hi Ray,

                       So the code generated does deadloop, but is
                       just not easy to resume from as we have been
                       able to do in the past.

                       We use CpuDeadloop() for 2 purposes.  One is a
                       terminal condition with no reason to ever
                       continue.

                       The 2nd is a debug aid for developers to
                       halt the system at a specific location and
                       then continue from that point, usually with a
                       debugger, to step through code to an area to
                       evaluate unexpected behavior.

                       We may have to do a NASM implementation of
                       CpuDeadloop() to make sure it meets both use
                       cases.

                       Mike

                       *From:* Ni, Ray <ray.ni@intel.com>
                       *Sent:* Thursday, May 18, 2023 3:00 AM
                       *To:* devel@edk2.groups.io
                       *Cc:* Kinney, Michael D
                       <michael.d.kinney@intel.com>; Rebecca Cran
                       <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
                       *Subject:* CpuDeadLoop() is optimized by compiler

                       Hi,

                       Starting from certain version of Visual Studio
                       C compiler (I don't have the exact version. I
                       am using VS2019), CpuDeadLoop is now optimized
                       quite well by compiler.

                       The optimization is so "good" that it becomes
                       harder for developers to break out of the
                       deadloop.

                       I copied the assembly instructions as below
                       for your reference.

                       The compiler does not generate instructions
                       that jump out of the loop when the Index is
                       not zero.

                       So in order to break out of the loop,
                       developers need to:

                        1. Manually adjust rsp by increasing 40
                        2. Manually "ret"

                       I am not sure if anyone has interest to
                       re-write this function so that compiler can be
                       "fooled" again.

                       Thanks,
                       Ray

                       =======================

                       ; Function compile flags: /Ogspy
                       ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
                       ; COMDAT CpuDeadLoop
                       _TEXT SEGMENT
                       Index$ = 48
                       CpuDeadLoop PROC ; COMDAT

                       ; 26   : {

                       $LN12:
                         00000  48 83 ec 28                 sub   rsp, 40 ; 00000028H

                       ; 27   :   volatile UINTN  Index;
                       ; 28   :
                       ; 29   :   for (Index = 0; Index == 0;) {

                         00004  48 c7 44 24 30 00 00 00 00  mov   QWORD PTR Index$[rsp], 0
                       $LN10@CpuDeadLoo:

                       ; 30   :     CpuPause ();

                         0000d  48 8b 44 24 30              mov   rax, QWORD PTR Index$[rsp]
                         00012  e8 00 00 00 00              call  CpuPause
                         00017  eb f4                       jmp   SHORT $LN10@CpuDeadLoo
                       CpuDeadLoop ENDP
                       _TEXT ENDS
                       END









-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110418): https://edk2.groups.io/g/devel/message/110418
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #1.2: Type: text/html, Size: 88440 bytes --]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2023-10-31 14:19                                 ` Michael D Kinney
@ 2024-06-05  1:07                                   ` Michael D Kinney
  2024-06-05 16:48                                     ` Oliver Smith-Denny
  0 siblings, 1 reply; 27+ messages in thread
From: Michael D Kinney @ 2024-06-05  1:07 UTC (permalink / raw)
  To: Ni, Ray, Andrew (EFI) Fish, edk2-devel-groups-io, Rebecca Cran,
	Hernandez Miramontes, Jose Miguel
  Cc: Kinney, Michael D


[-- Attachment #1.1: Type: text/plain, Size: 30205 bytes --]

Hi Ray,

I know this is an old topic, but I think I have a new idea that works for XIP code.  We can update the loop to compare a volatile global to a volatile local.  This forces 2 reads and a comparison on every loop iteration.  The local variable can be set to 1 to exit the loop without modifying the global variable.  I tried this with VS2019 with max optimization enabled, and I was able to exit the loop by setting Index to 1 in a debugger.

diff --git a/MdePkg/Library/BaseLib/CpuDeadLoop.c b/MdePkg/Library/BaseLib/CpuDeadLoop.c
index b3b7548fa5..393c4290ed 100644
--- a/MdePkg/Library/BaseLib/CpuDeadLoop.c
+++ b/MdePkg/Library/BaseLib/CpuDeadLoop.c
@@ -9,6 +9,8 @@
#include <Base.h>
#include <Library/BaseLib.h>

+static volatile UINTN  mDeadLoopComparator = 0;
+
/**
   Executes an infinite loop.

@@ -26,7 +28,7 @@ CpuDeadLoop (
{
   volatile UINTN  Index;

-  for (Index = 0; Index == 0;) {
+  for (Index = mDeadLoopComparator; Index == mDeadLoopComparator;) {
     CpuPause ();
   }
}

Mike
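
The behavior described above can be tried outside firmware with a small hosted C sketch. It is only an illustration under stated assumptions, not the BaseLib code: size_t replaces UINTN, the comparator is additionally marked const to model a global that sits in read-only XIP flash, and the hypothetical PAUSE_HOOK callback stands in for CpuPause() plus a developer editing Index from a debugger. Passing the address of Index to the hook is purely a test device and is not part of the patch.

```c
#include <stddef.h>

/* Assumption: const models a comparator that lives in read-only XIP flash. */
static const volatile size_t mDeadLoopComparator = 0;

/* Hypothetical hook type: stands in for CpuPause() plus an attached debugger. */
typedef void (*PAUSE_HOOK)(volatile size_t *Index);

/* Spins while the volatile local equals the volatile global.
   Returns the number of passes made before the loop was released. */
size_t DeadLoopSketch(PAUSE_HOOK Pause)
{
  volatile size_t Index;
  size_t Passes = 0;

  /* Both Index and mDeadLoopComparator are re-read on every iteration. */
  for (Index = mDeadLoopComparator; Index == mDeadLoopComparator;) {
    Pause(&Index);
    Passes++;
  }
  return Passes;
}

/* Hypothetical "debugger": sets the local to 1 on every third invocation. */
void ReleaseEveryThirdPause(volatile size_t *Index)
{
  static int Calls = 0;

  if (++Calls % 3 == 0) {
    *Index = 1;
  }
}
```

Calling DeadLoopSketch(ReleaseEveryThirdPause) leaves the loop on the third pass, and the global comparator is never written.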

From: Kinney, Michael D <michael.d.kinney@intel.com>
Sent: Tuesday, October 31, 2023 7:19 AM
To: Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: RE: [edk2-devel] CpuDeadLoop() is optimized by compiler

Right. But if you break in with debugger, you can still skip over the jmp instruction and continue.

I agree XIP does not allow variable value to be updated, but we would never want to do that or all future dead loops in non XIP code would not loop.

Mike

From: Ni, Ray <ray.ni@intel.com>
Sent: Tuesday, October 31, 2023 1:31 AM
To: Kinney, Michael D <michael.d.kinney@intel.com>; Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,
This is not friendly for XIP code. With XIP code, the global variable is not able to be updated as it sits in read-only SPI flash.

Thanks,
Ray
________________________________
From: Kinney, Michael D <michael.d.kinney@intel.com>
Sent: Tuesday, October 31, 2023 11:37 AM
To: Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: RE: [edk2-devel] CpuDeadLoop() is optimized by compiler


Does using a static volatile global instead of a volatile local work?



Mike



From: Ni, Ray <ray.ni@intel.com>
Sent: Monday, October 30, 2023 7:52 PM
To: Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Cc: Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



It's been a while.



Is there any better solution? Can we go with assembly solution?



Thanks,

Ray

________________________________

From: Andrew (EFI) Fish <afish@apple.com>
Sent: Saturday, May 20, 2023 12:31 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>
Cc: Ni, Ray <ray.ni@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



I don't think the atomic is going to help. The compiler honored the volatile by doing a read, but assumed it would never change due to scoping. As you can see in my example if the compiler thinks DeadLoopCount can be changed it will put the check back in and assume the function can return. So an assembly function that does nothing called IncreaseScope() would fix this issue too.

void IncreaseScope(int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;

  while(DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;

  while(DeadLoopCount == 0);
}





Gives us:





voltbl  SEGMENT
voltbl  ENDS
voltbl  SEGMENT
voltbl  ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC                     ; COMDAT
$LN12:
        sub     rsp, 40                 ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40                 ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC                        ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP
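
The difference between the two listings comes down to whether the address of the volatile local leaves the function. A hosted C sketch of that idea follows; it is an illustration under assumptions, not proposed BaseLib code. In the thread IncreaseScope() is meant to be a do-nothing assembly routine, whereas here it is a hypothetical C stand-in that releases the loop after a requested number of calls so the exit path can actually be exercised. The names CpuDeadLoopFixSketch, mPassesLeft and the ReleaseAfter parameter are invented for the example.

```c
/* Hypothetical countdown used only by this sketch. */
static int mPassesLeft;

/* Stand-in for the opaque helper: because it receives the address of the
   local, the compiler has to assume the local may change between reads.
   Here it really does change it once the countdown reaches zero. */
void IncreaseScope(volatile int *Ptr)
{
  if (--mPassesLeft == 0) {
    *Ptr = 1;
  }
}

/* Same loop shape as CpuDeadLoopFix(); returns the number of passes made.
   ReleaseAfter must be at least 1, otherwise the loop is never released. */
int CpuDeadLoopFixSketch(int ReleaseAfter)
{
  volatile int DeadLoopCount = 0;
  int Passes = 0;

  mPassesLeft = ReleaseAfter;
  while (DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
    Passes++;
  }
  return Passes;
}
```

With ReleaseAfter set to 5 the function returns after five passes, which is the exit path that is missing from the plain CpuDeadLoop listing.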







Compiler Explorer<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>




Thanks,



Andrew Fish



PS I'm still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies the optimizations, and changing the order can change the behavior.





On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com> wrote:



Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.


--

Rebecca Cran


On 5/18/23 20:53, Ni, Ray wrote:

I think all the options we considered are workarounds. These might break again if the compiler gets "cleverer" in the future, unless some Cxx spec clearly guarantees the behavior.

I like Mike's idea to use assembly implementation for CpuDeadLoop. The assembly can simply "jmp $" then "ret".

I didn't find a dead-loop intrinsic function in MSVC.

Any better idea?

Thanks,

Ray

*From:* Andrew (EFI) Fish <afish@apple.com>
*Sent:* Friday, May 19, 2023 8:42 AM
*To:* devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
*Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
*Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

Mike,

Sorry static was just to scope the name to the file since it is a lib, not to make it work.

That is a cool site. I learned about it complaining about stuff to the compiler team on our internal clang Slack channel as they use it to answer my questions.

Thanks,

Andrew Fish



   On May 18, 2023, at 2:42 PM, Michael D Kinney
   <michael.d.kinney@intel.com> wrote:

   Using that tool, the following fragment seems to generate the
   right code. Volatile is required.  Static is optional.

   static volatile int mDeadLoopCount = 0;

   void
   CpuDeadLoop (
     void
     )
   {
     while (mDeadLoopCount == 0);
   }

   GCC
   ===

   CpuDeadLoop():
   .L2:
           mov     eax, DWORD PTR mDeadLoopCount[rip]
           test    eax, eax
           je      .L2
           ret

   CLANG
   =====

   CpuDeadLoop():                          # @CpuDeadLoop()
   .LBB0_1:                                # =>This Inner Loop Header: Depth=1
           cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
           je      .LBB0_1
           ret

   Mike

   *From:* Andrew (EFI) Fish <afish@apple.com>
   *Sent:* Thursday, May 18, 2023 1:45 PM
   *To:* edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish
   <afish@apple.com>
   *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray
   <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
   *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

   Whoops wrong compiler. Here is an update. I added the flags so
   this one reproduces the issue.

   Compiler Explorer
   <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


   Thanks,

   Andrew Fish




       On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
       <afish=apple.com@groups.io> wrote:

       Mike,

       This is a good way to play around with fixes, and to report
       bugs. You can see the assembler for different compilers with
       different flags.

       Compiler Explorer
       <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>


       Sorry I'm traveling and in Cupertino with lots of meetings so
       I did not have time to adjust the compiler flags....

       Thanks,

       Andrew Fish




           On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
           <afish@apple.com> wrote:

           Mike,

           I guess my other question... If this turns out to be a
           compiler bug, should we scope the change to the broken
           toolchain? I'm not sure what the right answer is for that,
           but I want to ask the question.

           Thanks,

           Andrew Fish




               On May 18, 2023, at 10:19 AM, Michael D Kinney
               <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

               Andrew,

               This might work for XIP.  Set a non-const global to the
               initial value that keeps execution in the dead loop.

               UINTN  mDeadLoopCount = 0;

               VOID
               CpuDeadLoop (
                 VOID
                 )
               {
                 while (mDeadLoopCount == 0) {
                   CpuPause ();
                 }
               }

               When the dead loop is entered, the developer cannot change
               the value of mDeadLoopCount, but they can use a debugger to
               force an exit from the loop and return from the function.

               Mike

               *From:*Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
               *Sent:*Thursday, May 18, 2023 10:09 AM
               *To:*Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
               *Cc:*edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>; Ni,
               Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
               *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized
               by compiler

               Mike,

               Good point, that is why we are using the stack ....

               The only other thing I can think of is to pass the
               address of Index to some inline assembler, or an asm
               no op function, to give it a side effect the compiler
               can't resolve.

               Thanks,

               Andrew Fish





                   On May 18, 2023, at 10:05 AM, Kinney, Michael D
                   <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

                   Static global will not work for XIP

                   Mike

                   *From:*Andrew (EFI) Fish <afish@apple.com<mailto:afish@apple.com>>
                   *Sent:*Thursday, May 18, 2023 9:49 AM
                   *To:*edk2-devel-groups-io <devel@edk2.groups.io<mailto:devel@edk2.groups.io>>;
                   Kinney, Michael D <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>
                   *Cc:*Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>; Rebecca Cran
                   <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>
                   *Subject:*Re: [edk2-devel] CpuDeadLoop() is
                   optimized by compiler

                   Mike,

                   I pinged some compiler experts to see if our code
                   is correct, or if the compiler has an issue. Seems
                   to be trending compiler issue right now, but I've
                   NOT gotten feedback from anyone on the spec
                   committee yet.

                   If we move Index to a static global that would
                   likely work around the compiler issue.

                   Thanks,

                   Andrew Fish






                       On May 18, 2023, at 8:36 AM, Michael D Kinney
                       <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>> wrote:

                       Hi Ray,

                       So the code generated does deadloop, but is
                       just not easy to resume from as we have been
                       able to do in the past.

                       We use CpuDeadLoop() for two purposes.  One is a
                       terminal condition with no reason to ever
                       continue.

                       The second is a debug aid for developers to
                       halt the system at a specific location and
                       then continue from that point, usually with a
                       debugger, to step through code to an area to
                       evaluate unexpected behavior.

                       We may have to do a NASM implementation of
                       CpuDeadLoop() to make sure it meets both use
                       cases.

                       Mike

                       *From:*Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
                       *Sent:*Thursday, May 18, 2023 3:00 AM
                       *To:*devel@edk2.groups.io<mailto:devel@edk2.groups.io>
                       *Cc:*Kinney, Michael D
                       <michael.d.kinney@intel.com<mailto:michael.d.kinney@intel.com>>; Rebecca Cran
                       <rebecca@bsdio.com<mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
                       *Subject:*CpuDeadLoop() is optimized by compiler

                       Hi,

                       Starting from certain version of Visual Studio
                       C compiler (I don't have the exact version. I
                       am using VS2019), CpuDeadLoop is now optimized
                       quite well by compiler.

                       The optimization is so "good" that it becomes
                       harder for developers to break out of the
                       deadloop.

                       I copied the assembly instructions as below
                       for your reference.

                       The compiler does not generate instructions
                       that jump out of the loop when the Index is
                       not zero.

                       So in order to break out of the loop,
                       developers need to:

                        1. Manually adjust rsp by increasing 40
                        2. Manually "ret"

                       I am not sure if anyone has interest to
                       re-write this function so that compiler can be
                       "fooled" again.

                       Thanks,
                       Ray

                       =======================
                       ; Function compile flags: /Ogspy
                       ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
                       ; COMDAT CpuDeadLoop
                       _TEXT SEGMENT
                       Index$ = 48
                       CpuDeadLoop PROC ; COMDAT

                       ; 26   : {

                       $LN12:
                         00000  48 83 ec 28     sub   rsp, 40 ; 00000028H

                       ; 27   :   volatile UINTN  Index;
                       ; 28   :
                       ; 29   :   for (Index = 0; Index == 0;) {

                         00004  48 c7 44 24 30
                                00 00 00 00     mov   QWORD PTR Index$[rsp], 0
                       $LN10@CpuDeadLoo:

                       ; 30   :     CpuPause ();

                         0000d  48 8b 44 24 30  mov   rax, QWORD PTR Index$[rsp]
                         00012  e8 00 00 00 00  call  CpuPause
                         00017  eb f4           jmp   SHORT $LN10@CpuDeadLoo
                       CpuDeadLoop ENDP
                       _TEXT ENDS
                       END









-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#119456): https://edk2.groups.io/g/devel/message/119456
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #1.2: Type: text/html, Size: 94051 bytes --]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 12765 bytes --]

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2024-06-05  1:07                                   ` Michael D Kinney
@ 2024-06-05 16:48                                     ` Oliver Smith-Denny
  2024-06-07 16:57                                       ` Hernandez Miramontes, Jose Miguel
  0 siblings, 1 reply; 27+ messages in thread
From: Oliver Smith-Denny @ 2024-06-05 16:48 UTC (permalink / raw)
  To: devel, michael.d.kinney, Ni, Ray, Andrew (EFI) Fish, Rebecca Cran,
	Hernandez Miramontes, Jose Miguel

The other option, which I had started working on before getting
sidetracked with other things, would be to implement
CpuDeadLoop in assembly, so we don't have to worry about
random compiler optimizations in the future.

This certainly looks like an easy change for the current
issue, though.
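
For anyone who wants to poke at the loop shape from the patch quoted
below without booting firmware, here is a minimal host-side sketch. It
is only an illustration under stated assumptions: CpuPauseStub,
gDebuggerTarget, gPauseCalls and CpuDeadLoopSketch are hypothetical names
made up for this example (not edk2 APIs), uintptr_t stands in for UINTN,
and the stub "debugger" writes Index through a pointer. Publishing the
address of Index also widens its scope for the optimizer, much like the
IncreaseScope() idea discussed further down the thread, so this does not
reproduce the MSVC behavior; it only shows the intended exit path.

```c
#include <stddef.h>
#include <stdint.h>

/* All names in this sketch are hypothetical stand-ins, not edk2 APIs. */

/* Never written after initialization, so it could live in read-only XIP flash. */
static volatile uintptr_t  mDeadLoopComparator = 0;

/* Stands in for a debugger: address of the stack local it will overwrite. */
static volatile uintptr_t  *gDebuggerTarget = NULL;
static unsigned            gPauseCalls      = 0;

/* Host-side replacement for CpuPause(): on the third call, "poke" Index. */
static void CpuPauseStub(void)
{
  gPauseCalls++;
  if (gPauseCalls == 3 && gDebuggerTarget != NULL) {
    *gDebuggerTarget = 1;
  }
}

/* Same loop shape as the proposed patch; returns the pause count at exit. */
unsigned CpuDeadLoopSketch(void)
{
  volatile uintptr_t  Index;

  gPauseCalls     = 0;
  gDebuggerTarget = &Index;

  /* Both operands are volatile, so each iteration reads both and compares. */
  for (Index = mDeadLoopComparator; Index == mDeadLoopComparator;) {
    CpuPauseStub();
  }

  gDebuggerTarget = NULL;
  return gPauseCalls;
}
```

Calling CpuDeadLoopSketch() from a test program should return once the
stub has changed Index, while mDeadLoopComparator itself is never
written, which is the property that matters for XIP.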

Thanks,
Oliver

On 6/4/2024 6:07 PM, Michael D Kinney wrote:
> Hi Ray,
> 
> I know this is an old topic, but I think I have a new idea that works
> for XIP code.  We can update the loop to compare a volatile global to a
> volatile local.  This forces two reads and a comparison on every loop
> iteration.  The local variable can be set to 1 to exit the loop without
> modifying the global variable.  I tried this with VS2019 with max
> optimization enabled and was able to exit the loop by setting Index to 1
> in a debugger.
> 
> diff --git a/MdePkg/Library/BaseLib/CpuDeadLoop.c b/MdePkg/Library/BaseLib/CpuDeadLoop.c
> index b3b7548fa5..393c4290ed 100644
> --- a/MdePkg/Library/BaseLib/CpuDeadLoop.c
> +++ b/MdePkg/Library/BaseLib/CpuDeadLoop.c
> @@ -9,6 +9,8 @@
>  #include <Base.h>
>  #include <Library/BaseLib.h>
>
> +static volatile UINTN  mDeadLoopComparator = 0;
> +
>  /**
>    Executes an infinite loop.
>
> @@ -26,7 +28,7 @@ CpuDeadLoop (
>  {
>    volatile UINTN  Index;
>
> -  for (Index = 0; Index == 0;) {
> +  for (Index = mDeadLoopComparator; Index == mDeadLoopComparator;) {
>      CpuPause ();
>    }
>  }
> 
> Mike
> 
> *From:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Sent:* Tuesday, October 31, 2023 7:19 AM
> *To:* Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>; 
> edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran 
> <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel 
> <jose.miguel.hernandez.miramontes@intel.com>
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Subject:* RE: [edk2-devel] CpuDeadLoop() is optimized by compiler
> 
> Right. But if you break in with debugger, you can still skip over the 
> jmp instruction and continue.
> 
> I agree XIP does not allow variable value to be updated, but we would 
> never want to do that or all future dead loops in non XIP code would not 
> loop.
> 
> Mike
> 
> *From:* Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>
> *Sent:* Tuesday, October 31, 2023 1:31 AM
> *To:* Kinney, Michael D <michael.d.kinney@intel.com 
> <mailto:michael.d.kinney@intel.com>>; Andrew (EFI) Fish <afish@apple.com 
> <mailto:afish@apple.com>>; edk2-devel-groups-io <devel@edk2.groups.io 
> <mailto:devel@edk2.groups.io>>; Rebecca Cran <rebecca@bsdio.com 
> <mailto:rebecca@bsdio.com>>; Hernandez Miramontes, Jose Miguel 
> <jose.miguel.hernandez.miramontes@intel.com 
> <mailto:jose.miguel.hernandez.miramontes@intel.com>>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
> 
> Mike,
> 
> This is not friendly for XIP code. With XIP code, the global variable is 
> not able to be updated as it sits in read-only SPI flash.
> 
> Thanks,
> 
> Ray
> 
> ------------------------------------------------------------------------
> 
> *From:*Kinney, Michael D <michael.d.kinney@intel.com 
> <mailto:michael.d.kinney@intel.com>>
> *Sent:* Tuesday, October 31, 2023 11:37 AM
> *To:* Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Andrew (EFI) 
> Fish <afish@apple.com <mailto:afish@apple.com>>; edk2-devel-groups-io 
> <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>; Rebecca Cran 
> <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>; Hernandez Miramontes, 
> Jose Miguel <jose.miguel.hernandez.miramontes@intel.com 
> <mailto:jose.miguel.hernandez.miramontes@intel.com>>
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com 
> <mailto:michael.d.kinney@intel.com>>
> *Subject:* RE: [edk2-devel] CpuDeadLoop() is optimized by compiler
> 
> Does using a static volatile global instead of a volatile local work?
> 
> Mike
> 
> *From:* Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>
> *Sent:* Monday, October 30, 2023 7:52 PM
> *To:* Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>>; 
> edk2-devel-groups-io <devel@edk2.groups.io 
> <mailto:devel@edk2.groups.io>>; Rebecca Cran <rebecca@bsdio.com 
> <mailto:rebecca@bsdio.com>>; Hernandez Miramontes, Jose Miguel 
> <jose.miguel.hernandez.miramontes@intel.com 
> <mailto:jose.miguel.hernandez.miramontes@intel.com>>
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com 
> <mailto:michael.d.kinney@intel.com>>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
> 
> It's been a while.
> 
> Is there any better solution? Can we go with assembly solution?
> 
> Thanks,
> 
> Ray
> 
> ------------------------------------------------------------------------
> 
> *From:*Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>>
> *Sent:* Saturday, May 20, 2023 12:31 AM
> *To:* edk2-devel-groups-io <devel@edk2.groups.io 
> <mailto:devel@edk2.groups.io>>; Rebecca Cran <rebecca@bsdio.com 
> <mailto:rebecca@bsdio.com>>
> *Cc:* Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Kinney, 
> Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
> 
> I don’t think the atomic is going to help. The compiler honored the
> volatile by doing a read, but assumed the value would never change due to
> scoping. As you can see in my example, if the compiler thinks
> DeadLoopCount can be changed it will put the check back in and assume
> the function can return. So an assembly function that does nothing,
> called IncreaseScope(), would fix this issue too.
> 
> void IncreaseScope(int *ptr);
>
> void CpuDeadLoopFix(void) {
>   volatile int DeadLoopCount = 0;
>   while (DeadLoopCount == 0) {
>     IncreaseScope(&DeadLoopCount);
>   }
> }
>
> void CpuDeadLoop(void) {
>   volatile int DeadLoopCount = 0;
>   while (DeadLoopCount == 0);
> }
> 
> Gives us:
>
> voltbl  SEGMENT
> voltbl  ENDS
> voltbl  SEGMENT
> voltbl  ENDS
> DeadLoopCount$ = 48
> CpuDeadLoopFix PROC                     ; COMDAT
> $LN12:
>         sub     rsp, 40                 ; 00000028H
>         mov     DWORD PTR DeadLoopCount$[rsp], 0
>         jmp     SHORT $LN10@CpuDeadLoo
> $LL2@CpuDeadLoo:
>         lea     rcx, QWORD PTR DeadLoopCount$[rsp]
>         call    IncreaseScope
> $LN10@CpuDeadLoo:
>         mov     eax, DWORD PTR DeadLoopCount$[rsp]
>         test    eax, eax
>         je      SHORT $LL2@CpuDeadLoo
>         add     rsp, 40                 ; 00000028H
>         ret     0
> CpuDeadLoopFix ENDP
>
> DeadLoopCount$ = 8
> CpuDeadLoop PROC                        ; COMDAT
>         mov     DWORD PTR DeadLoopCount$[rsp], 0
> $LL2@CpuDeadLoo:
>         mov     eax, DWORD PTR DeadLoopCount$[rsp]
>         jmp     SHORT $LL2@CpuDeadLoo
> CpuDeadLoop ENDP
> 
> Compiler Explorer 
> <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
> 
> 
> Thanks,
> 
> Andrew Fish
> 
> PS I’m still not 100% sure it is a compiler bug. Sometimes things like
> this are due to the order in which the compiler applies the optimizations,
> and changing the order can change the behavior.
> 
>     On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com
>     <mailto:rebecca@bsdio.com>> wrote:
> 
>     Just to add more data, I also tried with "volatile sig_atomic_t" as
>     someone suggested and both "/volatile:iso" and "/volatile:ms" with
>     no change in results.
> 
> 
>     --
> 
>     Rebecca Cran
> 
> 
>     On 5/18/23 20:53, Ni, Ray wrote:
> 
> 
>         I think all the options we considered are workarounds. They
>         might break again if the compiler gets “cleverer” in the future,
>         unless some Cxx spec clearly guarantees the behavior.
>
>         I like Mike’s idea of using an assembly implementation for
>         CpuDeadLoop. The assembly can simply be “jmp $” followed by “ret”.
> 
>         I didn’t find a dead-loop intrinsic function in MSVC.
> 
>         Any better idea?
> 
>         Thanks,
> 
>         Ray
> 
>         *From:* Andrew (EFI) Fish <afish@apple.com <mailto:afish@apple.com>>
>         *Sent:* Friday, May 19, 2023 8:42 AM
>         *To:* devel@edk2.groups.io <mailto:devel@edk2.groups.io>;
>         Kinney, Michael D <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>>
>         *Cc:* Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>;
>         Rebecca Cran <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>         *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
> 
>         Mike,
> 
>         Sorry static was just to scope the name to the file since it is
>         a lib, not to make it work.
> 
>         That is a cool site. I learned about it complaining about stuff
>         to the compiler team on our internal clang Slack channel as they
>         use it to answer my questions.
> 
>         Thanks,
> 
>         Andrew Fish
> 
> 
> 
>             On May 18, 2023, at 2:42 PM, Michael D Kinney
>             <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>> wrote:
> 
>             Using that tool, the following fragment seems to generate the
>             right code. Volatile is required.  Static is optional.
> 
>             static volatile int mDeadLoopCount = 0;
>
>             void
>             CpuDeadLoop (
>               void
>               )
>             {
>               while (mDeadLoopCount == 0);
>             }
>
>             GCC
>             ===
>             CpuDeadLoop():
>             .L2:
>                     mov     eax, DWORD PTR mDeadLoopCount[rip]
>                     test    eax, eax
>                     je      .L2
>                     ret
>
>             CLANG
>             =====
>             CpuDeadLoop():                    # @CpuDeadLoop()
>             .LBB0_1:                          # =>This Inner Loop Header: Depth=1
>                     cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
>                     je      .LBB0_1
>                     ret
> 
>             Mike
> 
>             *From:*Andrew (EFI) Fish <afish@apple.com
>         <mailto:afish@apple.com>>
>             *Sent:*Thursday, May 18, 2023 1:45 PM
>             *To:*edk2-devel-groups-io <devel@edk2.groups.io
>         <mailto:devel@edk2.groups.io>>; Andrew Fish
>             <afish@apple.com <mailto:afish@apple.com>>
>             *Cc:*Kinney, Michael D <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>>; Ni, Ray
>             <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Rebecca Cran
>         <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>             *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized by
>         compiler
> 
>             Whoops wrong compiler. Here is an update. I added the flags so
>             this one reproduces the issue.
> 
>             Compiler Explorer
>            
>         <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA 
<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>>
> 
> 
>             Thanks,
> 
>             Andrew Fish
> 
> 
> 
> 
>                 On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
>                 <afish=apple.com@groups.io> wrote:
> 
>                 Mike,
> 
>                 This is a good way to play around with fixes, and to report
>                 bugs. You can see the assembler output for different
>                 compilers with different flags.
> 
>                 Compiler Explorer
>                
>         <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA 
<https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>>
> 
> 
>                 Sorry, I’m traveling and in Cupertino with lots of meetings, so
>                 I did not have time to adjust the compiler flags.
> 
>                 Thanks,
> 
>                 Andrew Fish
> 
> 
> 
> 
>                     On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
>                     <afish@apple.com <mailto:afish@apple.com>> wrote:
> 
>                     Mike,
> 
>                     I guess my other question: if this turns out to be a
>                     compiler bug, should we scope the change to the broken
>                     toolchain? I’m not sure what the right answer is for that,
>                     but I want to ask the question.
> 
>                     Thanks,
> 
>                     Andrew Fish
> 
> 
> 
> 
>                         On May 18, 2023, at 10:19 AM, Michael D Kinney
>                         <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>> wrote:
> 
>                         Andrew,
> 
>                         This might work for XIP.  Set a non-const global to
>                         the initial value that keeps execution in the dead loop.
> 
>                         UINTN  mDeadLoopCount = 0;
> 
>                         VOID
>                         CpuDeadLoop (
>                           VOID
>                           )
>                         {
>                           while (mDeadLoopCount == 0) {
>                             CpuPause ();
>                           }
>                         }
> 
>                         When the dead loop is entered, the developer cannot change
>                         the value of mDeadLoopCount, but they can use a debugger to
>                         force an exit from the loop and return from the function.
> 
>                         Mike
> 
>                         *From:*Andrew (EFI) Fish <afish@apple.com
>         <mailto:afish@apple.com>>
>                         *Sent:*Thursday, May 18, 2023 10:09 AM
>                         *To:*Kinney, Michael D
>         <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
>                         *Cc:*edk2-devel-groups-io <devel@edk2.groups.io
>         <mailto:devel@edk2.groups.io>>; Ni,
>                         Ray <ray.ni@intel.com
>         <mailto:ray.ni@intel.com>>; Rebecca Cran <rebecca@bsdio.com
>         <mailto:rebecca@bsdio.com>>
>                         *Subject:*Re: [edk2-devel] CpuDeadLoop() is
>         optimized
>                         by compiler
> 
>                         Mike,
> 
>                         Good point, that is why we are using the stack ….
> 
>                         The only other thing I can think of is to pass the
>                         address of Index to some inline assembler, or an asm
>                         no op function, to give it a side effect the
>         compiler
>                         can’t resolve.
> 
>                         Thanks,
> 
>                         Andrew Fish
> 
> 
> 
> 
> 
>                             On May 18, 2023, at 10:05 AM, Kinney, Michael D
>                             <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>> wrote:
> 
>                             Static global will not work for XIP
> 
>                             Mike
> 
>                             *From:*Andrew (EFI) Fish <afish@apple.com
>         <mailto:afish@apple.com>>
>                             *Sent:*Thursday, May 18, 2023 9:49 AM
>                             *To:*edk2-devel-groups-io
>         <devel@edk2.groups.io <mailto:devel@edk2.groups.io>>;
>                             Kinney, Michael D
>         <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
>                             *Cc:*Ni, Ray <ray.ni@intel.com
>         <mailto:ray.ni@intel.com>>; Rebecca Cran
>                             <rebecca@bsdio.com <mailto:rebecca@bsdio.com>>
>                             *Subject:*Re: [edk2-devel] CpuDeadLoop() is
>                             optimized by compiler
> 
>                             Mike,
> 
>                             I pinged some compiler experts to see if our code
>                             is correct, or if the compiler has an issue. It seems
>                             to be trending toward a compiler issue right now, but
>                             I’ve NOT gotten feedback from anyone on the spec
>                             committee yet.
> 
>                             If we move Index to a static global that would
>                             likely work around the compiler issue.
> 
>                             Thanks,
> 
>                             Andrew Fish
> 
> 
> 
> 
> 
> 
>                                 On May 18, 2023, at 8:36 AM, Michael D
>         Kinney
>                                 <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>> wrote:
> 
>                                 Hi Ray,
> 
>                                 So the code generated does deadloop, but is
>                                 just not easy to resume from as we have been
>                                 able to do in the past.
> 
>                                 We use CpuDeadLoop() for 2 purposes.  One is a
>                                 terminal condition with no reason to ever
>                                 continue.
> 
>                                 The 2nd is a debug aid for developers to
>                                 halt the system at a specific location and
>                                 then continue from that point, usually with a
>                                 debugger, to step through code to an area to
>                                 evaluate unexpected behavior.
> 
>                                 We may have to do a NASM implementation of
>                                 CpuDeadLoop() to make sure it meets both use
>                                 cases.
> 
>                                 Mike
> 
>                                 *From:*Ni, Ray <ray.ni@intel.com
>         <mailto:ray.ni@intel.com>>
>                                 *Sent:*Thursday, May 18, 2023 3:00 AM
>                                 *To:*devel@edk2.groups.io
>         <mailto:devel@edk2.groups.io>
>                                 *Cc:*Kinney, Michael D
>                                 <michael.d.kinney@intel.com
>         <mailto:michael.d.kinney@intel.com>>; Rebecca Cran
>                                 <rebecca@bsdio.com
>         <mailto:rebecca@bsdio.com>>; Ni, Ray <ray.ni@intel.com
>         <mailto:ray.ni@intel.com>>
>                                 *Subject:*CpuDeadLoop() is optimized by
>         compiler
> 
>                                 Hi,
> 
>                                 Starting from a certain version of the Visual Studio
>                                 C compiler (I don’t have the exact version; I
>                                 am using VS2019), CpuDeadLoop is now optimized
>                                 quite well by the compiler.
> 
>                                 The optimization is so “good” that it becomes
>                                 harder for developers to break out of the
>                                 deadloop.
> 
>                                 I copied the assembly instructions as below
>                                 for your reference.
> 
>                                 The compiler does not generate instructions
>                                 that jump out of the loop when the Index is
>                                 not zero.
> 
>                                 So in order to break out of the loop,
>                                 developers need to:
> 
>                                  1. Manually adjust rsp by increasing 40
>                                  2. Manually “ret”
> 
>                                 I am not sure if anyone has interest to
>                                 re-write this function so that the compiler can be
>                                 “fooled” again.
> 
>                                 Thanks,
>                                 Ray
> 
>                                 =======================
> 
>                                 ; Function compile flags: /Ogspy
>                                 ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>                                 ; COMDAT CpuDeadLoop
>                                 _TEXT SEGMENT
>                                 Index$ = 48
>                                 CpuDeadLoop PROC ; COMDAT
>                                 ; 26   : {
>                                 $LN12:
>                                   00000  48 83 ec 28        sub   rsp, 40 ; 00000028H
>                                 ; 27   :   volatile UINTN  Index;
>                                 ; 28   :
>                                 ; 29   :   for (Index = 0; Index == 0;) {
>                                   00004  48 c7 44 24 30
>                                          00 00 00 00        mov   QWORD PTR Index$[rsp], 0
>                                 $LN10@CpuDeadLoo:
>                                 ; 30   :     CpuPause ();
>                                   0000d  48 8b 44 24 30     mov   rax, QWORD PTR Index$[rsp]
>                                   00012  e8 00 00 00 00     call  CpuPause
>                                   00017  eb f4              jmp   SHORT $LN10@CpuDeadLoo
>                                 CpuDeadLoop ENDP
>                                 _TEXT ENDS
>                                 END
> 
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#119479): https://edk2.groups.io/g/devel/message/119479
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
  2024-06-05 16:48                                     ` Oliver Smith-Denny
@ 2024-06-07 16:57                                       ` Hernandez Miramontes, Jose Miguel
  0 siblings, 0 replies; 27+ messages in thread
From: Hernandez Miramontes, Jose Miguel @ 2024-06-07 16:57 UTC (permalink / raw)
  To: Oliver Smith-Denny, devel@edk2.groups.io, Kinney, Michael D,
	Ni, Ray, Andrew (EFI) Fish, Rebecca Cran


[-- Attachment #1.1: Type: text/plain, Size: 47959 bytes --]

At the very least, it seems to produce the right assembly output:

https://godbolt.org/z/19sPvP1nG

Implementing it in assembly might be worthwhile, though I'm not sure what would be lost. Perhaps some portability?

It seems Mike's code also works on GCC and Clang.
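
For reference, here is a minimal standalone sketch of the loop from Mike's patch, written so it can be built outside EDK II. The `UINTN` typedef, the `CpuPause()` stub, and the `gPokeTarget` / `gPauseCalls` hooks are assumptions added for illustration only; they are not part of the proposed change. The hooks mimic a debugger writing 1 to `Index`. Note that letting the address of `Index` escape also changes what the compiler can assume, so this only illustrates the intended exit path; it says nothing about what VS2019 generates for the real patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t UINTN;   /* stand-in for the EDK II UINTN type */

/* Never written at run time, so it can stay in read-only (XIP) storage. */
static volatile UINTN  mDeadLoopComparator = 0;

/* Hypothetical hooks (not in the patch): simulate a debugger poking Index. */
static volatile UINTN  *gPokeTarget = NULL;
static unsigned        gPauseCalls  = 0;

/* Stand-in for BaseLib's CpuPause(); on the 3rd call it acts like a
   developer setting Index to 1 from a debugger. */
static void
CpuPause (
  void
  )
{
  gPauseCalls++;
  if ((gPauseCalls == 3) && (gPokeTarget != NULL)) {
    *gPokeTarget = 1;
  }
}

void
CpuDeadLoop (
  void
  )
{
  volatile UINTN  Index;

  gPokeTarget = &Index;   /* illustration hook only */

  /* Two volatile reads and a compare on every iteration. */
  for (Index = mDeadLoopComparator; Index == mDeadLoopComparator;) {
    CpuPause ();
  }

  gPokeTarget = NULL;
}
```

Calling `CpuDeadLoop()` here returns once the simulated "debugger" write lands, and the global comparator is never modified, which is the property that matters for XIP.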

Jose Miguel Hernandez Miramontes

Silicon Firmware Development Engineer

SATG FST Platform Firmware Development West

jose.miguel.hernandez.miramontes@intel.com

+1 (512) 362-1230

Intel Corporation



-----Original Message-----
From: Oliver Smith-Denny <osde@linux.microsoft.com>
Sent: Wednesday, June 5, 2024 11:49 AM
To: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>; Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler



The other option, which I had started working on before getting sidetracked with other things, would be to just implement CpuDeadLoop in assembly, so we don't have to worry about random compiler optimizations in the future.



This certainly looks like an easy change for the current issue, though.



Thanks,

Oliver



On 6/4/2024 6:07 PM, Michael D Kinney wrote:

> Hi Ray,
>
> I know this is an old topic, but I think I have a new idea that works
> for XIP code.  We can update the loop to compare a volatile global to a
> volatile local.  This forces 2 reads and a comparison on every loop
> iteration.  The local variable can be set to 1 to exit the loop
> without modifying the global variable.  I tried this with VS2019 with
> max opt enabled and I was able to exit the loop by setting Index to 1
> in a debugger.

>

> diff --git a/MdePkg/Library/BaseLib/CpuDeadLoop.c b/MdePkg/Library/BaseLib/CpuDeadLoop.c
> index b3b7548fa5..393c4290ed 100644
> --- a/MdePkg/Library/BaseLib/CpuDeadLoop.c
> +++ b/MdePkg/Library/BaseLib/CpuDeadLoop.c
> @@ -9,6 +9,8 @@
>  #include <Base.h>
>  #include <Library/BaseLib.h>
>  
> +static volatile UINTN  mDeadLoopComparator = 0;
> +
>  /**
>    Executes an infinite loop.
>  
> @@ -26,7 +28,7 @@ CpuDeadLoop (
>  {
>    volatile UINTN  Index;
>  
> -  for (Index = 0; Index == 0;) {
> +  for (Index = mDeadLoopComparator; Index == mDeadLoopComparator;) {
>      CpuPause ();
>    }
>  }

>

> Mike

>

> *From:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Sent:* Tuesday, October 31, 2023 7:19 AM
> *To:* Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>;
> edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran
> <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel
> <jose.miguel.hernandez.miramontes@intel.com>
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Subject:* RE: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

> Right. But if you break in with debugger, you can still skip over the

> jmp instruction and continue.

>

> I agree XIP does not allow variable value to be updated, but we would

> never want to do that or all future dead loops in non XIP code would

> not loop.

>

> Mike

>

> *From:* Ni, Ray <ray.ni@intel.com>
> *Sent:* Tuesday, October 31, 2023 1:31 AM
> *To:* Kinney, Michael D <michael.d.kinney@intel.com>; Andrew (EFI) Fish
> <afish@apple.com>; edk2-devel-groups-io <devel@edk2.groups.io>;
> Rebecca Cran <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel
> <jose.miguel.hernandez.miramontes@intel.com>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

> Mike,

>

> This is not friendly for XIP code. With XIP code, the global variable

> is not able to be updated as it sits in read-only SPI flash.

>

> Thanks,

>

> Ray

>

> ----------------------------------------------------------------------

> --

>

> *From:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Sent:* Tuesday, October 31, 2023 11:37 AM
> *To:* Ni, Ray <ray.ni@intel.com>; Andrew (EFI) Fish <afish@apple.com>;
> edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran
> <rebecca@bsdio.com>; Hernandez Miramontes, Jose Miguel
> <jose.miguel.hernandez.miramontes@intel.com>
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Subject:* RE: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

> Does using a static volatile global instead of a volatile local work?

>

> Mike

>

> *From:* Ni, Ray <ray.ni@intel.com>
> *Sent:* Monday, October 30, 2023 7:52 PM
> *To:* Andrew (EFI) Fish <afish@apple.com>; edk2-devel-groups-io
> <devel@edk2.groups.io>; Rebecca Cran <rebecca@bsdio.com>;
> Hernandez Miramontes, Jose Miguel
> <jose.miguel.hernandez.miramontes@intel.com>
> *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

> It's been a while.

>

> Is there any better solution? Can we go with assembly solution?

>

> Thanks,

>

> Ray

>

> ----------------------------------------------------------------------

> --

>

> *From:* Andrew (EFI) Fish <afish@apple.com>
> *Sent:* Saturday, May 20, 2023 12:31 AM
> *To:* edk2-devel-groups-io <devel@edk2.groups.io>; Rebecca Cran
> <rebecca@bsdio.com>
> *Cc:* Ni, Ray <ray.ni@intel.com>; Kinney, Michael D
> <michael.d.kinney@intel.com>
> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

> I don’t think the atomic is going to help. The compiler honored the
> volatile by doing a read, but assumed it would never change due to
> scoping. As you can see in my example, if the compiler thinks
> DeadLoopCount can be changed it will put the check back in and assume
> the function can return. So an assembly function that does nothing,
> called IncreaseScope(), would fix this issue too.

>

> void IncreaseScope(int *ptr);
>
> void CpuDeadLoopFix(void) {
>   volatile int DeadLoopCount = 0;
>   while (DeadLoopCount == 0) {
>     IncreaseScope(&DeadLoopCount);
>   }
> }
>
> void CpuDeadLoop(void) {
>   volatile int DeadLoopCount = 0;
>   while (DeadLoopCount == 0);
> }

>

> Gives us:
>
> voltbl SEGMENT
> voltbl ENDS
> voltbl SEGMENT
> voltbl ENDS
> DeadLoopCount$ = 48
> CpuDeadLoopFix PROC ; COMDAT
> $LN12:
>         sub     rsp, 40 ; 00000028H
>         mov     DWORD PTR DeadLoopCount$[rsp], 0
>         jmp     SHORT $LN10@CpuDeadLoo
> $LL2@CpuDeadLoo:
>         lea     rcx, QWORD PTR DeadLoopCount$[rsp]
>         call    IncreaseScope
> $LN10@CpuDeadLoo:
>         mov     eax, DWORD PTR DeadLoopCount$[rsp]
>         test    eax, eax
>         je      SHORT $LL2@CpuDeadLoo
>         add     rsp, 40 ; 00000028H
>         ret     0
> CpuDeadLoopFix ENDP
> DeadLoopCount$ = 8
> CpuDeadLoop PROC ; COMDAT
>         mov     DWORD PTR DeadLoopCount$[rsp], 0
> $LL2@CpuDeadLoo:
>         mov     eax, DWORD PTR DeadLoopCount$[rsp]
>         jmp     SHORT $LL2@CpuDeadLoo
> CpuDeadLoop ENDP

>

> Compiler Explorer

> <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtM

> A7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuY

> ukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CI

> WGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7

> BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHL

> x1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/B

> wtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOi

> xciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQ

> Abph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghE

> YnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5

> U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0B

> QxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG

> 5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYY

> hQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207L

> Mh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfR

> CpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wAL

> SI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8Co

> Kh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQdu

> himJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxM

> tJ4iAA>

>

> Thanks,

>

> Andrew Fish

>

> PS I’m still not 100% sure it is a compiler bug. Sometimes things
> like this are due to the order the compiler applies the optimizations,
> and changing the order can change the behavior.

>

>     On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com

>     <mailto:rebecca@bsdio.com>> wrote:

>

>     Just to add more data, I also tried with "volatile sig_atomic_t" as

>     someone suggested and both "/volatile:iso" and "/volatile:ms" with

>     no change in results.

>

>

>     --

>

>     Rebecca Cran

>

>

>     On 5/18/23 20:53, Ni, Ray wrote:

>

>

>         I think all the options we considered are workarounds. These

>         might break again if compiler is “cleverer” in future. Unless

>         some Cxx spec clearly guarantees that.

>

>         I like Mike’s idea to use assembly implementation for

>         CpuDeadLoop. The assembly can simply “jmp $” then “ret”.

>

>         I didn’t find a dead-loop intrinsic function in MSVC.

>

>         Any better idea?

>

>         Thanks,

>

>         Ray

>

>         *From:* Andrew (EFI) Fish <afish@apple.com>
>         *Sent:* Friday, May 19, 2023 8:42 AM
>         *To:* devel@edk2.groups.io; Kinney, Michael D
>         <michael.d.kinney@intel.com>
>         *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>         *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

>         Mike,

>

>         Sorry static was just to scope the name to the file since it is

>         a lib, not to make it work.

>

>         That is a cool site. I learned about it complaining about stuff

>         to the compiler team on our internal clang Slack channel as they

>         use it to answer my questions.

>

>         Thanks,

>

>         Andrew Fish

>

>

>

>             On May 18, 2023, at 2:42 PM, Michael D Kinney <michael.d.kinney@intel.com> wrote:

>

>             Using that tool, the following fragment seems to generate the

>             right code. Volatile is required.  Static is optional.

>

>             static volatile int mDeadLoopCount = 0;
>
>             void
>             CpuDeadLoop(
>               void
>               )
>             {
>               while (mDeadLoopCount == 0);
>             }

>

>             GCC
>             ===
>             CpuDeadLoop():
>             .L2:
>                     mov     eax, DWORD PTR mDeadLoopCount[rip]
>                     test    eax, eax
>                     je      .L2
>                     ret
>
>             CLANG
>             =====
>             CpuDeadLoop():                    # @CpuDeadLoop()
>             .LBB0_1:                          # =>This Inner Loop Header: Depth=1
>                     cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
>                     je      .LBB0_1
>                     ret

>             Mike
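
The fragment quoted above can be tried on a host machine. What follows is only a minimal sketch in C, not BaseLib code: FakeCpuPause, RunDeadLoopSketch and the release counter are hypothetical names invented for the demo. The stand-in for CpuPause() plays the role of the debugger by writing the volatile flag after a chosen number of spins, so the loop ends on its own.

```c
#include <assert.h>

/* Flag polled by the loop; volatile so every iteration reloads it. */
static volatile int  mDeadLoopCount = 0;

/* Hypothetical bookkeeping for the host-side demo only. */
static unsigned long  mPauseCalls   = 0;
static unsigned long  mReleaseAfter = 0;

/* Stand-in for CpuPause(): after mReleaseAfter calls it does what a
   developer would do from a debugger, i.e. set the flag to non-zero. */
static void
FakeCpuPause (
  void
  )
{
  if (++mPauseCalls == mReleaseAfter) {
    mDeadLoopCount = 1;
  }
}

/* Spins like the proposed CpuDeadLoop() and returns the number of
   pauses that happened before the flag was changed. */
unsigned long
RunDeadLoopSketch (
  unsigned long  ReleaseAfter
  )
{
  assert (ReleaseAfter > 0);   /* 0 would never release the loop */

  mDeadLoopCount = 0;
  mPauseCalls    = 0;
  mReleaseAfter  = ReleaseAfter;

  while (mDeadLoopCount == 0) {
    FakeCpuPause ();
  }

  return mPauseCalls;
}
```

Calling RunDeadLoopSketch (5) returns once the fifth pause has flipped the flag. In firmware the write would come from the debugger, and the generated assembly still needs to be checked per toolchain.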

>

>             *From:* Andrew (EFI) Fish <afish@apple.com>
>             *Sent:* Thursday, May 18, 2023 1:45 PM
>             *To:* edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish <afish@apple.com>
>             *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>             *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

>             Whoops wrong compiler. Here is an update. I added the flags so

>             this one reproduces the issue.

>

>             Compiler Explorer

>

>

> <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtM

> A7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuY

> ukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CI

> WGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7

> BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHL

> x1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/B

> wtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOi

> xciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQ

> Abph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghE

> YnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5

> U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0B

> QxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG

> 5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYY

> hQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207L

> Mh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfR

> CpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wAL

> SI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8Co

> Kh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQdu

> himJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxM

> tJ4iAA>

>

>             Thanks,

>

>             Andrew Fish

>

>

>

>

>                 On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io <afish=apple.com@groups.io> wrote:

>

>                 Mike,

>

>                 This is a good way to play around with fixes, and to report
>                 bugs. You can see the assembler for different compilers with
>                 different flags.

>

>                 Compiler Explorer

>

>

> <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtM

> A7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuY

> ukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CI

> WGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7

> BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHL

> x1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/B

> wtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOi

> xciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQ

> Abph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghE

> YnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5

> U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0B

> QxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG

> 5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYY

> hQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207L

> Mh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfR

> CpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wAL

> SI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8Co

> Kh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQdu

> himJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxM

> tJ4iAA>

>

>                 Sorry I’m traveling and in Cupertino with lots of meetings so
>                 I did not have time to adjust the compiler flags….

>

>                 Thanks,

>

>                 Andrew Fish

>

>

>

>

>                     On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish <afish@apple.com> wrote:

>

>                     Mike,

>

>                     I guess my other question… If this turns out to be a
>                     compiler bug should we scope the change to the broken
>                     toolchain. I’m not sure what the right answer is for that,
>                     but I want to ask the question?

>

>                     Thanks,

>

>                     Andrew Fish

>

>

>

>

>                         On May 18, 2023, at 10:19 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:

>

>                         Andrew,

>

>                         This might work for XIP.  Set non const global to
>                         initial value that is expected value to stay in dead loop.
>
>                         UINTN  mDeadLoopCount = 0;
>
>                         VOID
>                         CpuDeadLoop (
>                           VOID
>                           )
>                         {
>                           while (mDeadLoopCount == 0) {
>                             CpuPause ();
>                           }
>                         }
>
>                         When deadloop is entered, developer can not change
>                         value of mDeadLoopCount, but they can use debugger to
>                         force exit loop and return from function.

>

>                         Mike

>

>                         *From:* Andrew (EFI) Fish <afish@apple.com>
>                         *Sent:* Thursday, May 18, 2023 10:09 AM
>                         *To:* Kinney, Michael D <michael.d.kinney@intel.com>
>                         *Cc:* edk2-devel-groups-io <devel@edk2.groups.io>; Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>                         *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

>                         Mike,

>

>                         Good point, that is why we are using the stack ….

>

>                         The only other thing I can think of is to pass the
>                         address of Index to some inline assembler, or an asm
>                         no op function, to give it a side effect the compiler
>                         can’t resolve.

>

>                         Thanks,

>

>                         Andrew Fish
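
A rough host-side sketch of the idea just quoted, with assumptions stated up front: instead of inline assembler it calls through a volatile function pointer, which is a swapped-in technique for illustration, and DemoEscape, mEscape and DeadLoopWithEscape are hypothetical names. Because the address of Index escapes to a callee the compiler cannot see through, it has to assume Index may change. In firmware the callee would be a no-op; here it releases the loop after N calls so the example terminates.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ESCAPE_FN)(volatile size_t *);

/* Hypothetical demo hook: counts calls and releases the loop on the Nth. */
static unsigned int  mHookCalls    = 0;
static unsigned int  mReleaseAfter = 0;

static void
DemoEscape (
  volatile size_t  *Index
  )
{
  if (++mHookCalls == mReleaseAfter) {
    *Index = 1;
  }
}

/* The pointer object itself is volatile, so it is reloaded on every
   iteration and the compiler must treat the callee as unknown code
   that may write *Index. */
static ESCAPE_FN volatile  mEscape = DemoEscape;

unsigned int
DeadLoopWithEscape (
  unsigned int  ReleaseAfter
  )
{
  volatile size_t  Index;

  assert (ReleaseAfter > 0);   /* 0 would never release the loop */
  mHookCalls    = 0;
  mReleaseAfter = ReleaseAfter;

  for (Index = 0; Index == 0;) {
    mEscape (&Index);
  }

  return mHookCalls;
}
```

Whether a particular compiler then keeps the compare and the exit path is something to confirm in its assembly output; the sketch only shows the shape of the workaround.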

>

>

>

>

>

>                             On May 18, 2023, at 10:05 AM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:

>

>                             Static global will not work for XIP

>

>                             Mike

>

>                             *From:* Andrew (EFI) Fish <afish@apple.com>
>                             *Sent:* Thursday, May 18, 2023 9:49 AM
>                             *To:* edk2-devel-groups-io <devel@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@intel.com>
>                             *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>                             *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler

>

>                             Mike,

>

>                             I pinged some compiler experts to see if our code
>                             is correct, or if the compiler has an issue. Seems
>                             to be trending compiler issue right now, but I’ve
>                             NOT gotten feedback from anyone on the spec
>                             committee yet.

>

>                             If we move Index to a static global that would

>                             likely work around the compiler issue.

>

>                             Thanks,

>

>                             Andrew Fish

>

>

>

>

>

>

>                                 On May 18, 2023, at 8:36 AM, Michael D Kinney <michael.d.kinney@intel.com> wrote:

>

>                                 Hi Ray,

>

>                                 So the code generated does deadloop, but is

>                                 just not easy to resume from as we have been

>                                 able to do in the past.

>

>                                 We use CpuDeadloop() for 2 purposes. One is a
>                                 terminal condition with no reason to ever
>                                 continue.
>
>                                 The 2nd is a debug aid for developers to
>                                 halt the system at a specific location and
>                                 then continue from that point, usually with a
>                                 debugger, to step through code to an area to
>                                 evaluate unexpected behavior.

>

>                                 We may have to do a NASM implementation of

>                                 CpuDeadloop() to make sure it meets both use

>                                 cases.

>

>                                 Mike

>

>                                 *From:* Ni, Ray <ray.ni@intel.com>
>                                 *Sent:* Thursday, May 18, 2023 3:00 AM
>                                 *To:* devel@edk2.groups.io
>                                 *Cc:* Kinney, Michael D <michael.d.kinney@intel.com>; Rebecca Cran <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
>                                 *Subject:* CpuDeadLoop() is optimized by compiler

>

>                                 Hi,

>

>                                 Starting from certain version of Visual Studio
>                                 C compiler (I don’t have the exact version. I
>                                 am using VS2019), CpuDeadLoop is now optimized
>                                 quite well by compiler.
>
>                                 The optimization is so “good” that it becomes
>                                 harder for developers to break out of the
>                                 deadloop.

>

>                                 I copied the assembly instructions as below

>                                 for your reference.

>

>                                 The compiler does not generate instructions

>                                 that jump out of the loop when the Index is

>                                 not zero.

>

>                                 So in order to break out of the loop,

>                                 developers need to:

>

>                                  1. Manually adjust rsp by increasing 40

>                                  2. Manually “ret”

>

>                                 I am not sure if anyone has interest to
>                                 re-write this function so that compiler can be
>                                 “fooled” again.

>

>                                 Thanks,

>                                 Ray

>

>                                 =======================
>                                 ; Function compile flags: /Ogspy
>                                 ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>                                 ; COMDAT CpuDeadLoop
>                                 _TEXT SEGMENT
>                                 Index$ = 48
>                                 CpuDeadLoop PROC ; COMDAT
>
>                                 ; 26   : {
>
>                                 $LN12:
>                                 00000  48 83 ec 28                  sub   rsp, 40 ; 00000028H
>
>                                 ; 27   :   volatile UINTN  Index;
>                                 ; 28   :
>                                 ; 29   :   for (Index = 0; Index == 0;) {
>
>                                 00004  48 c7 44 24 30 00 00 00 00   mov   QWORD PTR Index$[rsp], 0
>                                 $LN10@CpuDeadLoo:
>
>                                 ; 30   :     CpuPause ();
>
>                                 0000d  48 8b 44 24 30               mov   rax, QWORD PTR Index$[rsp]
>                                 00012  e8 00 00 00 00               call  CpuPause
>                                 00017  eb f4                        jmp   SHORT $LN10@CpuDeadLoo
>                                 CpuDeadLoop ENDP
>                                 _TEXT ENDS
>                                 END
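
The listing can be checked by hand. The bytes `eb f4` at offset 0x17 are a 2-byte short jmp whose signed 8-bit displacement 0xf4 is -12, counted from the end of the instruction (0x19), so the target is 0x0d, the reload of Index. Nothing in the listing compares rax or branches out of the loop, which is why the only way out is to add 40 back to rsp and execute a ret by hand. The small C helper below reproduces that arithmetic; ShortJmpTarget is a hypothetical name used only for this illustration.

```c
#include <stdint.h>

/* Target of a 2-byte x86 short jump (opcode EB followed by a signed
   8-bit displacement), measured from the end of the instruction. */
uint32_t
ShortJmpTarget (
  uint32_t  InsnOffset,
  uint8_t   Rel8
  )
{
  /* Sign-extend the displacement without relying on narrowing casts. */
  int32_t  Displacement = (Rel8 < 0x80) ? (int32_t)Rel8 : (int32_t)Rel8 - 0x100;

  return (uint32_t)((int32_t)InsnOffset + 2 + Displacement);
}
```

With the values from the listing, ShortJmpTarget (0x17, 0xf4) yields 0x0d, matching the $LN10@CpuDeadLoo label.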

>

>

>

> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#119517): https://edk2.groups.io/g/devel/message/119517
Mute This Topic: https://groups.io/mt/98987896/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-



[-- Attachment #1.2: Type: text/html, Size: 129159 bytes --]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 111918 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-06-07 16:57 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-18  9:59 CpuDeadLoop() is optimized by compiler Ni, Ray
2023-05-18 13:19 ` [edk2-devel] " Pedro Falcato
2023-05-18 15:36 ` Michael D Kinney
2023-05-18 16:49   ` [edk2-devel] " Andrew Fish
2023-05-18 17:05     ` Michael D Kinney
2023-05-18 17:08       ` Andrew Fish
2023-05-18 17:19         ` Michael D Kinney
2023-05-18 17:22           ` Andrew Fish
2023-05-18 17:24           ` Andrew Fish
2023-05-18 18:45             ` Andrew Fish
     [not found]             ` <17605136DCF3E084.26337@groups.io>
2023-05-18 20:45               ` Andrew Fish
2023-05-18 21:42                 ` Michael D Kinney
2023-05-19  0:42                   ` Andrew Fish
2023-05-19  2:53                     ` Ni, Ray
2023-05-19  3:03                       ` Jeff Fan
2023-05-19 15:31                       ` Rebecca Cran
2023-05-19 16:31                         ` Andrew Fish
2023-10-31  2:51                           ` Ni, Ray
2023-10-31  3:37                             ` Michael D Kinney
2023-10-31  8:30                               ` Ni, Ray
2023-10-31 14:19                                 ` Michael D Kinney
2024-06-05  1:07                                   ` Michael D Kinney
2024-06-05 16:48                                     ` Oliver Smith-Denny
2024-06-07 16:57                                       ` Hernandez Miramontes, Jose Miguel
     [not found]                       ` <1760952DCE55DF8D.29365@groups.io>
2023-05-19 16:09                         ` Rebecca Cran
2023-05-18 17:36   ` Rebecca Cran
2023-05-18 18:21     ` Andrew Fish

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox