public inbox for devel@edk2.groups.io
From: "Andrew Fish" <afish@apple.com>
To: edk2-devel-groups-io <devel@edk2.groups.io>,
	Rebecca Cran <rebecca@bsdio.com>
Cc: "Ni, Ray" <ray.ni@intel.com>, Mike Kinney <michael.d.kinney@intel.com>
Subject: Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
Date: Fri, 19 May 2023 09:31:19 -0700
Message-ID: <31286A1F-3497-478E-A938-50EEB1F61774@apple.com>
In-Reply-To: <24bf3d19-a292-11df-c84b-5d941cf2e19d@bsdio.com>


I don’t think the atomic is going to help. The compiler honored the volatile by doing a read, but because the variable never leaves the function’s scope it assumed the value could never change. As my example shows, if the compiler thinks DeadLoopCount can be changed, it puts the check back in and assumes the function can return. So a do-nothing assembly function called IncreaseScope() would fix this issue too.

void IncreaseScope(volatile int *ptr);

void CpuDeadLoopFix(void) {
  volatile int DeadLoopCount = 0;
  while(DeadLoopCount == 0) {
    IncreaseScope(&DeadLoopCount);
  }
}

void CpuDeadLoop(void) {
  volatile int DeadLoopCount = 0;
  while(DeadLoopCount == 0);
}

Gives us:

voltbl  SEGMENT
voltbl  ENDS
voltbl  SEGMENT
voltbl  ENDS

DeadLoopCount$ = 48
CpuDeadLoopFix PROC                           ; COMDAT
$LN12:
        sub     rsp, 40                             ; 00000028H
        mov     DWORD PTR DeadLoopCount$[rsp], 0
        jmp     SHORT $LN10@CpuDeadLoo
$LL2@CpuDeadLoo:
        lea     rcx, QWORD PTR DeadLoopCount$[rsp]
        call    IncreaseScope
$LN10@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        test    eax, eax
        je      SHORT $LL2@CpuDeadLoo
        add     rsp, 40                             ; 00000028H
        ret     0
CpuDeadLoopFix ENDP

DeadLoopCount$ = 8
CpuDeadLoop PROC                                        ; COMDAT
        mov     DWORD PTR DeadLoopCount$[rsp], 0
$LL2@CpuDeadLoo:
        mov     eax, DWORD PTR DeadLoopCount$[rsp]
        jmp     SHORT $LL2@CpuDeadLoo
CpuDeadLoop ENDP


https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA
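
A minimal sketch of the IncreaseScope() idea that can be compiled and run on a host machine. It assumes a C stand-in for the do-nothing assembly routine; the stand-in body, the gCalls counter and the CpuDeadLoopDemo name are hypothetical and not from this thread. The stand-in writes through the pointer on its third call, playing the role of a debugger changing DeadLoopCount, so the loop exits:

```c
/* Hypothetical stand-in for the do-nothing assembly routine.
 * It counts its calls and releases the loop on the third one,
 * simulating a debugger that rewrites the variable. */
static int gCalls = 0;

void IncreaseScope(volatile int *ptr) {
  gCalls++;
  if (gCalls >= 3) {
    *ptr = 1;
  }
}

/* Same loop shape as CpuDeadLoopFix(), but it returns the number
 * of IncreaseScope() calls so the exit can be observed. */
int CpuDeadLoopDemo(void) {
  volatile int DeadLoopCount = 0;
  while (DeadLoopCount == 0) {
    /* The address escapes here, so the exit check has to stay. */
    IncreaseScope(&DeadLoopCount);
  }
  return gCalls;
}
```

Because the stand-in is visible C, a compiler could inline it. In firmware the routine would live in a separate assembly file so the compiler cannot see that it does nothing.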

Thanks,

Andrew Fish

PS I’m still not 100% sure it is a compiler bug. Sometimes things like this are due to the order in which the compiler applies its optimizations, and changing the order can change the behavior.
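
For GCC and clang, the inline-assembler variant of the same idea suggested earlier in the thread might look like the sketch below. It assumes GNU extended asm, so it does not apply to MSVC, which has no inline assembler for x64. The EscapePointer and SpinWithEscape names and the maxSpins bound are hypothetical; the bound exists only so the sketch terminates when run:

```c
/* Hypothetical helper: an empty extended-asm statement that takes
 * the pointer as an input and clobbers memory. The compiler must
 * assume the pointed-to object may have been modified. */
static inline void EscapePointer(volatile void *ptr) {
  __asm__ volatile ("" : : "r" (ptr) : "memory");
}

/* Bounded version of the dead loop. A real CpuDeadLoop() would
 * drop maxSpins and spin until a debugger changes the variable. */
int SpinWithEscape(int initial, int maxSpins) {
  volatile int DeadLoopCount = initial;
  int spins = 0;
  while (DeadLoopCount == 0 && spins < maxSpins) {
    EscapePointer(&DeadLoopCount);
    spins++;
  }
  return spins;
}
```

The empty asm emits no instructions, so the loop body stays as small as the original while the exit check is kept.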


> On May 19, 2023, at 8:31 AM, Rebecca Cran <rebecca@bsdio.com> wrote:
> 
> Just to add more data, I also tried with "volatile sig_atomic_t" as someone suggested and both "/volatile:iso" and "/volatile:ms" with no change in results.
> 
> 
> -- 
> 
> Rebecca Cran
> 
> 
> On 5/18/23 20:53, Ni, Ray wrote:
>> 
>> I think all the options we considered are workarounds. They might break again if the compiler gets “cleverer” in the future, unless some Cxx spec clearly guarantees the behavior.
>> 
>> I like Mike’s idea of using an assembly implementation for CpuDeadLoop. The assembly can simply be “jmp $” followed by “ret”.
>> 
>> I didn’t find a dead-loop intrinsic function in MSVC.
>> 
>> Any better idea?
>> 
>> Thanks,
>> 
>> Ray
>> 
>> *From:* Andrew (EFI) Fish <afish@apple.com>
>> *Sent:* Friday, May 19, 2023 8:42 AM
>> *To:* devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>
>> *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>> *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>> 
>> Mike,
>> 
>> Sorry, static was just to scope the name to the file since it is a lib, not to make it work.
>> 
>> That is a cool site. I learned about it while complaining to the compiler team on our internal clang Slack channel, as they use it to answer my questions.
>> 
>> Thanks,
>> 
>> Andrew Fish
>> 
>> 
>> 
>>    On May 18, 2023, at 2:42 PM, Michael D Kinney
>>    <michael.d.kinney@intel.com> wrote:
>> 
>>    Using that tool, the following fragment seems to generate the
>>    right code. Volatile is required.  Static is optional.
>> 
>>    static volatile int mDeadLoopCount = 0;
>> 
>>    void
>>    CpuDeadLoop (
>>      void
>>      )
>>    {
>>      while (mDeadLoopCount == 0);
>>    }
>> 
>>    GCC
>>    ===
>>    CpuDeadLoop():
>>    .L2:
>>            mov     eax, DWORD PTR mDeadLoopCount[rip]
>>            test    eax, eax
>>            je      .L2
>>            ret
>> 
>>    CLANG
>>    =====
>>    CpuDeadLoop():                    # @CpuDeadLoop()
>>    .LBB0_1:                          # =>This Inner Loop Header: Depth=1
>>            cmp     dword ptr [rip + _ZL14mDeadLoopCount], 0
>>            je      .LBB0_1
>>            ret
>> 
>>    Mike
>> 
>>    *From:*Andrew (EFI) Fish <afish@apple.com>
>>    *Sent:*Thursday, May 18, 2023 1:45 PM
>>    *To:*edk2-devel-groups-io <devel@edk2.groups.io>; Andrew Fish
>>    <afish@apple.com>
>>    *Cc:*Kinney, Michael D <michael.d.kinney@intel.com>; Ni, Ray
>>    <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>>    *Subject:*Re: [edk2-devel] CpuDeadLoop() is optimized by compiler
>> 
>>    Whoops wrong compiler. Here is an update. I added the flags so
>>    this one reproduces the issue.
>> 
>>    Compiler Explorer
>>    <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>> 
>>    Thanks,
>> 
>>    Andrew Fish
>> 
>> 
>> 
>> 
>>        On May 18, 2023, at 11:45 AM, Andrew Fish via groups.io
>>        <afish=apple.com@groups.io> wrote:
>> 
>>        Mike,
>> 
>>        This is a good way to play around with fixes, and to report
>>        bugs. You can see the assembler for different compilers with
>>        different flags.
>> 
>>        Compiler Explorer
>>        <https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEINIADqgKhE4MHt6%2B/qRJKY4CIWGRLDFx0naYDmlCBEzEBBk%2BfgEVVQI1dQSFEdGx8ba19Y1ZLYNdoT0lfZIAlLaoXsTI7BwA9KsA1AAqAJ4JmBs7C8QbaFgbCLGYpBskG7SoTOgbhhuYqqwJ9AB0JhoAgqECBsFABHLx1TAQIEbBg%2BGYbEwAdisAI26I2xEwBEWDFhPg2ACp8SwTABmVH/ZEAEQ4c1onAArLw/BwtKRUJw3NZrCDjstEWYyTxSARNHS5gBrECMjT6TiSFnijmcXgKEBysVsumkOCwJBoFgJOixciUQ3G%2BhxYBcMxcPh0AixdUQKLKqKhOo7Tgiw1sQQAeQYtG92tIWBYhmA4jD%2BCxVQAbph1WH3pUvE6fbwgZgGWHaHgosQvR4sMqCMQ8Cws3MqAZgAoAGp4TAAdwD%2B1ZIv4ghEYnYUhkgkUKnUYd09oMRhQPMs%2BkL6sgc1QCXyDBTAFoA2SNhvI0sEOTqQpJTsDJLMGrc5U1y4GO5PE09MEJsVSnpcqkBMM/PbP2vujfPp7VaNcOiGR8shA682gYcDxiKXo4hAsYfz0BQxkApCJDmBR%2BQHUUsRWHh6SZJUw05DhVAADgANg3WjJA2YBkGQDZbW%2BLgNggblLGsG5cEIO5zGFG4PCNE0ThErgZl4LUtDmCADVQCSrTNCALUklApxtSQNDlGhaCdYgXTdMMPWYYhQ19FT/QIIMQ2VCMoxjdk4xvPAkxTdk02QDNiOzQRc2VAsixLDAVnZCsqxrPh6ybFt207LMh2EURxEHHt5CUNRlV0AIdJnPi51CxcIGXVc0k3bdd33ZBDzJY9T3PS9bBg28IFcND7RfRCpmQnJki/dJIN/Qa8jSLD%2BvQ9rqlQ0aZvsMDMNfbCUM6bqBk6Kb3xk%2BZFmWPQK0wALSI4ZlSFZdlKJo%2BjGNOHT2Mkb4NBe7jeKsOcNkEogpKFe0NnEy1YkFMkzFk0VxUUpB8CoKh1KyvsMukLKR1y/MEHVSdMdhqgCD2dg5WITHsmJhRcfx/YNTOi6rt4SjqTwOGNmbNsQduhimJYtiOK4hMFA2Dn7uQR6pBel65KhqUZTlPNFUu5VKLVbJ5IleUODMcjrtVSHtRmOYkxMtJ4iAA>
>> 
>>        Sorry I’m traveling and in Cupertino with lots of meetings so
>>        I did not have time to adjust the compiler flags….
>> 
>>        Thanks,
>> 
>>        Andrew Fish
>> 
>> 
>> 
>> 
>>            On May 18, 2023, at 10:24 AM, Andrew (EFI) Fish
>>            <afish@apple.com> wrote:
>> 
>>            Mike,
>> 
>>            I guess my other question is: if this turns out to be a
>>            compiler bug, should we scope the change to the broken
>>            toolchain? I’m not sure what the right answer is for that,
>>            but I want to ask the question.
>> 
>>            Thanks,
>> 
>>            Andrew Fish
>> 
>> 
>> 
>> 
>>                On May 18, 2023, at 10:19 AM, Michael D Kinney
>>                <michael.d.kinney@intel.com> wrote:
>> 
>>                Andrew,
>> 
>>                This might work for XIP.  Set a non-const global to an
>>                initial value that is the expected value for staying in
>>                the dead loop.
>> 
>>                UINTN  mDeadLoopCount = 0;
>> 
>>                VOID
>>                CpuDeadLoop (
>>                  VOID
>>                  )
>>                {
>>                  while (mDeadLoopCount == 0) {
>>                    CpuPause ();
>>                  }
>>                }
>> 
>>                When the dead loop is entered, a developer cannot change
>>                the value of mDeadLoopCount, but can use a debugger to
>>                force the loop to exit and return from the function.
>> 
>>                Mike
>> 
>>                *From:* Andrew (EFI) Fish <afish@apple.com>
>>                *Sent:* Thursday, May 18, 2023 10:09 AM
>>                *To:* Kinney, Michael D <michael.d.kinney@intel.com>
>>                *Cc:* edk2-devel-groups-io <devel@edk2.groups.io>; Ni,
>>                Ray <ray.ni@intel.com>; Rebecca Cran <rebecca@bsdio.com>
>>                *Subject:* Re: [edk2-devel] CpuDeadLoop() is optimized
>>                by compiler
>> 
>>                Mike,
>> 
>>                Good point, that is why we are using the stack ….
>> 
>>                The only other thing I can think of is to pass the
>>                address of Index to some inline assembler, or an asm
>>                no op function, to give it a side effect the compiler
>>                can’t resolve.
>> 
>>                Thanks,
>> 
>>                Andrew Fish
>> 
>> 
>> 
>> 
>> 
>>                    On May 18, 2023, at 10:05 AM, Kinney, Michael D
>>                    <michael.d.kinney@intel.com> wrote:
>> 
>>                    Static global will not work for XIP
>> 
>>                    Mike
>> 
>>                    *From:* Andrew (EFI) Fish <afish@apple.com>
>>                    *Sent:* Thursday, May 18, 2023 9:49 AM
>>                    *To:* edk2-devel-groups-io <devel@edk2.groups.io>;
>>                    Kinney, Michael D <michael.d.kinney@intel.com>
>>                    *Cc:* Ni, Ray <ray.ni@intel.com>; Rebecca Cran
>>                    <rebecca@bsdio.com>
>>                    *Subject:* Re: [edk2-devel] CpuDeadLoop() is
>>                    optimized by compiler
>> 
>>                    Mike,
>> 
>>                    I pinged some compiler experts to see if our code
>>                    is correct, or if the compiler has an issue. It seems
>>                    to be trending toward a compiler issue right now, but
>>                    I’ve NOT gotten feedback from anyone on the spec
>>                    committee yet.
>> 
>>                    If we move Index to a static global that would
>>                    likely work around the compiler issue.
>> 
>>                    Thanks,
>> 
>>                    Andrew Fish
>> 
>> 
>> 
>> 
>> 
>> 
>>                        On May 18, 2023, at 8:36 AM, Michael D Kinney
>>                        <michael.d.kinney@intel.com> wrote:
>> 
>>                        Hi Ray,
>> 
>>                        So the generated code does dead loop, but it is
>>                        just not as easy to resume from as it has been
>>                        in the past.
>> 
>>                        We use CpuDeadLoop() for 2 purposes.  One is a
>>                        terminal condition with no reason to ever
>>                        continue.
>> 
>>                        The 2nd is a debug aid for developers to
>>                        halt the system at a specific location and
>>                        then continue from that point, usually with a
>>                        debugger, to step through code to an area to
>>                        evaluate unexpected behavior.
>> 
>>                        We may have to do a NASM implementation of
>>                        CpuDeadLoop() to make sure it meets both use
>>                        cases.
>> 
>>                        Mike
>> 
>>                        *From:* Ni, Ray <ray.ni@intel.com>
>>                        *Sent:* Thursday, May 18, 2023 3:00 AM
>>                        *To:* devel@edk2.groups.io
>>                        *Cc:* Kinney, Michael D
>>                        <michael.d.kinney@intel.com>; Rebecca Cran
>>                        <rebecca@bsdio.com>; Ni, Ray <ray.ni@intel.com>
>>                        *Subject:* CpuDeadLoop() is optimized by compiler
>> 
>>                        Hi,
>> 
>>                        Starting from a certain version of the Visual Studio
>>                        C compiler (I don’t have the exact version; I
>>                        am using VS2019), CpuDeadLoop is optimized
>>                        quite well by the compiler.
>> 
>>                        The optimization is so “good” that it becomes
>>                        harder for developers to break out of the
>>                        dead loop.
>> 
>>                        I copied the assembly instructions below
>>                        for your reference.
>> 
>>                        The compiler does not generate instructions
>>                        that jump out of the loop when Index is
>>                        not zero.
>> 
>>                        So in order to break out of the loop,
>>                        developers need to:
>> 
>>                         1. Manually adjust rsp by adding 40
>>                         2. Manually “ret”
>> 
>>                        I am not sure if anyone is interested in
>>                        rewriting this function so that the compiler can be
>>                        “fooled” again.
>> 
>>                        Thanks,
>>                        Ray
>> 
>>                        =======================
>> 
>>                        ; Function compile flags: /Ogspy
>>                        ; File e:\work\edk2\MdePkg\Library\BaseLib\CpuDeadLoop.c
>>                        ; COMDAT CpuDeadLoop
>>                        _TEXT SEGMENT
>>                        Index$ = 48
>>                        CpuDeadLoop PROC ; COMDAT
>> 
>>                        ; 26   : {
>> 
>>                        $LN12:
>>                          00000  48 83 ec 28                 sub   rsp, 40 ; 00000028H
>> 
>>                        ; 27   :   volatile UINTN  Index;
>>                        ; 28   :
>>                        ; 29   :   for (Index = 0; Index == 0;) {
>> 
>>                          00004  48 c7 44 24 30 00 00 00 00  mov   QWORD PTR Index$[rsp], 0
>>                        $LN10@CpuDeadLoo:
>> 
>>                        ; 30   :     CpuPause ();
>> 
>>                          0000d  48 8b 44 24 30              mov   rax, QWORD PTR Index$[rsp]
>>                          00012  e8 00 00 00 00              call  CpuPause
>>                          00017  eb f4                       jmp   SHORT $LN10@CpuDeadLoo
>>                        CpuDeadLoop ENDP
>>                        _TEXT ENDS
>>                        END
>> 
>>    
> 
> 
> 


