public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: Andrew Fish <afish@apple.com>
To: Mike Kinney <michael.d.kinney@intel.com>
Cc: "Gao, Liming" <liming.gao@intel.com>,
	"Zhang, Shenglei" <shenglei.zhang@intel.com>,
	edk2-devel-01 <edk2-devel@lists.01.org>
Subject: Re: [PATCH 3/3] MdePkg/BaseSynchronizationLib: Remove inline X86 assembly code
Date: Wed, 06 Mar 2019 17:31:25 -0800	[thread overview]
Message-ID: <47E8FE4A-4BDC-48A3-B7E6-02F1682C0990@apple.com> (raw)
In-Reply-To: <E92EE9817A31E24EB0585FDF735412F5B8BB86B7@ORSMSX113.amr.corp.intel.com>



> On Mar 6, 2019, at 4:41 PM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:
> 
> Liming,
>  
> That does not work either because inline assembly uses compiler specific syntax.
>  
> Please update with the specific list of functions that you think the inline should be removed to improve maintenance with no impacts in size/perf.
>  

Mike,

It is easy to find the gcc/clang inline assembler, just `git grep "__asm__ __volatile__"`

The main files are:
https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseIoLibIntrinsic/IoLibGcc.c
https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseLib/X64/GccInline.c
https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseLib/Ia32/GccInline.c
https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c
https://github.com/tianocore/edk2/blob/master/MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c

This is really just compiler optimizer control. 
Library/BaseSynchronizationLib/SynchronizationGcc.c:21:#define _ReadWriteBarrier() do { __asm__ __volatile__ ("": : : "memory"); } while(0)

Looks like this is working around the alignment ASSERT in BaseIoLibIntrinsic. 
OvmfPkg/QemuVideoDxe/UnalignedIoGcc.c:43:  __asm__ __volatile__ ( "outl %0, %1" : : "a" (Value), "d" ((UINT16)Port) );
OvmfPkg/QemuVideoDxe/UnalignedIoGcc.c:67:  __asm__ __volatile__ ( "inl %1, %0" : "=a" (Data) : "d" ((UINT16)Port) );

It seems like we have a reasonable set and should not use the inline assembly for any more complex code. 

Thanks,

Andrew Fish

> The issue you pointed to was around SetJump()/LongJump().  Can we limit this BZ to only those 2 functions?
>  
> Mike
>   <>
> From: Gao, Liming 
> Sent: Wednesday, March 6, 2019 4:28 PM
> To: afish@apple.com
> Cc: Zhang, Shenglei <shenglei.zhang@intel.com>; edk2-devel-01 <edk2-devel@lists.01.org>; Kinney, Michael D <michael.d.kinney@intel.com>
> Subject: RE: [edk2] [PATCH 3/3] MdePkg/BaseSynchronizationLib: Remove inline X86 assembly code
>  
> Andrew:
>   I want to keep only one implementation. If inline assembly c source is preferred, I suggest to remove its nasm implementation.
>  
> Thanks
> Liming
>  <>From: afish@apple.com <mailto:afish@apple.com> [mailto:afish@apple.com <mailto:afish@apple.com>] 
> Sent: Tuesday, March 5, 2019 2:44 PM
> To: Gao, Liming <liming.gao@intel.com <mailto:liming.gao@intel.com>>
> Cc: Zhang, Shenglei <shenglei.zhang@intel.com <mailto:shenglei.zhang@intel.com>>; edk2-devel-01 <edk2-devel@lists.01.org <mailto:edk2-devel@lists.01.org>>; Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>
> Subject: Re: [edk2] [PATCH 3/3] MdePkg/BaseSynchronizationLib: Remove inline X86 assembly code
>  
>  
>  
> 
> On Mar 5, 2019, at 1:32 PM, Gao, Liming <liming.gao@intel.com <mailto:liming.gao@intel.com>> wrote:
>  
> Andrew:
>  BZ 1163 is to remove inline X86 assembly code in C source file. But, this patch is wrong. I have gave my comments to update this patch.
> 
>  
> Why do we want to remove inline X86 assembly. As I mentioned it will lead to larger binaries, slower binaries, and less optimized code.
>  
> For example take this simple inline assembly function:
>  
> VOID
> EFIAPI
> CpuBreakpoint (
>   VOID
>   )
> {
>   __asm__ __volatile__ ("int $3");
> }
>  
> 
> Today with clang LTO turned on calling CpuBreakpoint() looks like this from the callers point of view. 
>  
> a.out[0x1fa5] <+6>:  cc              int3   
>  
> 
> But if we move that to NASM:
>  
> 
> a.out[0x1fa6] <+7>:  e8 07 00 00 00  calll  0x1fb2                    ; CpuBreakpoint
>  
> 
> plus:
> a.out`CpuBreakpoint:
> a.out[0x1fb2] <+0>: cc     int3   
> a.out[0x1fb3] <+1>: c3     retl   
>  
> 
> And there is also an extra push and pop on the stack. The other issue is the call to the function that is unknown to the compiler will act like a _ReadWriteBarrier (Yikes I see _ReadWriteBarrier is deprecated in VC++ 2017). Is the depreciation of some of these intrinsics in VC++ driving the removal of inline assembly? For GCC inline assembly works great for local compile, and for clang LTO it works in entire program optimization.
>  
> The LTO bitcode includes inline assembly and constraints so that the optimizer knows what to do so it can get optimized just like the abstract bitcode during the Link Time Optimization. 
>  
> This is CpuBreakpoint():
> ; Function Attrs: noinline nounwind optnone ssp uwtable
> define void @CpuBreakpoint() #0 {
>   call void asm sideeffect "int $$3", "~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !3
>   ret void
> }
>  
> 
> This is AsmReadMsr64():
> ; Function Attrs: noinline nounwind optnone ssp uwtable
> define i64 @AsmReadMsr64(i32) #0 {
>   %2 = alloca i32, align 4
>   %3 = alloca i64, align 8
>   store i32 %0, i32* %2, align 4
>   %4 = load i32, i32* %2, align 4
>   %5 = call i64 asm sideeffect "rdmsr", "=A,{cx},~{dirflag},~{fpsr},~{flags}"(i32 %4) #2, !srcloc !4
>   store i64 %5, i64* %3, align 8
>   %6 = load i64, i64* %3, align 8
>   ret i64 %6
> }
>  
> 
>  
> I agree it make sense to remove .S and .asm files and only have .nasm files. 
>  
> Thanks,
>  
> Andrew Fish
>  
> PS For the Xcode clang since it emits frame pointers by default we need to add the extra 4 bytes to save the assembly functions so the stack can get unwound. 
>  
> a.out`CpuBreakpoint:
> a.out[0x1f99] <+0>: 55     pushl  %ebp
> a.out[0x1f9a] <+1>: 89 e5  movl   %esp, %ebp
> a.out[0x1f9c] <+3>: cc     int3   
> a.out[0x1f9d] <+4>: 5d     popl   %ebp
> a.out[0x1f9e] <+5>: c3     retl   
>  
> 
> So breakpoint goes from 1 byte to 11 bytes if we get rid of the intrinsics. 
>  
> 
>  The change is to reduce the duplicated implementation. It will be good on the code maintain. Recently, one patch has to update .c and .nasm both. Please see https://lists.01.org/pipermail/edk2-devel/2019-February/037165.html <https://lists.01.org/pipermail/edk2-devel/2019-February/037165.html>. Beside this change, I propose to remove all GAS assembly code for IA32 and X64 arch. After those change, the patch owner will collect the impact of the image size. 
> 
> Thanks
> Liming



  reply	other threads:[~2019-03-07  1:31 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-05  1:40 [PATCH 0/3] MdePkg: Remove inline X86 assembly code Shenglei Zhang
2019-03-05  1:40 ` [PATCH 1/3] MdePkg/BaseCpuLib: " Shenglei Zhang
2019-03-05  1:40 ` [PATCH 2/3] MdePkg/BaseLib: " Shenglei Zhang
2019-03-05  1:40 ` [PATCH 3/3] MdePkg/BaseSynchronizationLib: " Shenglei Zhang
2019-03-05 20:57   ` Andrew Fish
2019-03-05 21:32     ` Gao, Liming
2019-03-05 21:45       ` Kinney, Michael D
2019-03-05 22:43       ` Andrew Fish
2019-03-07  0:28         ` Gao, Liming
2019-03-07  0:41           ` Kinney, Michael D
2019-03-07  1:31             ` Andrew Fish [this message]
2019-03-07  5:06               ` Yao, Jiewen
2019-03-07  5:56                 ` Andrew Fish
2019-03-07  6:09                   ` Yao, Jiewen
2019-03-07  6:45                     ` Andrew Fish
2019-03-07  7:15                       ` Gao, Liming
2019-03-07  7:25                         ` Yao, Jiewen
2019-03-09  0:07                           ` Kinney, Michael D
2019-03-09  0:15                             ` Andrew Fish
2019-03-14 22:27                               ` Gao, Liming
2019-03-05  2:10 ` [PATCH 0/3] MdePkg: " Gao, Liming

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47E8FE4A-4BDC-48A3-B7E6-02F1682C0990@apple.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox