Date: Tue, 12 Jun 2018 19:17:26 +0000 (UTC)
From: Zenith432
To: Laszlo Ersek
Message-ID: <1142041495.4269416.1528831046054@mail.yahoo.com>
Subject: Re: [RFC PATCH 00/11] GCC/X64: use hidden visibility for LTO PIE code
List-Id: EDK II Development

> Absolute symbol references such as?
> References to fixed (constant)
> addresses?

Pointers stored in the .data section.  For example, if you have an array of const char*.

> Why is that approach optimal? As few
> relocations records are required as
> possible?

small pic model is optimal for AMD64 executables or shared libraries that are < 2GB in size, but need to be relocatable to any address in the 64-bit address space.  It generates the most compact code due to use of PC-relative jumps, calls and effective address calculations.
Technically, the small model is potentially more compact, but the sysv AMD64 ABI requires small model programs to fit in the lowest 2GB of the address space.  EFI binaries load in the lower 4GB but not necessarily lower 2GB.

> Why don't preemptible symbols make
> sense for PIE?
> (My apologies if I'm disturbingly
> ignorant about this and the question
> doesn't even make sense.)

They do of course.  The small pie model is a GCC extension not documented in the sysv AMD64 ABI, and it has a weird characteristic: it assumes all external symbols are reachable directly and not via the GOT (i.e. they are not subject to dynamic linking).

small pic model - formalized in the sysv AMD64 ABI; mandates access to extern symbols via the GOT or PLT.
small pie model - a GCC extension that permits the code generator to elide the GOT, but does not mandate that the code generator elide the GOT.

Contrary to conventional wisdom, using the GOT can reduce code size when doing pointer arithmetic on the address of an external symbol, or when pushing the address of an external symbol on the stack to be passed as a function argument.  See my response here to Andrew Fish.
https://lists.01.org/pipermail/edk2-devel/2018-June/025710.html

As a result, GCC sometimes emits GOT loads for external symbols in the small pie model on AMD64.

There is an attribute __attribute__((visibility("hidden"))) that can be attached to external symbol declarations and tells the code generator "do not assume this symbol has a GOT entry" - effectively eliminating GOT loads.
The pragma mentioned by Ard Biesheuvel applies the attribute wholesale to all symbols in the sections of source files affected by it.

> So... Given this behavior, why is it a
> problem for us? What are the bad
> symptoms? What is currently broken?

Ard Biesheuvel CCed a lot of people who didn't get the private communication about this.  As a continuation of the message above, I sent out an email detailing what happens in the GCC5 toolchain with LTO enabled, along with a standalone Shell App that demonstrates how the GCC5 toolchain on X64 can still emit GOT loads into the ELF executable that are not handled by GenFw.  Below is my email.  The standalone test case can be downloaded from here
http://www.mediafire.com/file/wkc6bcj17401f4c/GccGOTEmitter.zip/file

===== [quoted email]
> I figured out what's going on with LTO build in GCC5 that is compiled with -Os -flto -DUSING_LTO and does not use visibility #pragma.
>
> When compiling with LTO enabled, what happens is that all C source files are transformed during compilation stage to LTO intermediate bytecode (gimple in GCC).
>
> Then when static link (ld) takes place, all LTO intermediate bytecode is sent back to compiler code-generation backend to have machine code generated for it as if all the source code is one big C source file ("whole program optimization").
>
> As a result of this, all the extern symbols become local symbols, like file-level static.  Because it's as if all the code is in one big source file.  Since there is no dynamic linking, there are no more "extern", and all symbols are like file-level static and treated the same.
>
> This is why the LTO build stops emitting GOT loads for size-optimization purposes.  GCC doesn't emit GOT loads for file-level static, and in LTO build they're all like that - so no GOT loads.
>
> But there is still something that fouls this up...
>
> If an extern symbol is defined in assembly source file.
>
> Because assembly source files don't participate in LTO.  They are transformed by assembler into X64 machine code.  During ld, any extern symbol that is defined in an assembly source file and declared and used by C source file is treated as before like external symbol.  Which means code generator can go back to its practice of emitting GOT loads if they reduce code size.
>
> I'm attaching a standalone example of this coded as a UEFI shell application.
>
> - Unpack it to edk2/GccGOTEmitter.
>
> - Add it to ShellPkg/ShellPkg.dsc so it can be built.
> diff --git a/ShellPkg/ShellPkg.dsc b/ShellPkg/ShellPkg.dsc
> --- a/ShellPkg/ShellPkg.dsc
> +++ b/ShellPkg/ShellPkg.dsc
> @@ -134,6 +134,7 @@
>    
>      PerformanceLib|MdeModulePkg/Library/DxeSmmPerformanceLib/DxeSmmPerformanceLib.inf
>  }
> +  GccGOTEmitter/GccGOTEmitter.inf
>
> [BuildOptions]
>  *_*_*_CC_FLAGS = -D DISABLE_NEW_DEPRECATED_INTERFACES
>
> - Build with
> build -a X64 -b RELEASE -m GccGOTEmitter/GccGOTEmitter.inf -p ShellPkg/ShellPkg.dsc -t GCC5
>
> - Result:
> "GenFw" -e UEFI_APPLICATION -o /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.efi /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.dll
> make: *** [GNUmakefile:367: /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.efi] Error 2
> GenFw: ERROR 3000: Invalid
>  /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.dll unsupported ELF EM_X86_64 relocation 0x2a.
> GenFw: ERROR 3000: Invalid
>  /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.dll unsupported ELF EM_X86_64 relocation 0x2a.
>
> relocation 0x2a is R_X86_64_REX_GOTPCRELX which is emitted as part of addq instruction into the GOT in order to implement the pointer arithmetic with slightly smaller code.
>
> There are 2 possible resolutions to this.
> - One is to add the X64 GOTPCREL support to GenFw.
> - The other is to document somewhere that if
>  -- An external symbol is defined in assembly code.
>  -- The symbol is declared and used in C code.
>  -- The C code uses pointer arithmetic on the external symbol or passes it as a function argument.
>  -- Then the external symbol should be declared as "__attribute__((visibility("hidden")))" in the C code.
>
> Note that the 2nd resolution also works in the sample - if the attribute is put on ThunksBase declaration.
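
As a rough illustration of the 2nd resolution, the declaration pattern might look like the sketch below. Only the name ThunksBase comes from the test case. Its element type, the THUNK_SIZE and THUNK_COUNT values and the ThunkAddress helper are assumptions made up for this example. The array is also defined in C here purely so the sketch builds on its own; in the test case the definition lives in an assembly source file, which is the situation where the attribute matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define THUNK_SIZE  16u
#define THUNK_COUNT 4u

/*
 * Declaration as the C code would see a symbol defined in assembly.
 * Hidden visibility tells GCC the symbol is resolved inside this link
 * unit, so it can address it PC-relative instead of through the GOT.
 */
extern uint8_t ThunksBase[] __attribute__((visibility("hidden")));

/*
 * Stand-in definition so the sketch links by itself (assumption: the
 * real definition is in an assembly file and is not visible to GCC).
 */
uint8_t ThunksBase[THUNK_SIZE * THUNK_COUNT];

/*
 * Pointer arithmetic on the address of the external symbol: the usage
 * pattern described above as the one that can produce a GOT-relative
 * addq when the attribute is missing.
 */
uint8_t *ThunkAddress(size_t Index)
{
  assert(Index < THUNK_COUNT);
  return ThunksBase + Index * THUNK_SIZE;
}
```

Whether a given GCC version actually emits the GOT load without the attribute depends on the optimization flags and on where the symbol is defined, so comparing the disassembly with and without the attribute is the reliable check.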