From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id A43AE21D490EE for ; Thu, 10 Aug 2017 22:26:31 -0700 (PDT) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Aug 2017 22:28:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.41,356,1498546800"; d="scan'208,217";a="122405146" Received: from fmsmsx106.amr.corp.intel.com ([10.18.124.204]) by orsmga002.jf.intel.com with ESMTP; 10 Aug 2017 22:28:50 -0700 Received: from fmsmsx117.amr.corp.intel.com (10.18.116.17) by FMSMSX106.amr.corp.intel.com (10.18.124.204) with Microsoft SMTP Server (TLS) id 14.3.319.2; Thu, 10 Aug 2017 22:28:50 -0700 Received: from shsmsx152.ccr.corp.intel.com (10.239.6.52) by fmsmsx117.amr.corp.intel.com (10.18.116.17) with Microsoft SMTP Server (TLS) id 14.3.319.2; Thu, 10 Aug 2017 22:28:49 -0700 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.183]) by SHSMSX152.ccr.corp.intel.com ([169.254.6.135]) with mapi id 14.03.0319.002; Fri, 11 Aug 2017 13:28:47 +0800 From: "Shi, Steven" To: Laszlo Ersek , edk2-devel-01 CC: Ard Biesheuvel , "Justen, Jordan L" , "Gao, Liming" , "Kinney, Michael D" Thread-Topic: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO Thread-Index: AQHTEjmj11tzzVXSNkSM0oM1tV6CHqJ+mHLg Date: Fri, 11 Aug 2017 05:28:46 +0000 Message-ID: <06C8AB66E78EE34A949939824ABE2B313B560EB2@shsmsx102.ccr.corp.intel.com> References: <20170811003426.2332-1-lersek@redhat.com> <20170811003426.2332-2-lersek@redhat.com> In-Reply-To: <20170811003426.2332-2-lersek@redhat.com> Accept-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 10.0.102.7 dlp-reaction: no-action x-originating-ip: [10.239.127.40] MIME-Version: 1.0 X-Content-Filtered-By: Mailman/MimeDel 2.1.22 Subject: Re: [PATCH 1/1] BaseTools/tools_def.template: revert to large code model for X64/GCC5/LTO X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Aug 2017 05:26:32 -0000 Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi Laszlo, I'm trying to reproduce your boot failure with OVMF in my Ubuntu system, bu= t not succeed. My GCC was built from GCC main trunk code in 20170601, and = my ld linker is version 2.28. Could you try the ld 2.28 with your gcc-7.1 a= nd check whether it works in your side? Or, do you know where can I downlo= ad the gcc-7.1 pre-built binaries? jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ gcc --version gcc (GCC) 8.0.0 20170601 (experimental) jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ ld --version GNU ld (GNU Binutils) 2.28 Below are my trying steps, and the Qemu can boot well into shell: jshi19@jshi19-desktop:~/wksp_efi$ git clone https://github.com/lersek/edk2.= git jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ git checkout gcc7_lto jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ git checkout 7ef0dae092afcfb6= fab7e8372c78097672168c4a jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Build -r jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Conf/tools_def.txt jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Conf/target.txt jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ rm Conf/build_rule.txt jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ source edksetup.sh jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ make -C BaseTools/Source/C/ c= lean jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ make -C BaseTools/Source/C/ jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p OvmfPkg/OvmfPkgX64.d= sc -t GCC5 -DSECURE_BOOT_ENABLE=3DTRUE -a X64 -b DEBUG -DDEBUG_ON_SERIAL_P= ORT -n 5 jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2$ build -p OvmfPkg/OvmfPkgIa32X= 64.dsc -t GCC5 -DSECURE_BOOT_ENABLE=3DTRUE -a IA32 -a X64 -b DEBUG -DDEBUG= _ON_SERIAL_PORT -n 5 jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2/Build/OvmfX64/DEBUG_GCC5/FV$ q= emu-system-x86_64 -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:. = -monitor stdio --enable-kvm jshi19@jshi19-desktop:~/wksp_efi/Laszlo/edk2/Build/Ovmf3264/DEBUG_GCC5/FV$ = qemu-system-x86_64 -bios OVMF.fd -serial file:serial.log -m 512 -hda fat:.= -monitor stdio --enable-kvm Steven Shi Intel\SSG\STO\UEFI Firmware Tel: +86 021-61166522 iNet: 821-6522 > -----Original Message----- > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of > Laszlo Ersek > Sent: Friday, August 11, 2017 8:34 AM > To: edk2-devel-01 > Cc: Ard Biesheuvel ; Justen, Jordan L > ; Gao, Liming ; Kinney, > Michael D > Subject: [edk2] [PATCH 1/1] BaseTools/tools_def.template: revert to large > code model for X64/GCC5/LTO > > Several users have recently reported boot failures with OVMF. The symptom= s > are: blank screen and all VCPUs pegged at 100%. Alex Williamson has found > the commit (via bisection) that exposes the issue. In this patch I'll > attempt to analyze the symptoms and fix the root problem. > > (1) "UefiCpuPkg/Library/MpInitLib" is an IA32/X64 library instance that > provides multiprocessing services. It is built into the CpuMpPei > module (for the PEI phase) and the CpuDxe module (for the DXE and BDS > phases). When either of these modules starts up during boot, it > briefly boots up all of the CPUs, perfoms some counting / > synchronization, and then all the APs (application processors) are > quiesced until further work arrives. > > The APs boot in real mode, and need assembly language init code (a > "reset vector") that gradually takes them into 64-bit mode, before > they can call back into normal C code. The reset vector code is > assembled at build time, but it is copied as data under 1MB (for real > mode execution by the APs) during boot, whenever CpuMpPei or CpuDxe > launches. > > Part of the reset vector code is FPU initialization. The reset vector > in "UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm" calls the assembly > language function InitializeFloatingPointUnits(), which is however > provided by an external library. Normally, when the object files were > linked together, this function call would result in an absolute > reference (and matching relocation, performed at module loading), and > the instance of the reset vector that was copied under 1MB during boo= t > would refer to the same, unchanged, absolute address of > InitializeFloatingPointUnits(), residing in CpuMpPei or CpuDxe. > > This didn't work with X64 XCODE5 builds however. XCODE5 prefers > RIP-relative addressing for X64 (i.e., the assembly language function > call depended on the distance between the call site and the callee). > When the call site was copied under 1MB but > InitializeFloatingPointUnits() would not move, the call broke. > > This was tracked in TianoCore BZ#565 and fixed in commit 3b2928b46987 > ("UefiCpuPkg/MpInitLib: Fix X64 XCODE5/NASM compatibility issues", > 2017-05-17). The solution was to add another function pointer to the > MP_CPU_EXCHANGE_INFO communication area. (The function pointer > would > have size 4 in Ia32 builds, and size 8 in X64 builds.) MpInitLib woul= d > assign the absolute address of InitializeFloatingPointUnits() to the > new field, and the APs would retrieve it -- matching the pointer size > correctly --, and call it. The C assignment can be seen in the > following hunk (from commit 3b2928b46987): > > > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c > b/UefiCpuPkg/Library/MpInitLib/MpLib.c > > index 407c44c95e62..735e099b32f2 100644 > > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c > > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c > > @@ -751,6 +751,8 @@ FillExchangeInfoData ( > > > > ExchangeInfo->EnableExecuteDisable =3D IsBspExecuteDisableEnabled ()= ; > > > > + ExchangeInfo->InitializeFloatingPointUnitsAddress =3D > (UINTN)InitializeFloatingPointUnits; > > + > > // > > // Get the BSP's data of GDT and IDT > > // > > This moved the relocation of the InitializeFloatingPointUnits() > reference (or, with XCODE5, the RIP-relative > InitializeFloatingPointUnits() reference) from the assembly code > (which is copied around) to the C code (which is not copied around). > > (2) Unfortunately, the C-language assignment to > "MP_CPU_EXCHANGE_INFO.InitializeFloatingPointUnitsAddress" is > miscompiled under the following circumstances: > > - the edk2 build target is DEBUG (or RELEASE), > - the edk2 toolchain selection is GCC5, > - (from the above two it follows that -Os and LTO are enabled,) > - MpInitLib is built for X64, > - and the compiler is gcc-7.1 (shipped in Fedora-26 e.g.). > > (If the PEI phase is 64-bit, then MpInitLib breaks in CpuMpPei. If th= e > PEI phase is 32-bit and the DXE phase is 64-bit, then MpInitLib break= s > in CpuDxe. If the DXE phase is 32-bit, then nothing breaks.) > > Below I'll use the NOOPT/GCC5/X64/gcc-7.1 build of CpuMpPei to show > the correct assembly code generated by gcc for the assignment, and > flip the build target to DEBUG for showing the broken assembly code. > > (3) Correct code for the assignment: > > > 5406: 48 8d 15 73 24 00 00 lea 0x2473(%rip),%rdx = # 7880 > > > 540d: 48 8b 45 f8 mov -0x8(%rbp),%rax > > 5411: 48 89 90 84 00 00 00 mov %rdx,0x84(%rax) > > Here the absolute address of InitializeFloatingPointUnits() is loaded > into RDX with RIP-relative addressing. Then RAX is pointed to the > ExchangeInfo structure, and finally RDX is stored into > ExchangeInfo->InitializeFloatingPointUnitsAddress (the field offset i= s > 0x84 in the structure). > > Here the only relocation happens when CpuMpPei is linked. The distanc= e > between the call site and the callee is calculated and encoded in the > LEA instruction. The same distance is valid when CpuMpPei runs, so th= e > PEI core doesn't have to relocate the LEA instruction when it loads > CpuMpPei. > > (4) Incorrect code for the assignment: > > > 2d69: 48 c7 c2 60 58 00 00 mov $0x5860,%rdx > > ... > > 2d7b: 48 89 95 84 00 00 00 mov %rdx,0x84(%rbp) > > ... > > 0000000000005860 : > > 5860: 9b db e3 finit > > This is not RIP-relative addressing. Therefore gcc generates a > relocation record for the address encoded in the MOV instruction, at > offset 0x2d69+0x3 -- initial value being 0x5860: > > > Sections: > > Idx Name Size VMA LMA File of= f Algn > > 0 .text 00006e66 0000000000000240 0000000000000240 000000c= 0 > 2**6 > > CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE > > ... > > RELOCATION RECORDS FOR [.text]: > > OFFSET TYPE VALUE > > ... > > 0000000000002b2c R_X86_64_32S InitializeFloatingPointUnits > > (The offset of the address to relocate is 0x2d69+3 =3D=3D 0x2d6c, whi= ch > equals the .text section base plus the relocation offset, i.e., > 0x240+0x2b2c.) > > This is the only R_X86_64_32S relocation record in the linked-togethe= r > CpuMpPei.debug file (still an ELF file). In the rest of the build > process, the GenFw utility translates CpuMpPei.dll to a PE/COFF image > (CpuMpPei.efi), and converts the ELF relocation record to a PE/COFF > relocation record (of type EFI_IMAGE_REL_BASED_HIGHLOW). > > The EFI_IMAGE_REL_BASED_HIGHLOW PE/COFF relocation is acted upon > first > when CpuMpPei.efi is built into the PEIFV volume, since CpuMpPei.efi > could theoretically execute from flash, and PEIFV has a nonzero base > address. (However, CpuMpPei has a DEPEX on permanent PEI RAM > discovery, so the PEI core will only launch it from permanent PEI RAM= , > in practice.) > > The EFI_IMAGE_REL_BASED_HIGHLOW PE/COFF relocation is acted upon for > the second time when the PEI core loads CpuMpPei into permanent PEI > RAM, and performs the relocations. I confirmed (with debug messages) > that the relocated address (within the instruction) is correct: > > > PeiCore:PeCoffLoaderRelocateImage: Fixup32=3DBFF25D6C FixupData=3D0 > > PeiCore:PeCoffLoaderRelocateImage: *Fixup32=3D0x8487A0 > Adjust=3D0xBF6E00C0 > > Loading PEIM at 0x000BFF23000 EntryPoint=3D0x000BFF271BB CpuMpPei.efi > > The "Fixup32" pointer points to the address constant that should be > relocated (0xBFF25D6C - 0xBFF23000 =3D=3D 0x2D6C). The initial value = of > (*Fixup32) is 0x8487A0, which comes from the composition of PEIFV tha= t > I described above (as "first relocation"). > > After incrementing (*Fixup32) by "Adjust", the final value is > 0xBFF28860. This is the absolute address that is patched into the MOV > instruction that sets RDX for the assignment. This value is correct; > the InitializeFloatingPointUnits() function really exists in CpuMpPei > at this address, in permanent PEI RAM. It is at an offset of 0x5860 > from the load address 0x000BFF23000 (see third line). > > So where do things go wrong? Again, the MOV instruction itself is > wrong. I modified the MpInitLib C code to print the value of the fiel= d > right after the assignment, and I got: > > > AP Loop Mode is 1 > > WakeupBufferStart =3D 9F000, WakeupBufferSize =3D 1000 > > CpuMpPei:FillExchangeInfoData: InitializeFloatingPointUnits @ > 0xFFFFFFFFBFF28860 > > The APs are ejected into outer space when they try to init their FPUs > -- there is nothing at 0xFFFFFFFF_BFF28860; the assignment should > store 0x00000000_BFF28860! > > (5) So open the Intel SDM, and decode the instruction by hand: > > > 2d69: 48 c7 c2 60 58 00 00 mov $0x5860,%rdx > > 0x48 stands for the REX.W prefix (0x40 for REX, plus bit 3 for .W). > Looking up REX.W + 0xc7 (the next byte) under the MOV specification, > we get: > - opcode: REX.W + C7 /0 > - instruction: MOV r/m64, imm32 > - Description: Move imm32 sign extended to 64-bits to r/m64. > > "Sign extended". Our imm32 value, patched into the MOV, is 0xBFF28860 > -- the i440fx machine types may have up to 3GB of 32-bit RAM -- so th= e > sign bit is set. And MOV sign-extends 0xBFF28860 to > 0xFFFFFFFF_BFF28860 in the assignment to RDX. > > So... why on earth generates gcc this MOV instruction, with sign > extension? > > (6) It does because we told it to. From the gcc manual: > > > -mcmodel=3Dsmall > > Generate code for the small code model: the program and its symbols > > must be linked in the lower 2 GB of the address space. Pointers ar= e > > 64 bits. Programs can be statically or dynamically linked. This i= s > > the default code model. > > and please refer to the following commit (message slightly rewrapped > here): > > > commit f49513f666ed25d24bdf3a02a1fdb5d18ae081c0 > > Author: Ard Biesheuvel > > > Date: Sat Jul 16 00:16:11 2016 +0200 > > > > BaseTools/tools_def: switch GCC/X64 to the PIE small model > > > > The ordinary small code model for x86_64 cannot be used in UEFI, > > since it assumes the executable is loaded in the first 2 GB of > > memory. Therefore, we use the large model instead, which can execut= e > > anywhere, but uses absolute 64-bit wide quantities for all symbol > > references, which is costly in terms of code size. > > > > So switch to the PIE small code model, this uses 32-bit relative > > references where possible, but does not make any assumptions about > > the load address (i.e., all absolute symbol references are 64-bits > > wide). Note that, due to the 'protected' visibility pragma > > introduced in an earlier patch, there is no need for the EDK2 build > > system to deal with GOT related ELF relocation types. > > > > Contributed-under: TianoCore Contribution Agreement 1.0 > > Signed-off-by: Ard Biesheuvel > > > Reviewed-by: Jordan Justen > > > Tested-by: Laszlo Ersek > > > Tested-By: Liming Gao > > > Reviewed-by: Liming Gao > > > Apparently, gcc-7.1, with Link Time Optimization enabled, *does* make > assumptions about the load address, despite the fact that we didn't > just replace "-mcmodel=3Dlarge" with "-mcmodel=3Dsmall", but we also = added > "-fpie". > > (7) Commit f49513f666ed has been safe until the advent of gcc-7.1, so do > not revert it whole-sale. Instead, undo it only for the LTO build env > where it might cause problems, by appending "-fno-pie -mcmodel=3Dlarg= e". > > The assignment is then compiled like this: > > > 33d7: 48 ba 60 64 00 00 00 movabs $0x6460,%rdx > > 33de: 00 00 00 > > ... > > 33e5: 48 89 95 84 00 00 00 mov %rdx,0x84(%rbp) > > ... > > 0000000000006460 : > > And the matching relocation is (note the target offset > 0x240+0x3199=3D0x33d9): > > > Sections: > > Idx Name Size VMA LMA File of= f Algn > > 0 .text 00007c5e 0000000000000240 0000000000000240 000000c= 0 > 2**6 > > CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE > > ... > > RELOCATION RECORDS FOR [.text]: > > OFFSET TYPE VALUE > > ... > > 0000000000003199 R_X86_64_64 InitializeFloatingPointUnits > > (8) Size changes by this patch: > > $ build -a X64 -p OvmfPkg/OvmfPkgX64.dsc \ > -t GCC5 -b DEBUG \ > -D NETWORK_IP6_ENABLE -D HTTP_BOOT_ENABLE -D TLS_ENABLE > > Before: > > > SECFV [10%Full] 212992 total, 21616 used, 191376 free > > FVMAIN_COMPACT [43%Full] 3440640 total, 1498200 used, 1942440 free > > DXEFV [47%Full] 10485760 total, 4959448 used, 5526312 free > > PEIFV [21%Full] 917504 total, 194480 used, 723024 free > > After: > > > SECFV [11%Full] 212992 total, 23728 used, 189264 free > > FVMAIN_COMPACT [44%Full] 3440640 total, 1539200 used, 1901440 free > > DXEFV [54%Full] 10485760 total, 5741528 used, 4744232 free > > PEIFV [24%Full] 917504 total, 224304 used, 693200 free > > $ build -a IA32 -a X64 -p OvmfPkg/OvmfPkgIa32X64.dsc \ > -t GCC5 -b DEBUG \ > -D NETWORK_IP6_ENABLE -D HTTP_BOOT_ENABLE -D TLS_ENABLE \ > -D SECURE_BOOT_ENABLE -D SMM_REQUIRE > > Before: > > > SECFV [10%Full] 212992 total, 21536 used, 191456 free > > FVMAIN_COMPACT [48%Full] 3440640 total, 1682008 used, 1758632 free > > DXEFV [57%Full] 10485760 total, 6056504 used, 4429256 free > > PEIFV [22%Full] 917504 total, 202032 used, 715472 free > > After: > > > SECFV [10%Full] 212992 total, 21536 used, 191456 free > > FVMAIN_COMPACT [50%Full] 3440640 total, 1728096 used, 1712544 free > > DXEFV [66%Full] 10485760 total, 6979448 used, 3506312 free > > PEIFV [22%Full] 917504 total, 202032 used, 715472 free > > Cc: Alex Williamson > > Cc: Ard Biesheuvel > > Cc: Jordan Justen > > Cc: Liming Gao > > Cc: Michael Kinney > > Cc: Yonghong Zhu > > Contributed-under: TianoCore Contribution Agreement 1.1 > Signed-off-by: Laszlo Ersek > > --- > BaseTools/Conf/tools_def.template | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/BaseTools/Conf/tools_def.template > b/BaseTools/Conf/tools_def.template > index 1fa3ca3ceae9..2ef495540e5f 100755 > --- a/BaseTools/Conf/tools_def.template > +++ b/BaseTools/Conf/tools_def.template > @@ -5335,10 +5335,10 @@ RELEASE_GCC5_IA32_DLINK_FLAGS =3D > DEF(GCC5_IA32_X64_DLINK_FLAGS) -flto -Os -Wl, > *_GCC5_X64_OBJCOPY_FLAGS =3D > *_GCC5_X64_NASM_FLAGS =3D -f elf64 > > - DEBUG_GCC5_X64_CC_FLAGS =3D DEF(GCC5_X64_CC_FLAGS) -flto - > DUSING_LTO -Os > + DEBUG_GCC5_X64_CC_FLAGS =3D DEF(GCC5_X64_CC_FLAGS) -fno-pie - > mcmodel=3Dlarge -flto -DUSING_LTO -Os > DEBUG_GCC5_X64_DLINK_FLAGS =3D DEF(GCC5_X64_DLINK_FLAGS) -flto -Os > > -RELEASE_GCC5_X64_CC_FLAGS =3D DEF(GCC5_X64_CC_FLAGS) -flto - > DUSING_LTO -Os -Wno-unused-but-set-variable > +RELEASE_GCC5_X64_CC_FLAGS =3D DEF(GCC5_X64_CC_FLAGS) -fno-pie - > mcmodel=3Dlarge -flto -DUSING_LTO -Os -Wno-unused-but-set-variable > RELEASE_GCC5_X64_DLINK_FLAGS =3D DEF(GCC5_X64_DLINK_FLAGS) -flto -Os > > NOOPT_GCC5_X64_CC_FLAGS =3D DEF(GCC5_X64_CC_FLAGS) -O0 > -- > 2.13.1.3.g8be5a757fa67 > > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel