From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id C28F081C62 for ; Sat, 3 Dec 2016 09:59:33 -0800 (PST) Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0718980F6B; Sat, 3 Dec 2016 17:59:33 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-45.phx2.redhat.com [10.3.116.45]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uB3HxTut031359; Sat, 3 Dec 2016 12:59:30 -0500 To: Anthony PERARD References: <20161201152819.8341-1-anthony.perard@citrix.com> <53c67cb5-e947-8979-7738-288cc83f374b@redhat.com> <20161202160201.GA1848@perard.uk.xensource.com> <9c0c9f43-a297-179f-2d57-fa5d8fab3763@redhat.com> Cc: Jordan Justen , "Gao, Liming" , "Zhu, Yonghong" , Ard Biesheuvel , edk2-devel@ml01.01.org From: Laszlo Ersek Message-ID: Date: Sat, 3 Dec 2016 18:59:28 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: <9c0c9f43-a297-179f-2d57-fa5d8fab3763@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Sat, 03 Dec 2016 17:59:33 +0000 (UTC) Subject: Re: [PATCH 0/4] Fix runtime issue in XenBusDxe when compiled with GCC5 X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 03 Dec 2016 17:59:33 -0000 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit On 12/02/16 20:26, Laszlo Ersek wrote: > On 12/02/16 17:02, Anthony PERARD wrote: >> On Thu, Dec 01, 2016 at 07:43:24PM +0100, Laszlo Ersek wrote: >>> On 12/01/16 16:28, Anthony PERARD wrote: >>>> Hi, >>>> >>>> That might be only with the Xen part of OVMF but now that the GCC5 >>>> toolchains is used with my gcc (6.2.1 20160830, Arch Linux), OVMF fail >>>> to boot in Xen guests. >>>> >> [...] >>>> >>>> Removing the gcc option -flto in only the XenBusDxe module makes OVMF >>>> boot. >>>> >>>> While trying to debug that, I've added some debug prints (in this module >>>> and in XenPvBlkDxe), and the exception could change and become a "page >>>> fault" instead, or even an assert failure in the PrintLib, that was the >>>> ASSERT(Buffer != NULL) at I think >>>> MdePkg/Library/BasePrintLib/PrintLibInternal.c:366 >>>> >>>> Adding EFIAPI to internal functions in XenBusDxe makes things work >>>> again. My guest is that gcc would bypass (optimise) an exported >>>> functions and call directly an internal one but without reordering the >>>> arguments (EFIAPI vs nothing). >>>> >>>> Does that make sense? >>> >>> If "-b NOOPT" works for you, I'd prefer that as a temporary solution >>> (until the root cause is found and addressed) to the XenBusDxe patches. >> >> That works, using GCC49 (with gcc 6.2.1) works as well. >> >>> Hrpmf, wait a second, I do see something interesting: in this series you >>> *are* modifying APIs declared in a library class header (namely >>> "OvmfPkg/Include/Library/XenHypercallLib.h"). Such functions (public >>> libraries) *are* required to specify EFIAPI. >>> >>> What happens if you apply patch #1 only? >> >> With only XenHypercallLib changes, the error is the same. >> >> But I did find the minimum change needed, it envolve a function with a >> VA_LIST as argument. >> >> With only the following patch, OVMF works again. >> >> diff --git a/OvmfPkg/XenBusDxe/XenStore.c b/OvmfPkg/XenBusDxe/XenStore.c >> index 1666c4b..85b0956 100644 >> --- a/OvmfPkg/XenBusDxe/XenStore.c >> +++ b/OvmfPkg/XenBusDxe/XenStore.c >> @@ -1307,6 +1307,7 @@ XenStoreTransactionEnd ( >> } >> >> XENSTORE_STATUS >> +EFIAPI >> XenStoreVSPrint ( >> IN CONST XENSTORE_TRANSACTION *Transaction, >> IN CONST CHAR8 *DirectoryPath, >> diff --git a/OvmfPkg/XenBusDxe/XenStore.h b/OvmfPkg/XenBusDxe/XenStore.h >> index c9d4c65..33bb647 100644 >> --- a/OvmfPkg/XenBusDxe/XenStore.h >> +++ b/OvmfPkg/XenBusDxe/XenStore.h >> @@ -209,6 +209,7 @@ XenStoreSPrint ( >> indicating the type of write failure. >> **/ >> XENSTORE_STATUS >> +EFIAPI >> XenStoreVSPrint ( >> IN CONST XENSTORE_TRANSACTION *Transaction, >> IN CONST CHAR8 *DirectoryPath, >> IN CONST CHAR8 *Node, >> IN CONST CHAR8 *FormatString, >> IN VA_LIST Marker >> ); >> >> >> I think the exception happen when this function is called via >> XENBUS_PROTOCOL->XsPrintf() from XenPvBlockFrontInitialization() in >> OvmfPkg/XenPvBlkDxe/BlockFront.c >> > > It used to be a known requirement / limitation that all functions with > variable argument lists had to be EFIAPI, regardless of cross-module > use. However, commit 48d5f9a551a93acb45f272dda879b0ab5a504e36 changed > that, and varargs should "just work" now. I suspect this is a > __builtin_ms_va_* regression in gcc-6. Thank you for narrowing it down. > It might make sense to report a bug in the upstream gcc tracker. > > ... Oh wow, this is a known gcc bug! See: > > https://lists.01.org/pipermail/edk2-devel/2016-August/001018.html > > Upstream gcc BZ was > apparently solved for "Target Milestone: 6.3" (your version is 6.2.1). > So we'll either need a GCC6 toolchain in BaseTools that drops -flto, in > order to work around this gcc issue, or we'll have to ask gcc-6 users to > use at least gcc-6.3. > > Oh wait, gcc-6.3 hasn't been released yet. We need the BaseTools > workaround then. I think I got confused in parts of the above; I got some details wrong. Namely, commit 48d5f9a551a9 did not remove the requirement/limitation that all varargs functions have to be EFIAPI. Said commit only changed how the VA_*() macros would be implemented. The two caller functions of XenStoreVSPrint(), namely XenStoreSPrint() and XenBusXenStoreSPrint(), are varargs functions, but they are already EFIAPI. So the requirement/limitation (which was unaffected by 48d5f9a551a9) is actually satisfied / considered in XenBusDxe. The XenStoreVSPrint() function, which you identified as the breaking part, is *not* a varargs function itself, so it needn't be EFIAPI. It simply receives a VA_LIST parameter (which is "__builtin_ms_va_list", from commit 48d5f9a551a9), and (a) copies it with VA_COPY() for passing the copy to SPrintLengthAsciiFormat(), (b) passes the original parameter to AsciiVSPrint(). In turn both of those functions call the common BasePrintLibSPrintMarker() function. Comment says, > This is bug report that the specialized > __builtin_ms_va_{list,start,end,copy} builtins have stopped working > when -flto is used. They worked until gcc 5.3, both with or without > -flto. In gcc 6.1 with -flto, the canonical iterator __builtin_va_arg > ignores them and works on a sysv_va_list. To be precise, it's > __builtin_va_arg in the context of -flto that's broken, not the > specialized builtins. __builtin_va_arg has always been a polymorphic > builtin that changes its behavior based on the type of va_list it's > given as an argument. Without this polymorphic behavior, there's no > way to iterate over an ms_va_list. Apparently, when BasePrintLibSPrintMarker() finally calls VA_ARG() (== __builtin_va_arg(), from commit 48d5f9a551a9) on Marker / Marker2, with LTO enabled, __builtin_va_arg() fails to notice what context VaListMarker comes from: - __builtin_ms_va_start() in XenStoreSPrint() and XenBusXenStoreSPrint(), or - __builtin_ms_va_copy() in XenStoreVSPrint(). So I think we *are* being hit by gcc BZ#70955, and making XenStoreVSPrint() EFIAPI only masks the issue. Comment seems relevant: > The change with GCC 6 is that the builtins are now lowered during > link-time optimization rather than at compile-time. Thus the abi > selection bits are possibly not transfered correctly (type merging?). > I remember the business was quite ugly, but eventually we just miss to > properly transfer the function attribute. The end result for edk2 remains the same (= BaseTools should work around this gcc issue with a new GCC6 toolchain that drops -flto, unless gcc-6.3 is about to become available to users real quick). I just wanted to point out that my earlier statement "commit 48d5f9a551a9 had removed the need for varargs functions to be EFIAPI" was incorrect -- varargs functions still must be EFIAPI (and XenBusDxe conforms, see XenStoreSPrint() and XenBusXenStoreSPrint()). Thanks Laszlo