From: "Gao, Liming"
To: Laszlo Ersek, Anthony PERARD
CC: "Justen, Jordan L", "edk2-devel@ml01.01.org", Ard Biesheuvel
Date: Mon, 5 Dec 2016 02:55:07 +0000
Message-ID: <4A89E2EF3DFEDB4C8BFDE51014F606A14B4BCD2B@shsmsx102.ccr.corp.intel.com>
References: <20161201152819.8341-1-anthony.perard@citrix.com> <53c67cb5-e947-8979-7738-288cc83f374b@redhat.com> <20161202160201.GA1848@perard.uk.xensource.com> <9c0c9f43-a297-179f-2d57-fa5d8fab3763@redhat.com>
Subject: Re: [PATCH 0/4] Fix runtime issue in XenBusDxe when compiled with GCC5

Laszlo:
  Thanks for your investigation. In tools_def.txt, there are two tool
chains: GCC49 and GCC5. GCC49 enables Os without lto; GCC5 enables Os and
lto. If a GCC version supports lto well, it can use the GCC5 tool chain.
Otherwise, it can use the GCC49 tool chain. I suggest adding comments to
the GCC5 tool chain to document the known workable GCC versions. From the
comments below, only GCC5.3 and GCC5.4 can work with the GCC5 tool chain
with lto enabled.

Thanks
Liming
-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek
Sent: Sunday, December 04, 2016 1:59 AM
To: Anthony PERARD
Cc: Justen, Jordan L; edk2-devel@ml01.01.org; Gao, Liming; Ard Biesheuvel
Subject: Re: [edk2] [PATCH 0/4] Fix runtime issue in XenBusDxe when compiled with GCC5

On 12/02/16 20:26, Laszlo Ersek wrote:
> On 12/02/16 17:02, Anthony PERARD wrote:
>> On Thu, Dec 01, 2016 at 07:43:24PM +0100, Laszlo Ersek wrote:
>>> On 12/01/16 16:28, Anthony PERARD wrote:
>>>> Hi,
>>>>
>>>> That might be only with the Xen part of OVMF but now that the GCC5
>>>> toolchain is used with my gcc (6.2.1 20160830, Arch Linux), OVMF
>>>> fails to boot in Xen guests.
>>>>
>> [...]
>>>>
>>>> Removing the gcc option -flto in only the XenBusDxe module makes
>>>> OVMF boot.
>>>>
>>>> While trying to debug that, I've added some debug prints (in this
>>>> module and in XenPvBlkDxe), and the exception could change and
>>>> become a "page fault" instead, or even an assert failure in the
>>>> PrintLib, that was the ASSERT(Buffer != NULL) at I think
>>>> MdePkg/Library/BasePrintLib/PrintLibInternal.c:366
>>>>
>>>> Adding EFIAPI to internal functions in XenBusDxe makes things work
>>>> again. My guess is that gcc would bypass (optimise) an exported
>>>> function and call directly an internal one but without reordering
>>>> the arguments (EFIAPI vs nothing).
>>>>
>>>> Does that make sense?
>>>
>>> If "-b NOOPT" works for you, I'd prefer that as a temporary solution
>>> (until the root cause is found and addressed) to the XenBusDxe patches.
>>
>> That works, using GCC49 (with gcc 6.2.1) works as well.
>>
>>> Hrpmf, wait a second, I do see something interesting: in this series
>>> you *are* modifying APIs declared in a library class header (namely
>>> "OvmfPkg/Include/Library/XenHypercallLib.h"). Such functions (public
>>> libraries) *are* required to specify EFIAPI.
>>>
>>> What happens if you apply patch #1 only?
>>
>> With only XenHypercallLib changes, the error is the same.
>>
>> But I did find the minimum change needed, it involves a function with
>> a VA_LIST as argument.
>>
>> With only the following patch, OVMF works again.
>>
>> diff --git a/OvmfPkg/XenBusDxe/XenStore.c b/OvmfPkg/XenBusDxe/XenStore.c
>> index 1666c4b..85b0956 100644
>> --- a/OvmfPkg/XenBusDxe/XenStore.c
>> +++ b/OvmfPkg/XenBusDxe/XenStore.c
>> @@ -1307,6 +1307,7 @@ XenStoreTransactionEnd (
>>  }
>>
>>  XENSTORE_STATUS
>> +EFIAPI
>>  XenStoreVSPrint (
>>    IN CONST XENSTORE_TRANSACTION *Transaction,
>>    IN CONST CHAR8 *DirectoryPath,
>> diff --git a/OvmfPkg/XenBusDxe/XenStore.h b/OvmfPkg/XenBusDxe/XenStore.h
>> index c9d4c65..33bb647 100644
>> --- a/OvmfPkg/XenBusDxe/XenStore.h
>> +++ b/OvmfPkg/XenBusDxe/XenStore.h
>> @@ -209,6 +209,7 @@ XenStoreSPrint (
>>    indicating the type of write failure.
>>  **/
>>  XENSTORE_STATUS
>> +EFIAPI
>>  XenStoreVSPrint (
>>    IN CONST XENSTORE_TRANSACTION *Transaction,
>>    IN CONST CHAR8 *DirectoryPath,
>>    IN CONST CHAR8 *Node,
>>    IN CONST CHAR8 *FormatString,
>>    IN VA_LIST Marker
>>    );
>>
>>
>> I think the exception happen when this function is called via
>> XENBUS_PROTOCOL->XsPrintf() from XenPvBlockFrontInitialization() in
>> OvmfPkg/XenPvBlkDxe/BlockFront.c
>>
>
> It used to be a known requirement / limitation that all functions with
> variable argument lists had to be EFIAPI, regardless of cross-module
> use. However, commit 48d5f9a551a93acb45f272dda879b0ab5a504e36 changed
> that, and varargs should "just work" now. I suspect this is a
> __builtin_ms_va_* regression in gcc-6. Thank you for narrowing it down.
> It might make sense to report a bug in the upstream gcc tracker.
>
> ... Oh wow, this is a known gcc bug! See:
>
> https://lists.01.org/pipermail/edk2-devel/2016-August/001018.html
>
> Upstream gcc BZ
> was apparently solved for "Target Milestone: 6.3" (your version is 6.2.1).
> So we'll either need a GCC6 toolchain in BaseTools that drops -flto,
> in order to work around this gcc issue, or we'll have to ask gcc-6
> users to use at least gcc-6.3.
>
> Oh wait, gcc-6.3 hasn't been released yet. We need the BaseTools
> workaround then.

I think I got confused in parts of the above; I got some details wrong.

Namely, commit 48d5f9a551a9 did not remove the requirement/limitation
that all varargs functions have to be EFIAPI. Said commit only changed
how the VA_*() macros would be implemented.

The two caller functions of XenStoreVSPrint(), namely XenStoreSPrint()
and XenBusXenStoreSPrint(), are varargs functions, but they are already
EFIAPI. So the requirement/limitation (which was unaffected by
48d5f9a551a9) is actually satisfied / considered in XenBusDxe.

The XenStoreVSPrint() function, which you identified as the breaking
part, is *not* a varargs function itself, so it needn't be EFIAPI. It
simply receives a VA_LIST parameter (which is "__builtin_ms_va_list",
from commit 48d5f9a551a9), and (a) copies it with VA_COPY() for passing
the copy to SPrintLengthAsciiFormat(), (b) passes the original parameter
to AsciiVSPrint(). In turn both of those functions call the common
BasePrintLibSPrintMarker() function.

Comment says,

> This is bug report that the specialized
> __builtin_ms_va_{list,start,end,copy} builtins have stopped working
> when -flto is used. They worked until gcc 5.3, both with or without
> -flto. In gcc 6.1 with -flto, the canonical iterator __builtin_va_arg
> ignores them and works on a sysv_va_list. To be precise, it's
> __builtin_va_arg in the context of -flto that's broken, not the
> specialized builtins. __builtin_va_arg has always been a polymorphic
> builtin that changes its behavior based on the type of va_list it's
> given as an argument. Without this polymorphic behavior, there's no
> way to iterate over an ms_va_list.
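
To make the call pattern concrete, here is a minimal sketch in portable
ISO C. It is an illustration only, not code from the edk2 tree: the names
format_va() and format_args() are hypothetical stand-ins for
XenStoreVSPrint() and XenStoreSPrint(), the standard va_list / va_copy /
vsnprintf stand in for VA_LIST / VA_COPY / the PrintLib functions, and
nothing here uses the MS ABI builtins, so it does not reproduce the gcc
bug. It only shows why the inner function is not itself a varargs
function: it receives an already-started list, copies it for a sizing
pass, and consumes the original for the formatting pass.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for XenStoreVSPrint(): NOT a varargs function.
 * It receives a va_list, copies it for a length-only pass, then uses
 * the original list for the real formatting pass. */
static char *
format_va (const char *fmt, va_list marker)
{
  va_list marker2;
  int     len;
  char    *buf;

  /* (a) Walk a copy of the list to learn the required length. */
  va_copy (marker2, marker);
  len = vsnprintf (NULL, 0, fmt, marker2);
  va_end (marker2);
  if (len < 0) {
    return NULL;
  }

  buf = malloc ((size_t)len + 1);
  if (buf == NULL) {
    return NULL;
  }

  /* (b) Walk the original list to produce the string. */
  vsnprintf (buf, (size_t)len + 1, fmt, marker);
  return buf;
}

/* Hypothetical stand-in for XenStoreSPrint(): the varargs entry point
 * that starts the list and hands it to the non-varargs worker. */
static char *
format_args (const char *fmt, ...)
{
  va_list marker;
  char    *result;

  va_start (marker, fmt);
  result = format_va (fmt, marker);
  va_end (marker);
  return result;
}
```

A caller would use format_args ("%s/%d", "device/vbd", 768) and free()
the returned buffer. Both passes over the list must agree on how the
arguments are laid out, which is the property the quoted bug report
describes as breaking for ms_va_list under -flto.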

Apparently, when BasePrintLibSPrintMarker() finally calls VA_ARG() (==
__builtin_va_arg(), from commit 48d5f9a551a9) on Marker / Marker2, with
LTO enabled, __builtin_va_arg() fails to notice what context VaListMarker
comes from:
- __builtin_ms_va_start() in XenStoreSPrint() and XenBusXenStoreSPrint(),
  or
- __builtin_ms_va_copy() in XenStoreVSPrint().

So I think we *are* being hit by gcc BZ#70955, and making
XenStoreVSPrint() EFIAPI only masks the issue.

Comment seems relevant:

> The change with GCC 6 is that the builtins are now lowered during
> link-time optimization rather than at compile-time. Thus the abi
> selection bits are possibly not transfered correctly (type merging?).
> I remember the business was quite ugly, but eventually we just miss to
> properly transfer the function attribute.

The end result for edk2 remains the same (= BaseTools should work around
this gcc issue with a new GCC6 toolchain that drops -flto, unless gcc-6.3
is about to become available to users real quick). I just wanted to point
out that my earlier statement "commit 48d5f9a551a9 had removed the need
for varargs functions to be EFIAPI" was incorrect -- varargs functions
still must be EFIAPI (and XenBusDxe conforms, see XenStoreSPrint() and
XenBusXenStoreSPrint()).

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel