public inbox for devel@edk2.groups.io
From: Laszlo Ersek <lersek@redhat.com>
To: Anthony PERARD <anthony.perard@citrix.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>,
	"Gao, Liming" <liming.gao@intel.com>,
	"Zhu, Yonghong" <yonghong.zhu@intel.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	edk2-devel@ml01.01.org, Rebecca Cran <rebecca@bluestop.org>,
	Konrad Rzeszutek Wilk <konrad@kernel.org>
Subject: Re: [PATCH 0/4] Fix runtime issue in XenBusDxe when compiled with GCC5
Date: Tue, 21 Feb 2017 18:07:15 +0100
Message-ID: <97214320-6054-8034-7667-6edde4debd80@redhat.com>
In-Reply-To: <20170221163922.GC1867@perard.uk.xensource.com>

CC Rebecca & Konrad

On 02/21/17 17:39, Anthony PERARD wrote:
> On Sat, Dec 03, 2016 at 06:59:28PM +0100, Laszlo Ersek wrote:
>> On 12/02/16 20:26, Laszlo Ersek wrote:
>>> On 12/02/16 17:02, Anthony PERARD wrote:
>>>> On Thu, Dec 01, 2016 at 07:43:24PM +0100, Laszlo Ersek wrote:
>>>>> On 12/01/16 16:28, Anthony PERARD wrote:
>>>>>> Hi,
>>>>>>
>>>>>> That might be only with the Xen part of OVMF, but now that the GCC5
>>>>>> toolchain is used with my gcc (6.2.1 20160830, Arch Linux), OVMF fails
>>>>>> to boot in Xen guests.
>>>>>>
>>>> [...]
>>>>>>
>>>>>> Removing the gcc option -flto in only the XenBusDxe module makes OVMF
>>>>>> boot.
>>>>>>
>>>>>> While trying to debug that, I've added some debug prints (in this module
>>>>>> and in XenPvBlkDxe), and the exception could change and become a "page
>>>>>> fault" instead, or even an assert failure in the PrintLib, that was the
>>>>>> ASSERT(Buffer != NULL) at I think
>>>>>> MdePkg/Library/BasePrintLib/PrintLibInternal.c:366
>>>>>>
>>>>>> Adding EFIAPI to internal functions in XenBusDxe makes things work
>>>>>> again.  My guess is that gcc would bypass (optimise) an exported
>>>>>> function and directly call an internal one, but without reordering the
>>>>>> arguments (EFIAPI vs nothing).
>>>>>>
>>>>>> Does that make sense?
>>>>>
>>>>> If "-b NOOPT" works for you, I'd prefer that as a temporary solution
>>>>> (until the root cause is found and addressed) to the XenBusDxe patches.
>>>>
>>>> That works, using GCC49 (with gcc 6.2.1) works as well.
>>>>
>>>>> Hrpmf, wait a second, I do see something interesting: in this series you
>>>>> *are* modifying APIs declared in a library class header (namely
>>>>> "OvmfPkg/Include/Library/XenHypercallLib.h"). Such functions (public
>>>>> libraries) *are* required to specify EFIAPI.
>>>>>
>>>>> What happens if you apply patch #1 only?
>>>>
>>>> With only XenHypercallLib changes, the error is the same.
>>>>
>>>> But I did find the minimum change needed; it involves a function with a
>>>> VA_LIST as argument.
>>>>
>>>> With only the following patch, OVMF works again.
>>>>
>>>> diff --git a/OvmfPkg/XenBusDxe/XenStore.c b/OvmfPkg/XenBusDxe/XenStore.c
>>>> index 1666c4b..85b0956 100644
>>>> --- a/OvmfPkg/XenBusDxe/XenStore.c
>>>> +++ b/OvmfPkg/XenBusDxe/XenStore.c
>>>> @@ -1307,6 +1307,7 @@ XenStoreTransactionEnd (
>>>>  }
>>>>  
>>>>  XENSTORE_STATUS
>>>> +EFIAPI
>>>>  XenStoreVSPrint (
>>>>    IN CONST XENSTORE_TRANSACTION *Transaction,
>>>>    IN CONST CHAR8           *DirectoryPath,
>>>> diff --git a/OvmfPkg/XenBusDxe/XenStore.h b/OvmfPkg/XenBusDxe/XenStore.h
>>>> index c9d4c65..33bb647 100644
>>>> --- a/OvmfPkg/XenBusDxe/XenStore.h
>>>> +++ b/OvmfPkg/XenBusDxe/XenStore.h
>>>> @@ -209,6 +209,7 @@ XenStoreSPrint (
>>>>             indicating the type of write failure.
>>>>  **/
>>>>  XENSTORE_STATUS
>>>> +EFIAPI
>>>>  XenStoreVSPrint (
>>>>    IN CONST XENSTORE_TRANSACTION *Transaction,
>>>>    IN CONST CHAR8           *DirectoryPath,
>>>>    IN CONST CHAR8           *Node,
>>>>    IN CONST CHAR8           *FormatString,
>>>>    IN VA_LIST               Marker
>>>>    );
>>>>
>>>>
>>>> I think the exception happens when this function is called via
>>>> XENBUS_PROTOCOL->XsPrintf() from XenPvBlockFrontInitialization() in
>>>> OvmfPkg/XenPvBlkDxe/BlockFront.c
>>>>
>>>
>>> It used to be a known requirement / limitation that all functions with
>>> variable argument lists had to be EFIAPI, regardless of cross-module
>>> use. However, commit 48d5f9a551a93acb45f272dda879b0ab5a504e36 changed
>>> that, and varargs should "just work" now. I suspect this is a
>>> __builtin_ms_va_* regression in gcc-6. Thank you for narrowing it down.
>>> It might make sense to report a bug in the upstream gcc tracker.
>>>
>>> ... Oh wow, this is a known gcc bug! See:
>>>
>>> https://lists.01.org/pipermail/edk2-devel/2016-August/001018.html
>>>
>>> Upstream gcc BZ <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955> was
>>> apparently solved for "Target Milestone: 6.3" (your version is 6.2.1).
>>> So we'll either need a GCC6 toolchain in BaseTools that drops -flto, in
>>> order to work around this gcc issue, or we'll have to ask gcc-6 users to
>>> use at least gcc-6.3.
>>>
>>> Oh wait, gcc-6.3 hasn't been released yet. We need the BaseTools
>>> workaround then.
>>
>> I think I got confused in parts of the above; I got some details wrong.
>> Namely, commit 48d5f9a551a9 did not remove the requirement/limitation
>> that all varargs functions have to be EFIAPI. Said commit only changed
>> how the VA_*() macros would be implemented.
>>
>> The two caller functions of XenStoreVSPrint(), namely XenStoreSPrint()
>> and XenBusXenStoreSPrint(), are varargs functions, but they are already
>> EFIAPI. So the requirement/limitation (which was unaffected by
>> 48d5f9a551a9) is actually satisfied / considered in XenBusDxe.
>>
>> The XenStoreVSPrint() function, which you identified as the breaking
>> part, is *not* a varargs function itself, so it needn't be EFIAPI. It
>> simply receives a VA_LIST parameter (which is "__builtin_ms_va_list",
>> from commit 48d5f9a551a9), and (a) copies it with VA_COPY() for passing
>> the copy to SPrintLengthAsciiFormat(), (b) passes the original parameter
>> to AsciiVSPrint(). In turn both of those functions call the common
>> BasePrintLibSPrintMarker() function.
>>
>> Comment <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955#c6> says,
>>
>>> This is bug report that the specialized
>>> __builtin_ms_va_{list,start,end,copy} builtins have stopped working
>>> when -flto is used.  They worked until gcc 5.3, both with or without
>>> -flto.  In gcc 6.1 with -flto, the canonical iterator __builtin_va_arg
>>> ignores them and works on a sysv_va_list.  To be precise, it's
>>> __builtin_va_arg in the context of -flto that's broken, not the
>>> specialized builtins.  __builtin_va_arg has always been a polymorphic
>>> builtin that changes its behavior based on the type of va_list it's
>>> given as an argument.  Without this polymorphic behavior, there's no
>>> way to iterate over an ms_va_list.
>>
>> Apparently, when BasePrintLibSPrintMarker() finally calls VA_ARG() (==
>> __builtin_va_arg(), from commit 48d5f9a551a9) on Marker / Marker2, with
>> LTO enabled, __builtin_va_arg() fails to notice what context
>> VaListMarker comes from:
>> - __builtin_ms_va_start() in XenStoreSPrint() and XenBusXenStoreSPrint(), or
>> - __builtin_ms_va_copy() in XenStoreVSPrint().
>>
>> So I think we *are* being hit by gcc BZ#70955, and making
>> XenStoreVSPrint() EFIAPI only masks the issue. Comment
>> <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955#c7> seems relevant:
>>
>>> The change with GCC 6 is that the builtins are now lowered during
>>> link-time optimization rather than at compile-time.  Thus the abi
>>> selection bits are possibly not transfered correctly (type merging?).
>>> I remember the business was quite ugly, but eventually we just miss to
>>> properly transfer the function attribute.
>>
>> The end result for edk2 remains the same (= BaseTools should work around
>> this gcc issue with a new GCC6 toolchain that drops -flto, unless
>> gcc-6.3 is about to become available to users real quick). I just wanted
>> to point out that my earlier statement "commit 48d5f9a551a9 had removed
>> the need for varargs functions to be EFIAPI" was incorrect -- varargs
>> functions still must be EFIAPI (and XenBusDxe conforms, see
>> XenStoreSPrint() and XenBusXenStoreSPrint()).
> 
> Hi Laszlo,
> 
> Now that gcc 6.3 is out, the bug described in the thread strikes again.
> Building OVMF with -flto results in a Page-Fault or General Protection
> fault, due to the way va_args are used in XenStoreVSPrint().
> 
> Also, I've now tried to build OVMF with gcc 5.4, with the same result: using
> -flto results in a firmware that does not work.
> 
> 
> I've tried to create a small programme that uses va_args in the same
> way, and compile-tested it with different gcc versions (gcc 4.9.2, gcc 5.4,
> gcc 6.3). Depending on the options used, it either works or it doesn't:
> 
> Doesn't work (prog segfaults or gives wrong output):
> gcc -o prog va_main.c va_test.c
> gcc -o prog -flto va_main.c va_test.c
> gcc -Os -o prog -flto va_main.c va_test.c
> 
> Work:
> gcc -Os -o prog va_main.c va_test.c
> 
> I'll attach va_main.c and va_test.c.
> 
> So, should I add EFIAPI to XenStoreVSPrint, as it is using VA_COPY?
> 

Hm, please help me jog my memory...

If I remember correctly, this is still a GCC bug, one that we suppressed for gcc-6.0 through gcc-6.2 with your patch, as follows:

> commit 432f1d83f77acf92d52ef18d2cee6dbf7c5b9b86
> Author: Anthony PERARD <anthony.perard@citrix.com>
> Date:   Tue Dec 6 12:03:25 2016 +0000
>
>     OvmfPkg/build.sh: Use GCC49 toolchains with GCC 6.[0-2]
>
>     The goal of the patch is to avoid using -flto with GCC 6.0 to 6.2.
>
>     This is to workaround a GCC bug:
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70955
>
>     Contributed-under: TianoCore Contribution Agreement 1.0
>     Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
>     Reviewed-by: Laszlo Ersek <lersek@redhat.com>
>     Regression-tested-by: Laszlo Ersek <lersek@redhat.com>
>
> diff --git a/OvmfPkg/build.sh b/OvmfPkg/build.sh
> index 95fe8fb07647..b6e936056ca0 100755
> --- a/OvmfPkg/build.sh
> +++ b/OvmfPkg/build.sh
> @@ -102,7 +102,7 @@ case `uname` in
>        4.8.*)
>          TARGET_TOOLS=GCC48
>          ;;
> -      4.9.*)
> +      4.9.*|6.[0-2].*)
>          TARGET_TOOLS=GCC49
>          ;;
>        *)

Do I understand correctly that the gcc bug has not been fixed in gcc-6.3, and that it causes problems again because the version pattern above does not match gcc-6.3, so we no longer suppress it there?

You also mention gcc-5.4 as problematic. I think we haven't received such reports about gcc-5 versions up to and including gcc-5.3 (that's why GCC5 is the default selection in "OvmfPkg/build.sh"). Do you mean that the gcc bug has now been "backported" from the gcc-6 series to the gcc-5 series (starting with gcc-5.4)?

If that's the case, then I suggest flipping "OvmfPkg/build.sh" from black-listing gcc versions for -flto to white-listing. In other words, assume that -flto is generally broken with GCC, except for a few known versions: 5.0 through 5.3 inclusive. Those versions should trigger the use of the GCC5 toolchain, and everything else (5.4+, 6.*, 4.9.*) should use GCC49.
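
As a minimal sketch of the white-listing idea (this is not the actual "OvmfPkg/build.sh" change; the function name select_target_tools is made up for illustration, and the existing arms for gcc 4.8 and earlier are left out, so the sketch only makes sense for gcc 4.9 and later):

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: map a gcc version string
# (e.g. "5.3.0") to an edk2 toolchain tag.  -flto (that is, GCC5) is
# selected only for the versions assumed to be known-good in this thread.
select_target_tools () {
  case "$1" in
    5.[0-3].*)
      # White-listed: gcc 5.0 through 5.3 inclusive.
      echo GCC5
      ;;
    *)
      # Everything else (4.9.*, 5.4+, 6.*): fall back to GCC49,
      # which does not use -flto.
      echo GCC49
      ;;
  esac
}
```

The result could then be assigned with something like TARGET_TOOLS=$(select_target_tools "$(gcc -dumpversion)"), assuming that gcc prints a full "major.minor.patch" string there; a shorter version string would fall through to the GCC49 arm.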

I don't feel comfortable about adding EFIAPI to XenStoreVSPrint just because it takes a VA_LIST parameter (note: it is *not* a varargs function itself!); the same issue might hit elsewhere in the edk2 tree at any time, outside of OvmfPkg too.

Would the gcc white-listing work for you?

Note that the white-listing would practically undo Konrad's commit 2667ad40919a ("OvmfPkg/build.sh: Make GCC5 the default toolchain, catch GCC43 and earlier", 2016-11-23), but given the recent gcc developments (gcc-6.3 has not fixed the gcc bug, and the bug has even surfaced in gcc-5.4), I think it would be justified.

Thanks
Laszlo

