public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Leif Lindholm <leif.lindholm@linaro.org>
Cc: "Ni, Ruiyu" <ruiyu.ni@intel.com>,
	"Tian, Feng" <feng.tian@intel.com>,
	 "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
	"afish@apple.com" <afish@apple.com>,
	 "Gao, Liming" <liming.gao@intel.com>,
	"Kinney, Michael D" <michael.d.kinney@intel.com>
Subject: Re: [PATCH v4 3/5] MdeModulePkg: implement generic PCI I/O driver for non-discoverable devices
Date: Fri, 2 Dec 2016 18:42:58 +0000	[thread overview]
Message-ID: <CAKv+Gu86me0nAijVZbDB-ASzS2joWtQaX7d5O4BdXHHErGR+Bw@mail.gmail.com> (raw)
In-Reply-To: <20161202155637.GB27069@bivouac.eciton.net>

On 2 December 2016 at 15:56, Leif Lindholm <leif.lindholm@linaro.org> wrote:
> On Fri, Dec 02, 2016 at 03:07:39PM +0000, Ard Biesheuvel wrote:
>>
>> > On 2 Dec 2016, at 14:40, Leif Lindholm <leif.lindholm@linaro.org> wrote:
>> >
>> > On Fri, Dec 02, 2016 at 12:54:45PM +0000, Ni, Ruiyu wrote:
>> >>>>> +    Dev->ConfigSpace.Device.Bar[Idx] = (UINT32)Desc->AddrRangeMin;
>> >>>>> +    Dev->BarCount++;
>> >>>>> +
>> >>>>> +    if (Desc->AddrSpaceGranularity == 64) {
>> >>>>> +      Dev->ConfigSpace.Device.Bar[Idx] |= 0x4;
>> >>>>> +      Dev->ConfigSpace.Device.Bar[++Idx] = (UINT32)(Desc->AddrRangeMin >> 32);
>> >>>>
>> >>>> 1. You need to use RShiftU64 otherwise the above code will break the
>> >>>> build process in 32bit.
>> >>>
>> >>> I don't understand how that could cause breakage (and experiments with
>> >>> IA32 and ARM fail to reproduce it with GCC).
>> >>>
>> >>> -  Bar[x] is an array of UINT32
>> >>> -  Desc->AddrRangeMin is UINT64
>> >>> -  The result of (Desc->AddrRangeMin >> 32) is explicitly cast to
>> >>>  UINT32 (the type of Bar[x]).
>> >>>
>> >>> Is the problem related to the shift value being signed?
>> >>> Or does some other compiler refuse to cast a UINT64 to a UINT32?
>> >>
>> >> Desc->AddrRangeMin is UINT64. My experience tells me that
>> >> LShiftU64 or RShiftU64 should be used for shifting operation
>> >> on UINT64.
>> >
>> > This is the bit that confuses me. There is no such automatic
>> > limitation in the C language (if long long is supported at all). So if
>> > we're working around some bug in a specific compiler - I am happy to
>> > do so. I just prefer knowing why rather than cargo-culting.
>> >
>> >> I guess the GCC might cleverly recognizes the ">>32" actually
>> >> picks up the high-32 bits so in 32bit environment, it just uses
>> >> value of the register which stores the high-32 bits.
>> >> Can you check the assembly code GCC generates?
>> >> or simply can you change ">>32" to ">>31" or ">>33" to see
>> >> what happens?
>> >
>> > There is no cleverness required, it's just a defined operation on a
>> > base type. But sure, for example:
>> >
>> > unsigned int shift(long long val)
>> > {
>> >  return (val >> 33);
>> > }
>> >
>> > decodes as
>> > ---
>> > 0000000000000000 <shift>:
>> >   0:     48 89 f8        mov    %rdi,%rax
>> >   3:     48 c1 f8 21              sar    $0x21,%rax
>> >   7:     c3                       retq
>> > ---
>> > (in x86_64, using gcc -O2 to get rid of some redundant stack
>> > operations)
>> >
>> > The only thing that changes if the shift value is modified is that the
>> > 0x21 is changed accordingly.
>> >
>> > Change it to an inline constant, not much changes:
>> >
>> > unsigned int shiftconst(void)
>> > {
>> >  unsigned long long var = 0xaaaabbbbccccddddULL;
>> >  return (var >> 31);
>> > }
>> >
>> > decodes as
>> > ---
>> > 0000000000000000 <shiftconst>:
>> >   0:   55            push   %rbp
>> >   1:   48 89 e5                 mov    %rsp,%rbp
>> >   4:   48 b8 dd dd cc cc bb    movabs $0xaaaabbbbccccdddd,%rax
>> >   b:   bb aa aa
>> >   e:   48 89 45 f8        mov    %rax,-0x8(%rbp)
>> >  12:   48 8b 45 f8              mov    -0x8(%rbp),%rax
>> >  16:   48 c1 e8 1f              shr    $0x1f,%rax
>> >  1a:   5d                       pop    %rbp
>> >  1b:   c3            retq
>> > ---
>> > (with -O0, to prevent the compiler from just doing the arithmetic
>> > compile time)
>> >
>> > Both also compile correctly for IA32 with -m32.
>> >
>> > The point is that there is nothing clever here for the compiler to do:
>> > casting a 64-bit int to a 32-bit int is an operation that discards the
>> > top 32 bits - but the behaviour is entirely defined.
>> >
>> > In fact, all architectures other than IA32/ARM use the Math64.c
>> > implementation of RShiftU64, which simply does the shift.
>> > And ARM does it only for GCC. I'm not sure why, I can't imagine a GCC
>> > version that would do anything more complicated than
>> > ---
>> > 00000000 <shift>:
>> >   0:    e1a000c1    asr    r0, r1, #1
>> >   4:    e12fff1e    bx    lr
>> > ---
>> > (and that's with -march=armv4t onwards)
>>
>> That's a 32-bit operand. Shifting a 64-bit value will result in a
>> call to a compiler intrinsic, which is kind of the point of Riuyu's
>> remark
>
> No, it's not.
> As stated, that is the output from compiling:
>
> unsigned int shift(long long val)
> {
>   return (val >> 33);
> }
>
> Unless you are claiming long long is 32-bit on ARM.
>
> Although I did mess up and make it signed :)
> When fixed, the code generation is identical but with an lsr instead.
>
> The compiler just operates directly on the top 32-bits (in r1), since
> the ones in r0 are irrelevant to the result.
>

OK, but that does mean it is performing optimization up to some point.
So right shifts of 64-bit quantities by 32 bits or more will be
'optimized' by GCC into something that does not rely on the
intrinsics. But this is not universally true for all toolchains (and
variable shifts are probably treated differently as well)

> Change the shift to 31 and the compiler doesn't do anything cute, so
> the code is slightly more obvious:
> ---
> 00000000 <shift>:
>    0:    e1a00fa0       lsr     r0, r0, #31
>    4:    e1800081       orr     r0, r0, r1, lsl #1
>    8:    e12fff1e       bx      lr
> ---
>
> But even if there were intrinsics generated, shouldn't we support the
> intrinsic instead of massaging the C code and hope we can prevent it
> from being generated.
>

This is exactly the issue at hand: EDK2 typically does *not* rely on
such intrinsics, even if we were forced to add them for ARM because
GCC cannot be instructed to inhibit such calls.

Actually, I do not agree with the EDK2 position, i.e., I think there
is no issue relying on routines that are typically provided by the C
runtime. But this is MdeModulePkg, so it is not up to us.


  reply	other threads:[~2016-12-02 18:42 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-25 15:42 [PATCH v4 0/5] MdeModulePkg: add support for non-discoverable devices Ard Biesheuvel
2016-11-25 15:42 ` [PATCH v4 1/5] MdeModulePkg: introduce non-discoverable device protocol Ard Biesheuvel
2016-11-28  1:57   ` Ni, Ruiyu
2016-11-25 15:42 ` [PATCH v4 2/5] MdeModulePkg: introduce helper library to register non-discoverable devices Ard Biesheuvel
2016-11-28  1:58   ` Ni, Ruiyu
2016-11-25 15:42 ` [PATCH v4 3/5] MdeModulePkg: implement generic PCI I/O driver for " Ard Biesheuvel
2016-11-28  1:56   ` Ni, Ruiyu
2016-11-30 14:06     ` Leif Lindholm
2016-12-02 10:09       ` Leif Lindholm
2016-12-02 12:54       ` Ni, Ruiyu
2016-12-02 14:40         ` Leif Lindholm
2016-12-02 15:07           ` Ard Biesheuvel
2016-12-02 15:56             ` Leif Lindholm
2016-12-02 18:42               ` Ard Biesheuvel [this message]
2016-12-05  9:35                 ` Leif Lindholm
2016-11-25 15:42 ` [PATCH v4 4/5] MdeModulePkg/NonDiscoverablePciDeviceDxe: add support for non-coherent DMA Ard Biesheuvel
2016-11-28  2:25   ` Ni, Ruiyu
2016-11-25 15:42 ` [PATCH v4 5/5] Omap35xxPkg/PciEmulation: port to new non-discoverable device infrastructure Ard Biesheuvel
2016-11-27 10:18 ` [PATCH v4 0/5] MdeModulePkg: add support for non-discoverable devices Marcin Wojtas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKv+Gu86me0nAijVZbDB-ASzS2joWtQaX7d5O4BdXHHErGR+Bw@mail.gmail.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox