From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-io0-x232.google.com (mail-io0-x232.google.com [IPv6:2607:f8b0:4001:c06::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id A5DEA81F52 for ; Fri, 2 Dec 2016 10:42:59 -0800 (PST) Received: by mail-io0-x232.google.com with SMTP id a124so496934692ioe.2 for ; Fri, 02 Dec 2016 10:42:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=rJQCsr2ZPJF/YIdwr+V5y/28L5ylWCEVLTP2ig83x8w=; b=RLdMv0CyCyLkPjCfUllw4+ZiLPT2uOc9jbmdVtVrw1qg9ZNGoEH/z2myBgRN21SuE2 gcIa8O1EEjhYUzNGycBeT6L8/vt5BhC3sHxQDo7b1n0oVhWT/3seyUGw+/ZHtXLtm2G5 1L6w1aO4H8ZpLYlCCuW3sdhtH81CTO/BhA7zw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=rJQCsr2ZPJF/YIdwr+V5y/28L5ylWCEVLTP2ig83x8w=; b=djkUjyScE79qPjdvMEgZqQpKzsYfNxjrlnF7CoQkfzfYYByfK5MULhXobYEm20bB/1 TGjXQgPGgySt/Xbk4xtGp8G8TZooMSdBcEiWsVhu0iV2/QiLuTyVY8ZLfJmP1BzC971g sl1E6Uj6U65esDadhx8GgfYt/5Ro71/zmGV4lnZ5p66t8zHDnZMfBNMbLYCPOAwMAEtB QtLliUW1tIoN1OfFwIZ6gCYoXKr8drsqulBm6hrq34iXaO2AXoD+6ZvVMsdN7/J08J26 fsNHrYhicK2IgnL7PcxU8KZZG/oIMrEARTH3a/2y4+zAEok8FCQxkPFOLus23JXpIikN OBdA== X-Gm-Message-State: AKaTC01epFVLkrbub4lvr6MKXjXQF4AXl1ZSvWTE2LtkP5BoQUDSd0LWcDwQU42Fol6e7TcX48G3BNBrzA+PaYlN X-Received: by 10.107.18.39 with SMTP id a39mr36356257ioj.45.1480704178734; Fri, 02 Dec 2016 10:42:58 -0800 (PST) MIME-Version: 1.0 Received: by 10.107.198.67 with HTTP; Fri, 2 Dec 2016 10:42:58 -0800 (PST) In-Reply-To: <20161202155637.GB27069@bivouac.eciton.net> References: <1480088538-21834-1-git-send-email-ard.biesheuvel@linaro.org> <1480088538-21834-4-git-send-email-ard.biesheuvel@linaro.org> <734D49CCEBEEF84792F5B80ED585239D58E8CFA1@SHSMSX104.ccr.corp.intel.com> <20161130140655.GG27069@bivouac.eciton.net> <734D49CCEBEEF84792F5B80ED585239D58E926A5@SHSMSX104.ccr.corp.intel.com> <20161202144050.GV27069@bivouac.eciton.net> <20161202155637.GB27069@bivouac.eciton.net> From: Ard Biesheuvel Date: Fri, 2 Dec 2016 18:42:58 +0000 Message-ID: To: Leif Lindholm Cc: "Ni, Ruiyu" , "Tian, Feng" , "edk2-devel@lists.01.org" , "afish@apple.com" , "Gao, Liming" , "Kinney, Michael D" Subject: Re: [PATCH v4 3/5] MdeModulePkg: implement generic PCI I/O driver for non-discoverable devices X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2016 18:42:59 -0000 Content-Type: text/plain; charset=UTF-8 On 2 December 2016 at 15:56, Leif Lindholm wrote: > On Fri, Dec 02, 2016 at 03:07:39PM +0000, Ard Biesheuvel wrote: >> >> > On 2 Dec 2016, at 14:40, Leif Lindholm wrote: >> > >> > On Fri, Dec 02, 2016 at 12:54:45PM +0000, Ni, Ruiyu wrote: >> >>>>> + Dev->ConfigSpace.Device.Bar[Idx] = (UINT32)Desc->AddrRangeMin; >> >>>>> + Dev->BarCount++; >> >>>>> + >> >>>>> + if (Desc->AddrSpaceGranularity == 64) { >> >>>>> + Dev->ConfigSpace.Device.Bar[Idx] |= 0x4; >> >>>>> + Dev->ConfigSpace.Device.Bar[++Idx] = (UINT32)(Desc->AddrRangeMin >> 32); >> >>>> >> >>>> 1. You need to use RShiftU64 otherwise the above code will break the >> >>>> build process in 32bit. >> >>> >> >>> I don't understand how that could cause breakage (and experiments with >> >>> IA32 and ARM fail to reproduce it with GCC). >> >>> >> >>> - Bar[x] is an array of UINT32 >> >>> - Desc->AddrRangeMin is UINT64 >> >>> - The result of (Desc->AddrRangeMin >> 32) is explicitly cast to >> >>> UINT32 (the type of Bar[x]). >> >>> >> >>> Is the problem related to the shift value being signed? >> >>> Or does some other compiler refuse to cast a UINT64 to a UINT32? >> >> >> >> Desc->AddrRangeMin is UINT64. My experience tells me that >> >> LShiftU64 or RShiftU64 should be used for shifting operation >> >> on UINT64. >> > >> > This is the bit that confuses me. There is no such automatic >> > limitation in the C language (if long long is supported at all). So if >> > we're working around some bug in a specific compiler - I am happy to >> > do so. I just prefer knowing why rather than cargo-culting. >> > >> >> I guess the GCC might cleverly recognizes the ">>32" actually >> >> picks up the high-32 bits so in 32bit environment, it just uses >> >> value of the register which stores the high-32 bits. >> >> Can you check the assembly code GCC generates? >> >> or simply can you change ">>32" to ">>31" or ">>33" to see >> >> what happens? >> > >> > There is no cleverness required, it's just a defined operation on a >> > base type. But sure, for example: >> > >> > unsigned int shift(long long val) >> > { >> > return (val >> 33); >> > } >> > >> > decodes as >> > --- >> > 0000000000000000 : >> > 0: 48 89 f8 mov %rdi,%rax >> > 3: 48 c1 f8 21 sar $0x21,%rax >> > 7: c3 retq >> > --- >> > (in x86_64, using gcc -O2 to get rid of some redundant stack >> > operations) >> > >> > The only thing that changes if the shift value is modified is that the >> > 0x21 is changed accordingly. >> > >> > Change it to an inline constant, not much changes: >> > >> > unsigned int shiftconst(void) >> > { >> > unsigned long long var = 0xaaaabbbbccccddddULL; >> > return (var >> 31); >> > } >> > >> > decodes as >> > --- >> > 0000000000000000 : >> > 0: 55 push %rbp >> > 1: 48 89 e5 mov %rsp,%rbp >> > 4: 48 b8 dd dd cc cc bb movabs $0xaaaabbbbccccdddd,%rax >> > b: bb aa aa >> > e: 48 89 45 f8 mov %rax,-0x8(%rbp) >> > 12: 48 8b 45 f8 mov -0x8(%rbp),%rax >> > 16: 48 c1 e8 1f shr $0x1f,%rax >> > 1a: 5d pop %rbp >> > 1b: c3 retq >> > --- >> > (with -O0, to prevent the compiler from just doing the arithmetic >> > compile time) >> > >> > Both also compile correctly for IA32 with -m32. >> > >> > The point is that there is nothing clever here for the compiler to do: >> > casting a 64-bit int to a 32-bit int is an operation that discards the >> > top 32 bits - but the behaviour is entirely defined. >> > >> > In fact, all architectures other than IA32/ARM use the Math64.c >> > implementation of RShiftU64, which simply does the shift. >> > And ARM does it only for GCC. I'm not sure why, I can't imagine a GCC >> > version that would do anything more complicated than >> > --- >> > 00000000 : >> > 0: e1a000c1 asr r0, r1, #1 >> > 4: e12fff1e bx lr >> > --- >> > (and that's with -march=armv4t onwards) >> >> That's a 32-bit operand. Shifting a 64-bit value will result in a >> call to a compiler intrinsic, which is kind of the point of Riuyu's >> remark > > No, it's not. > As stated, that is the output from compiling: > > unsigned int shift(long long val) > { > return (val >> 33); > } > > Unless you are claiming long long is 32-bit on ARM. > > Although I did mess up and make it signed :) > When fixed, the code generation is identical but with an lsr instead. > > The compiler just operates directly on the top 32-bits (in r1), since > the ones in r0 are irrelevant to the result. > OK, but that does mean it is performing optimization up to some point. So right shifts of 64-bit quantities by 32 bits or more will be 'optimized' by GCC into something that does not rely on the intrinsics. But this is not universally true for all toolchains (and variable shifts are probably treated differently as well) > Change the shift to 31 and the compiler doesn't do anything cute, so > the code is slightly more obvious: > --- > 00000000 : > 0: e1a00fa0 lsr r0, r0, #31 > 4: e1800081 orr r0, r0, r1, lsl #1 > 8: e12fff1e bx lr > --- > > But even if there were intrinsics generated, shouldn't we support the > intrinsic instead of massaging the C code and hope we can prevent it > from being generated. > This is exactly the issue at hand: EDK2 typically does *not* rely on such intrinsics, even if we were forced to add them for ARM because GCC cannot be instructed to inhibit such calls. Actually, I do not agree with the EDK2 position, i.e., I think there is no issue relying on routines that are typically provided by the C runtime. But this is MdeModulePkg, so it is not up to us.