From mboxrd@z Thu Jan 1 00:00:00 1970
Sender: afish@apple.com
From: Andrew Fish
Message-id: <753C4238-14C6-4839-B05B-0EB6BBFC2644@apple.com>
Date: Wed, 16 Nov 2016 11:40:09 -0800
To: Mike Kinney
Cc: Laszlo Ersek, "Fan, Jeff", "edk2-devel@ml01.01.org", Paolo Bonzini, "Tian, Feng", "Yao, Jiewen"
References: <20161115140809.5292-1-jeff.fan@intel.com> <605a5187-5b91-e0f8-5560-8a02ddad5057@redhat.com>
X-Mailer: Apple Mail (2.3226)
MIME-version: 1.0
Subject: Re: [PATCH v2 0/2] Add volatile for mNumberToFinish
Content-Type: text/plain; CHARSET=US-ASCII
Content-Transfer-Encoding: 7BIT

> On Nov 16, 2016, at 10:18 AM, Kinney, Michael D wrote:
>
> Laszlo,
>
> Thanks for the details from the ANSI C spec.
>
> For this compiler issue, are there more details on the
> assembly code generated by the GCC 5.4 compiler in the
> failing mode?
>
> I also see Liming's observation that the internal
> implementation of the SynchronizationLib adds volatile
> in some internal worker functions. Other implementation
> artifacts include the use of a read/write memory barrier
> to prevent the optimizing compiler from optimizing
> across a boundary. These read/write barriers are used
> in the spin lock functions, but not the Interlocked*()
> functions.
>
> I want to make sure we have studied the code generation
> to see if the issue is related to volatile or a read/write
> memory barrier. It could be that adding volatile forced the
> compiler to internally add a read/write barrier.
>

Mike,

I don't think volatile variables imply memory barriers; they are different
constraints placed on the compiler.

The volatile qualifier forces memory operations to always happen to memory,
so the compiler cannot optimize out a read or write; if two values are
written in a row, the first write cannot be optimized away as it otherwise
could. The compiler also cannot cache the variable in a register, or assume
it is a constant (the last value written, etc.).

A memory barrier (also called a fence) just forces the compiler to complete
the load/store operations before the barrier before it can perform
operations after the barrier. Plenty of optimization and reordering can
still happen between the barriers. You are still running the optimizer, so
code can get optimized away and reordered; only the scope of how much code
can be optimized has changed.

Thus volatile is a constraint placed on an individual variable, and a
barrier is a constraint placed on how big a chunk of code can be optimized.
I guess if you placed a barrier between every access to a variable you
could simulate volatile behavior, but with a lot more side effects (less
optimization than just using volatile).

Here is a simple example:
a) Nothing()    writes 3 to 0x12345678 and returns constant 3
b) Barrier()    writes 2 to 0x12345678, then writes 3 and returns constant 3
c) Volatile()   writes 1 to 0x12345678, writes 2, writes 3, then reads 3 from memory to return it
d) AllBarrier() writes 1 to 0x12345678, writes 2, writes 3, then reads 3 from memory to return it
$ cat barriers.c
#define _ReadWriteBarrier() do { __asm__ __volatile__ ("": : : "memory"); } while(0)

int
Nothing ()
{
  int *ptr = (int *)0x12345678;
  *ptr = 1;
  *ptr = 2;
  *ptr = 3;
  return *ptr;
}

int
Barrier ()
{
  int *ptr = (int *)0x12345678;
  *ptr = 1;
  *ptr = 2;
  _ReadWriteBarrier();
  *ptr = 3;
  return *ptr;
}

int
Volatile ()
{
  volatile int *ptr = (int *)0x12345678;
  *ptr = 1;
  *ptr = 2;
  *ptr = 3;
  return *ptr;
}

int
AllBarrier ()
{
  int *ptr = (int *)0x12345678;
  *ptr = 1;
  _ReadWriteBarrier();
  *ptr = 2;
  _ReadWriteBarrier();
  *ptr = 3;
  _ReadWriteBarrier();
  return *ptr;
}

$ clang -Os -S barriers.c
$ cat barriers.S
        .globl  _Nothing
_Nothing:                               ## @Nothing
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $3, 305419896
        movl    $3, %eax
        popq    %rbp
        retq

        .globl  _Barrier
_Barrier:                               ## @Barrier
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $2, 305419896
        ## InlineAsm Start
        ## InlineAsm End
        movl    $3, 305419896
        movl    $3, %eax
        popq    %rbp
        retq

        .globl  _Volatile
_Volatile:                              ## @Volatile
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $1, 305419896
        movl    $2, 305419896
        movl    $3, 305419896
        movl    305419896, %eax
        popq    %rbp
        retq

_AllBarrier:                            ## @AllBarrier
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $1, 305419896
        ## InlineAsm Start
        ## InlineAsm End
        movl    $2, 305419896
        ## InlineAsm Start
        ## InlineAsm End
        movl    $3, 305419896
        ## InlineAsm Start
        ## InlineAsm End
        movl    305419896, %eax
        popq    %rbp
        retq

Thanks,

Andrew Fish

PS These are just X86 examples. On an architecture that requires
synchronizing instructions, things are a little more complex and there
could be more differences between volatile and barriers. But this topic
was complex enough ....

> Also, given the implementation I see in the SynchronizationLib
> I am not sure the cast to (UINT32 *) is required in the
> proposed patch. Do we get compiler warnings/errors if those
> casts are not included?
>
> The second topic is what the SynchronizationLib API interfaces
> should have been from the beginning. In retrospect, I think
> they should have been defined with volatile pointers. The spin
> lock APIs do use volatile pointers, but that is embedded in the
> typedef for SPIN_LOCK, so it is not as obvious.
>
> Thanks,
>
> Mike
>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:lersek@redhat.com]
>> Sent: Tuesday, November 15, 2016 8:10 AM
>> To: Fan, Jeff; edk2-devel@ml01.01.org
>> Cc: Paolo Bonzini; Yao, Jiewen; Tian, Feng; Kinney, Michael D
>> Subject: Re: [PATCH v2 0/2] Add volatile for mNumberToFinish
>>
>> Jeff,
>>
>> On 11/15/16 15:08, Jeff Fan wrote:
>>> v2:
>>>   Add patch #1 per Laszlo's comments
>>>   at https://lists.01.org/pipermail/edk2-devel/2016-November/004697.html
>>>
>>>   Regarding the comment to update SynchronizationLib to add volatile for
>>>   the input parameter, I will send another series of patches.
>>>
>>> Cc: Paolo Bonzini
>>> Cc: Laszlo Ersek
>>> Cc: Jiewen Yao
>>> Cc: Feng Tian
>>> Cc: Michael D Kinney
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Jeff Fan
>>>
>>> Jeff Fan (2):
>>>   UefiCpuPkg/PiSmmCpuDxeSmm: Add volatile for parameter NumberToFinish
>>>   UefiCpuPkg/PiSmmCpuDxeSmm: Add volatile for mNumberToFinish
>>>
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/CpuS3.c             | 4 ++--
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmFuncsArch.c | 4 ++--
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h    | 2 +-
>>>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c  | 2 +-
>>>  4 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>
>> If you want to keep GCC 5.4 from optimizing away the access, the
>> synchronization object itself, and all pointers to it, must remain
>> volatile.
>> Wherever you cast away the volatile qualifier, for example in
>> a function call, GCC can break code on the next level, even if you don't
>> actually access the object through that pointer (i.e., if you cast the
>> pointer back to volatile just in time for the access).
>>
>> So, the safe way to go about this is to change function prototypes from
>> callee towards callers -- first change the callee (because both volatile
>> and non-volatile can be accepted as volatile), then change the caller
>> (make sure what you pass in is volatile, and propagate it outwards).
>>
>> It is also okay to convert the original volatile pointer to UINTN, and
>> to pass it to assembly code like that, or to convert it back to a
>> volatile pointer from UINTN before use.
>>
>> From the C99 standard:
>>
>> 6.3 Conversions
>> 6.3.2.3 Pointers
>>
>> 2 For any qualifier q, a pointer to a non-q-qualified type may be
>>   converted to a pointer to the q-qualified version of the type; the
>>   values stored in the original and converted pointers shall compare
>>   equal.
>>
>> 5 An integer may be converted to any pointer type. Except as
>>   previously specified, the result is implementation-defined, might
>>   not be correctly aligned, might not point to an entity of the
>>   referenced type, and might be a trap representation.
>>
>> 6 Any pointer type may be converted to an integer type. Except as
>>   previously specified, the result is implementation-defined. If the
>>   result cannot be represented in the integer type, the behavior is
>>   undefined. The result need not be in the range of values of any
>>   integer type.
>>
>> 6.7.3 Type qualifiers, paragraph 5:
>>
>>   If an attempt is made to modify an object defined with a
>>   const-qualified type through use of an lvalue with
>>   non-const-qualified type, the behavior is undefined. If an attempt
>>   is made to refer to an object defined with a volatile-qualified
>>   type through use of an lvalue with non-volatile-qualified type, the
>>   behavior is undefined.
>>
>> In summary:
>>
>> - casting away "volatile" even just temporarily (without actual
>>   accesses) may give gcc license to break the code (6.3.2.3 p2)
>>
>> - accessing without volatile is known to break the code (6.7.3 p5)
>>
>> - you can always cast from non-volatile to volatile (6.3.2.3 p2),
>>   but not the other way around!
>>
>> - you can cast from (volatile VOID *) to UINTN and back as much as
>>   you want (6.3.2.3 p5 p6), because our execution environment makes
>>   that safe ("implementation-defined")
>>
>> We might want to play loose with 6.3.2.3 p2 -- that is, cast away
>> volatile from the pointer only temporarily, and cast it back just
>> before accessing the object through the pointer --, but that could be
>> unsafe in the long term. The *really* safe method is to cast it to
>> UINTN, and then back the same way.
>>
>> Yes, this would affect functions like SwitchStack() too -- the Context1
>> and Context2 parameters would have to change their types to UINTN.
>>
>> I think what we should discuss at this point is whether we'd like to
>> care about 6.3.2.3 p2; that is, whether we consider casting away
>> volatile temporarily.
>>
>> The direction I've been experiencing with GCC is that its optimization
>> methods are becoming more aggressive. For some optimizations, there are
>> flags that disable them; I'm not sure they provide a specific flag for
>> preventing GCC from exploiting violations of 6.3.2.3 p2.
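
For illustration, here is a minimal plain-C sketch of the UINTN round trip
described above. It is not code from the EDK II tree; the names MyCounter,
WaitForCounter(), and StartCheckAps() are made up for this example, and the
UINTN/UINT32 typedefs stand in for the EDK II base types:

#include <stdint.h>

typedef uintptr_t UINTN;    /* stand-in for the EDK II UINTN  */
typedef uint32_t  UINT32;   /* stand-in for the EDK II UINT32 */

volatile UINT32  MyCounter; /* the synchronization object itself stays
                               volatile (hypothetical name)             */

/* Callee changed first: it receives a UINTN and converts it back to a
   volatile pointer, so every access goes through a volatile-qualified
   lvalue (6.7.3 p5).                                                   */
void
WaitForCounter (
  UINTN  CounterAddress
  )
{
  volatile UINT32  *Counter;

  Counter = (volatile UINT32 *)CounterAddress;   /* 6.3.2.3 p5 */
  while (*Counter != 0) {
    /* spin; the read cannot be hoisted out of the loop */
  }
}

/* Caller changed second: the volatile pointer is converted to UINTN
   (6.3.2.3 p6); the volatile qualifier is never cast away on a pointer. */
void
StartCheckAps (
  void
  )
{
  WaitForCounter ((UINTN)&MyCounter);
}

Only the opaque UINTN travels through the interface, while the object and
every pointer actually used to access it remain volatile.
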
>>
>> Thanks
>> Laszlo
>
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel