From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: hpe.com, ip: 148.163.147.86, mailfrom: brian.johnson@hpe.com) Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86]) by groups.io with SMTP; Fri, 26 Apr 2019 14:08:06 -0700 Received: from pps.filterd (m0134420.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x3QL6aGZ012355; Fri, 26 Apr 2019 21:08:06 GMT Received: from g9t5009.houston.hpe.com (g9t5009.houston.hpe.com [15.241.48.73]) by mx0b-002e3701.pphosted.com with ESMTP id 2s46w4h8dc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 26 Apr 2019 21:08:06 +0000 Received: from g9t2301.houston.hpecorp.net (g9t2301.houston.hpecorp.net [16.220.97.129]) by g9t5009.houston.hpe.com (Postfix) with ESMTP id 305C855; Fri, 26 Apr 2019 21:08:05 +0000 (UTC) Received: from [10.33.152.19] (bjj-laptop2.americas.hpqcorp.net [10.33.152.19]) by g9t2301.houston.hpecorp.net (Postfix) with ESMTP id DD3D14F; Fri, 26 Apr 2019 21:08:04 +0000 (UTC) Subject: Re: [edk2-devel] [Patch 1/4] MdePkg/BaseLib: Verify SSE2 support in IA32 AsmLfence() To: devel@edk2.groups.io, lersek@redhat.com, michael.d.kinney@intel.com Cc: Liming Gao References: <20190425175334.5944-1-michael.d.kinney@intel.com> <20190425175334.5944-2-michael.d.kinney@intel.com> From: "Brian J. Johnson" Message-ID: <7b0ca337-3642-2070-0fff-515704a24bdf@hpe.com> Date: Fri, 26 Apr 2019 16:08:04 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-04-26_14:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1904260139 Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit On 4/26/19 3:27 PM, Laszlo Ersek wrote: > On 04/25/19 19:53, Michael D Kinney wrote: >> Use CPUID in IA32 implementation of AsmLfence() to verify >> that SSE2 is supported before using the LFENCE instruction. >> >> Cc: Liming Gao >> Signed-off-by: Michael D Kinney >> --- >> MdePkg/Library/BaseLib/Ia32/Lfence.nasm | 14 +++++++++++++- >> 1 file changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm >> index 44478be35f..0a60ae1d57 100644 >> --- a/MdePkg/Library/BaseLib/Ia32/Lfence.nasm >> +++ b/MdePkg/Library/BaseLib/Ia32/Lfence.nasm >> @@ -1,5 +1,5 @@ >> ;------------------------------------------------------------------------------ ; >> -; Copyright (c) 2018, Intel Corporation. All rights reserved.
>> +; Copyright (c) 2018 - 2019, Intel Corporation. All rights reserved.
>> ; SPDX-License-Identifier: BSD-2-Clause-Patent >> ; >> ; Module Name: >> @@ -26,5 +26,17 @@ >> ;------------------------------------------------------------------------------ >> global ASM_PFX(AsmLfence) >> ASM_PFX(AsmLfence): >> + ; >> + ; Use CPUID instruction (CPUID.01H:EDX.SSE2[bit 26] = 1) to test whether the >> + ; processor supports SSE2 instruction. Save/restore non-volatile register >> + ; EBX that is modified by CPUID >> + ; >> + push ebx >> + mov eax, 1 >> + cpuid >> + bt edx, 26 >> + jnc Done >> lfence >> +Done: >> + pop ebx >> ret >> > > The SDM seems to confirm that lfence depends on SSE2. > > However, that raises another question: > > AsmLfence() is used for implementing SpeculationBarrier() on IA32/X64. > And so I wonder: the plaforms where lfence is unavailable, do they *not* > need a speculation barrier at all, or do they need a *different* > implementation? > > Thanks, > Laszlo Several issues: This patch introduces stronger fencing than is required. The SDM says, "Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions." (vol 3a, sec 8.2.2.) The cpuid instruction is a "serializing instruction" (sec 8.3). So the cpuid is essentially a load fence plus a store fence (plus other functionality, such as draining the memory queues.) That makes the lfence following it redundant. Cpuid is a fairly heavyweight instruction due to its serializing behavior. It provides the load fencing semantics of lfence, but can introduce a significant performance hit if it's called often. Lfence is a lot lighter weight instruction. So using cpuid in AsmLfence may make it a lot slower than the caller expects. Also, skipping a fencing instruction if it's not supported doesn't seem like the right approach in any case. The caller expects AsmLfence to provide certain fencing semantics... the implementation isn't free to just do nothing (unless it's on an architecture where load fencing is not required, since memory is always ordered.) Plus, the routine is called "AsmLfence", so I'd expect it to always use lfence, and cause an exception if the instruction isn't implemented. I'd think the callers should be modified to call AsmLfence or routines using other instructions (such as cpuid) to provide fencing according to the CPU architecture they are on. Is there a way to make this a compile-time decision? I haven't tried to track down all the callers of AsmLfence.... Thanks, -- Brian J. Johnson Enterprise X86 Lab Hewlett Packard Enterprise