From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by mx.groups.io with SMTP id smtpd.web12.5725.1612432274334667139 for ; Thu, 04 Feb 2021 01:51:14 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=SihvlNGk; spf=pass (domain: redhat.com, ip: 216.205.24.124, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612432273; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=c6vtTDYHzuvxhJDCXhd7R30I6fk5EMva580YWwhF/B8=; b=SihvlNGkanYxOnr2FE2AgnhjeyQtMAaVfiaod3CzCTvhfxUmRuOClG39SaC4+bEMdf7uSI Rusz8bTKF1r41ncCZM4Him3ek1DTcuyjaPGoWFSvJkiSgeLbB5Ke6HayRuP8bQ9bwpkCkV UPDcFoZnT2CJiBIGb1TvFwyT1/wcv/k= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-151-hUSrvIuENZiHk9PC9-btfA-1; Thu, 04 Feb 2021 04:51:11 -0500 X-MC-Unique: hUSrvIuENZiHk9PC9-btfA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6F633107ACE8; Thu, 4 Feb 2021 09:51:10 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-114-169.ams2.redhat.com [10.36.114.169]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4166F194A4; Thu, 4 Feb 2021 09:51:09 +0000 (UTC) Subject: Re: [edk2-devel] [PATCH 2/2] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release To: devel@edk2.groups.io, ray.ni@intel.com Cc: Eric Dong , Rahul1 Kumar References: <20210204025921.1428-1-ray.ni@intel.com> <20210204025921.1428-3-ray.ni@intel.com> From: "Laszlo Ersek" Message-ID: <2afe6a89-5469-7576-08e3-5ee60eb9e754@redhat.com> Date: Thu, 4 Feb 2021 10:51:08 +0100 MIME-Version: 1.0 In-Reply-To: <20210204025921.1428-3-ray.ni@intel.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=lersek@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit On 02/04/21 03:59, Ni, Ray wrote: > When AP firstly wakes up, MpFuncs.nasm contains below logic to assign > an unique ApIndex to each AP according to who comes first: > ---ASM--- > TestLock: > xchg [edi], eax > cmp eax, NotVacantFlag > jz TestLock > > mov ecx, esi > add ecx, ApIndexLocation > inc dword [ecx] > mov ebx, [ecx] > > Releaselock: > mov eax, VacantFlag > xchg [edi], eax > ---ASM END--- > > "lock inc" cannot be used to increase ApIndex because not only the > global ApIndex should be increased, but also the result should be > stored to a local general purpose register EBX. > > This patch learns from the NASM implementation of > InternalSyncIncrement() to use "XADD" instruction which can increase > the global ApIndex and store the original ApIndex to EBX in one > instruction. > > With this patch, OVMF when running in a 255 threads QEMU spends about > one second to wakeup all APs. Original implementation needs more than > 10 seconds. > > Signed-off-by: Ray Ni > Cc: Laszlo Ersek > Cc: Eric Dong > Cc: Rahul1 Kumar > --- > UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc | 4 ---- > .../Library/MpInitLib/Ia32/MpFuncs.nasm | 21 +++++-------------- > UefiCpuPkg/Library/MpInitLib/MpLib.c | 3 +-- > UefiCpuPkg/Library/MpInitLib/MpLib.h | 3 +-- > UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc | 4 ---- > UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++------------ > 6 files changed, 11 insertions(+), 42 deletions(-) > > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc > index 244c1e72b7..5f1f0351d2 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc > @@ -12,16 +12,12 @@ > ; > ;------------------------------------------------------------------------------- > > -VacantFlag equ 00h > -NotVacantFlag equ 0ffh > - > CPU_SWITCH_STATE_IDLE equ 0 > CPU_SWITCH_STATE_STORED equ 1 > CPU_SWITCH_STATE_LOADED equ 2 > > MP_CPU_EXCHANGE_INFO_OFFSET equ (SwitchToRealProcEnd - RendezvousFunnelProcStart) > struc MP_CPU_EXCHANGE_INFO > - .Lock: resd 1 > .StackStart: resd 1 > .StackSize: resd 1 > .CFunction: resd 1 > diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > index 908c2eb447..b7267393db 100644 > --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm > @@ -122,23 +122,12 @@ SkipEnableExecuteDisable: > > ; AP init > mov edi, esi > - add edi, MP_CPU_EXCHANGE_INFO_OFFSET > - mov eax, NotVacantFlag > - > -TestLock: > - xchg [edi], eax > - cmp eax, NotVacantFlag > - jz TestLock > - > - mov ecx, esi > - add ecx, MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.ApIndex > - inc dword [ecx] > - mov ebx, [ecx] > - > -Releaselock: > - mov eax, VacantFlag > - xchg [edi], eax > + add edi, MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.ApIndex > + mov ebx, 1 > + lock xadd [edi], ebx ; EBX = ApIndex++ (1) For more clarity, I would suggest lock xadd dword [edi], ebx (even though "ebx" already specifies the width, yes) Applies to the X64 version too. > + inc ebx ; EBX is CpuNumber > > + ; program stack (2) This change belongs in a separate patch. The X64 version already has this comment, and making both versions makes sense. But touching anything "stack-related" in the current patch is very confusing to me. Thanks Laszlo > mov edi, esi > add edi, MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.StackSize > mov eax, [edi] > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c > index 8b1f7f84ba..32a3951742 100644 > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c > @@ -1,7 +1,7 @@ > /** @file > CPU MP Initialize Library common functions. > > - Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.
> + Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.
> Copyright (c) 2020, AMD Inc. All rights reserved.
> > SPDX-License-Identifier: BSD-2-Clause-Patent > @@ -1012,7 +1012,6 @@ FillExchangeInfoData ( > IA32_CR4 Cr4; > > ExchangeInfo = CpuMpData->MpCpuExchangeInfo; > - ExchangeInfo->Lock = 0; > ExchangeInfo->StackStart = CpuMpData->Buffer; > ExchangeInfo->StackSize = CpuMpData->CpuApStackSize; > ExchangeInfo->BufferStart = CpuMpData->WakeupBuffer; > diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h > index 02652eaae1..0bd60388b1 100644 > --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h > +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h > @@ -1,7 +1,7 @@ > /** @file > Common header file for MP Initialize Library. > > - Copyright (c) 2016 - 2020, Intel Corporation. All rights reserved.
> + Copyright (c) 2016 - 2021, Intel Corporation. All rights reserved.
> Copyright (c) 2020, AMD Inc. All rights reserved.
> > SPDX-License-Identifier: BSD-2-Clause-Patent > @@ -190,7 +190,6 @@ typedef struct _CPU_MP_DATA CPU_MP_DATA; > // into this structure are used in assembly code in this module > // > typedef struct { > - UINTN Lock; > UINTN StackStart; > UINTN StackSize; > UINTN CFunction; > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc > index 3974330991..32e9a262bc 100644 > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc > @@ -12,16 +12,12 @@ > ; > ;------------------------------------------------------------------------------- > > -VacantFlag equ 00h > -NotVacantFlag equ 0ffh > - > CPU_SWITCH_STATE_IDLE equ 0 > CPU_SWITCH_STATE_STORED equ 1 > CPU_SWITCH_STATE_LOADED equ 2 > > MP_CPU_EXCHANGE_INFO_OFFSET equ (SwitchToRealProcEnd - RendezvousFunnelProcStart) > struc MP_CPU_EXCHANGE_INFO > - .Lock: resq 1 > .StackStart: resq 1 > .StackSize: resq 1 > .CFunction: resq 1 > diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > index 423beb2cca..383b4974f8 100644 > --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm > @@ -158,21 +158,11 @@ LongModeStart: > > ; AP init > mov edi, esi > - add edi, MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.Lock > - mov rax, NotVacantFlag > + add edi, MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.ApIndex > + mov ebx, 1 > + lock xadd [edi], ebx ; EBX = ApIndex++ > + inc ebx ; EBX is CpuNumber > > -TestLock: > - xchg qword [edi], rax > - cmp rax, NotVacantFlag > - jz TestLock > - > - lea ecx, [esi + MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.ApIndex] > - inc dword [ecx] > - mov ebx, [ecx] > - > -Releaselock: > - mov rax, VacantFlag > - xchg qword [edi], rax > ; program stack > mov edi, esi > add edi, MP_CPU_EXCHANGE_INFO_OFFSET + MP_CPU_EXCHANGE_INFO.StackSize >