From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Wu, Jiaxin" <jiaxin.wu@intel.com>
To: devel@edk2.groups.io
Cc: Ray Ni, Laszlo Ersek, Eric Dong, Zeng Star, Gerd Hoffmann, Rahul Kumar
Subject: [edk2-devel] [PATCH v1 2/2] UefiCpuPkg/PiSmmCpuDxeSmm: Make RunningApCount on exclusive cacheline
Date: Wed, 06 Mar 2024 04:11:13 -0800
Message-Id: <20240306121103.356-3-jiaxin.wu@intel.com>
In-Reply-To: <20240306121103.356-1-jiaxin.wu@intel.com>
References: <20240306121103.356-1-jiaxin.wu@intel.com>

In non-blocking mode during SmmMpBroadcastProcedure, multiple APs may
contend for access to RunningApCount in the PROCEDURE_TOKEN:

Step 1: RunningApCount is initialized to mMaxNumberOfCpus (see
        GetFreeToken).
Step 2: RunningApCount is decremented for each AP that is not present
        (see InterlockedDecrement in InternalSmmStartupAllAPs).
Step 3: Multiple APs contend to decrement RunningApCount in the same
        token (see ReleaseToken in APHandler).

So the contended-lock case arises in Step 3. Because multiple APs
access the shared memory (RunningApCount), it should be placed on an
exclusive cache line with the WB attribute for SMM performance tuning.

This patch places RunningApCount on an exclusive cache line.
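
For illustration only (not part of the patch): below is a minimal
user-space C11 sketch of the exclusive-cache-line idea, assuming a
64-byte line size. The EXCLUSIVE_LINE_TOKEN type and its field names
are hypothetical stand-ins for this example; the firmware code instead
derives the line size at runtime from GetSpinLockProperties () and uses
InterlockedDecrement rather than C11 atomics.

#include <stdalign.h>   /* alignas */
#include <stdatomic.h>  /* atomic_uint, atomic_fetch_sub, atomic_store */
#include <stddef.h>     /* offsetof */
#include <stdio.h>

#define CACHE_LINE_SIZE  64  /* assumed here; the real code queries the CPU */

/*
 * Give each contended variable a full cache line of its own. APs hammer
 * RunningApCount with locked decrements, so its line already ping-pongs
 * between cores; co-locating it with SpinLock (or with a neighboring
 * token) would add false-sharing traffic on top of that true contention.
 */
typedef struct {
  alignas (CACHE_LINE_SIZE) atomic_uint  SpinLock;        /* line 0 */
  alignas (CACHE_LINE_SIZE) atomic_uint  RunningApCount;  /* line 1 */
} EXCLUSIVE_LINE_TOKEN;

int
main (void)
{
  EXCLUSIVE_LINE_TOKEN  Token;

  atomic_init (&Token.SpinLock, 1);        /* 1 = held, as after AcquireSpinLock */
  atomic_init (&Token.RunningApCount, 4);  /* e.g. four APs run the procedure */

  /*
   * Single-threaded stand-in for the per-AP ReleaseToken path:
   * atomic_fetch_sub returns the old value, so "old == 1" is the same
   * condition as "InterlockedDecrement (...) == 0" in the patch.
   */
  for (unsigned int Ap = 0; Ap < 4; Ap++) {
    if (atomic_fetch_sub (&Token.RunningApCount, 1) == 1) {
      atomic_store (&Token.SpinLock, 0);   /* last AP releases the lock */
    }
  }

  /* Offsets 0 and 64 confirm each field owns an exclusive line. */
  printf ("SpinLock @ %zu, RunningApCount @ %zu\n",
          offsetof (EXCLUSIVE_LINE_TOKEN, SpinLock),
          offsetof (EXCLUSIVE_LINE_TOKEN, RunningApCount));
  return 0;
}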
Cc: Ray Ni
Cc: Laszlo Ersek
Cc: Eric Dong
Cc: Zeng Star
Cc: Gerd Hoffmann
Cc: Rahul Kumar
Signed-off-by: Jiaxin Wu
---
 UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c      | 35 +++++++++++++++++++-----------
 UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h |  2 +-
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
index 9790b4f888..05fa6854fe 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c
@@ -421,11 +421,11 @@ ReleaseToken (
 {
   PROCEDURE_TOKEN  *Token;

   Token = mSmmMpSyncData->CpuData[CpuIndex].Token;

-  if (InterlockedDecrement (&Token->RunningApCount) == 0) {
+  if (InterlockedDecrement (Token->RunningApCount) == 0) {
     ReleaseSpinLock (Token->SpinLock);
   }

   mSmmMpSyncData->CpuData[CpuIndex].Token = NULL;
 }
@@ -970,12 +970,12 @@ AllocateTokenBuffer (
   )
 {
   UINTN            SpinLockSize;
   UINT32           TokenCountPerChunk;
   UINTN            Index;
-  SPIN_LOCK        *SpinLock;
-  UINT8            *SpinLockBuffer;
+  UINTN            BufferAddr;
+  VOID             *Buffer;
   PROCEDURE_TOKEN  *ProcTokens;

   SpinLockSize       = GetSpinLockProperties ();
   TokenCountPerChunk = FixedPcdGet32 (PcdCpuSmmMpTokenCountPerChunk);

@@ -986,25 +986,34 @@ AllocateTokenBuffer (
   }

   DEBUG ((DEBUG_INFO, "CpuSmm: SpinLock Size = 0x%x, PcdCpuSmmMpTokenCountPerChunk = 0x%x\n", SpinLockSize, TokenCountPerChunk));

   //
-  // Separate the Spin_lock and Proc_token because the alignment requires by Spin_Lock.
+  // Allocate the buffer for SpinLock and RunningApCount to meet the alignment requirement.
   //
-  SpinLockBuffer = AllocatePool (SpinLockSize * TokenCountPerChunk);
-  ASSERT (SpinLockBuffer != NULL);
+  Buffer = AllocatePages (EFI_SIZE_TO_PAGES (SpinLockSize * TokenCountPerChunk * 2));
+  if (Buffer == NULL) {
+    DEBUG ((DEBUG_ERROR, "AllocateTokenBuffer: Failed to allocate the buffer for SpinLock and RunningApCount!\n"));
+    CpuDeadLoop ();
+  }

   ProcTokens = AllocatePool (sizeof (PROCEDURE_TOKEN) * TokenCountPerChunk);
   ASSERT (ProcTokens != NULL);

+  BufferAddr = (UINTN)Buffer;
   for (Index = 0; Index < TokenCountPerChunk; Index++) {
-    SpinLock = (SPIN_LOCK *)(SpinLockBuffer + SpinLockSize * Index);
-    InitializeSpinLock (SpinLock);
+    ProcTokens[Index].Signature = PROCEDURE_TOKEN_SIGNATURE;
+
+    ProcTokens[Index].SpinLock = (SPIN_LOCK *)BufferAddr;
+    InitializeSpinLock (ProcTokens[Index].SpinLock);
+
+    BufferAddr += SpinLockSize;
+
+    ProcTokens[Index].RunningApCount  = (volatile UINT32 *)BufferAddr;
+    *ProcTokens[Index].RunningApCount = 0;

-    ProcTokens[Index].Signature      = PROCEDURE_TOKEN_SIGNATURE;
-    ProcTokens[Index].SpinLock       = SpinLock;
-    ProcTokens[Index].RunningApCount = 0;
+    BufferAddr += SpinLockSize;

     InsertTailList (&gSmmCpuPrivate->TokenList, &ProcTokens[Index].Link);
   }

   return &ProcTokens[0].Link;
@@ -1036,11 +1045,11 @@ GetFreeToken (
   }

   NewToken = PROCEDURE_TOKEN_FROM_LINK (gSmmCpuPrivate->FirstFreeToken);

   gSmmCpuPrivate->FirstFreeToken = GetNextNode (&gSmmCpuPrivate->TokenList, gSmmCpuPrivate->FirstFreeToken);

-  NewToken->RunningApCount = RunningApsCount;
+  *NewToken->RunningApCount = RunningApsCount;
   AcquireSpinLock (NewToken->SpinLock);

   return NewToken;
 }
@@ -1298,11 +1307,11 @@ InternalSmmStartupAllAPs (
       //
       // Decrease the count to mark this processor(AP or BSP) as finished.
      //
       if (ProcToken != NULL) {
-        InterlockedDecrement (&ProcToken->RunningApCount);
+        InterlockedDecrement (ProcToken->RunningApCount);
       }
     }
   }

   ReleaseAllAPs ();

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
index 7f244ea803..07473208fd 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h
@@ -213,11 +213,11 @@ typedef struct {
 typedef struct {
   UINTN            Signature;
   LIST_ENTRY       Link;

   SPIN_LOCK        *SpinLock;
-  volatile UINT32  RunningApCount;
+  volatile UINT32  *RunningApCount;
 } PROCEDURE_TOKEN;

 #define PROCEDURE_TOKEN_FROM_LINK(a)  CR (a, PROCEDURE_TOKEN, Link, PROCEDURE_TOKEN_SIGNATURE)

 #define TOKEN_BUFFER_SIGNATURE  SIGNATURE_32 ('T', 'K', 'B', 'S')
-- 
2.16.2.windows.1