From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by mx.groups.io with SMTP id smtpd.web12.6051.1575531832435897090 for ; Wed, 04 Dec 2019 23:43:52 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: intel.com, ip: 134.134.136.100, mailfrom: ray.ni@intel.com) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 04 Dec 2019 23:43:51 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,280,1571727600"; d="scan'208";a="236584091" Received: from fmsmsx105.amr.corp.intel.com ([10.18.124.203]) by fmsmga004.fm.intel.com with ESMTP; 04 Dec 2019 23:43:51 -0800 Received: from fmsmsx602.amr.corp.intel.com (10.18.126.82) by FMSMSX105.amr.corp.intel.com (10.18.124.203) with Microsoft SMTP Server (TLS) id 14.3.439.0; Wed, 4 Dec 2019 23:43:51 -0800 Received: from fmsmsx602.amr.corp.intel.com (10.18.126.82) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Wed, 4 Dec 2019 23:43:50 -0800 Received: from shsmsx105.ccr.corp.intel.com (10.239.4.158) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.1713.5 via Frontend Transport; Wed, 4 Dec 2019 23:43:50 -0800 Received: from shsmsx104.ccr.corp.intel.com ([169.254.5.90]) by SHSMSX105.ccr.corp.intel.com ([169.254.11.72]) with mapi id 14.03.0439.000; Thu, 5 Dec 2019 15:43:48 +0800 From: "Ni, Ray" To: "Dong, Eric" , "devel@edk2.groups.io" CC: Laszlo Ersek Subject: Re: [PATCH v5] UefiCpuPkg/PiSmmCpuDxeSmm: Avoid allocate Token every time Thread-Topic: [PATCH v5] UefiCpuPkg/PiSmmCpuDxeSmm: Avoid allocate Token every time Thread-Index: AQHVqzrCivFQVHBdoUKAuPJo49QGLKerKFRA Date: Thu, 5 Dec 2019 07:43:47 +0000 Deferred-Delivery: Thu, 5 Dec 2019 07:42:00 +0000 Message-ID: <734D49CCEBEEF84792F5B80ED585239D5C391550@SHSMSX104.ccr.corp.intel.com> References: <20191205070805.1951-1-eric.dong@intel.com> In-Reply-To: <20191205070805.1951-1-eric.dong@intel.com> Accept-Language: en-US, zh-CN X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] MIME-Version: 1.0 Return-Path: ray.ni@intel.com Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Reviewed-by: Ray Ni > -----Original Message----- > From: Dong, Eric > Sent: Thursday, December 5, 2019 3:08 PM > To: devel@edk2.groups.io > Cc: Ni, Ray ; Laszlo Ersek > Subject: [PATCH v5] UefiCpuPkg/PiSmmCpuDxeSmm: Avoid allocate Token > every time >=20 > REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3D2388 >=20 > Token is new introduced by MM MP Protocol. Current logic allocate Token > every time when need to use it. The logic caused SMI latency raised to > very high. Update logic to allocate Token buffer at driver's entry point. > Later use the token from the allocated token buffer. Only when all the > buffer have been used, then need to allocate new buffer. >=20 > Signed-off-by: Eric Dong > Cc: Ray Ni > Cc: Laszlo Ersek > --- >=20 > V5 changes: >=20 > Refine PCD names and some code comments. >=20 > V4 changes: >=20 > Specify PCD type to FixedPcd in code. >=20 > V3 changes: >=20 > Introduce PCD to control the pre allocated toke buffer size. >=20 > v2 changes: >=20 > Minor update based on comments. >=20 >=20 > UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c | 67 > ++++++++++++++++++-- > UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h | 15 +++++ > UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf | 3 + > UefiCpuPkg/UefiCpuPkg.dec | 4 ++ > UefiCpuPkg/UefiCpuPkg.uni | 3 + > 5 files changed, 88 insertions(+), 4 deletions(-) >=20 > diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c > b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c > index 0685637c2b..757f1056f7 100644 > --- a/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c > +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/MpService.c > @@ -492,6 +492,24 @@ FreeTokens ( > { >=20 > LIST_ENTRY *Link; >=20 > PROCEDURE_TOKEN *ProcToken; >=20 > + TOKEN_BUFFER *TokenBuf; >=20 > + >=20 > + // >=20 > + // Only free the token buffer recorded in the OldTOkenBufList >=20 > + // upon exiting SMI. Current token buffer stays allocated so >=20 > + // next SMI doesn't need to re-allocate. >=20 > + // >=20 > + gSmmCpuPrivate->UsedTokenNum =3D 0; >=20 > + >=20 > + Link =3D GetFirstNode (&gSmmCpuPrivate->OldTokenBufList); >=20 > + while (!IsNull (&gSmmCpuPrivate->OldTokenBufList, Link)) { >=20 > + TokenBuf =3D TOKEN_BUFFER_FROM_LINK (Link); >=20 > + >=20 > + Link =3D RemoveEntryList (&TokenBuf->Link); >=20 > + >=20 > + FreePool (TokenBuf->Buffer); >=20 > + FreePool (TokenBuf); >=20 > + } >=20 >=20 >=20 > while (!IsListEmpty (&gSmmCpuPrivate->TokenList)) { >=20 > Link =3D GetFirstNode (&gSmmCpuPrivate->TokenList); >=20 > @@ -499,7 +517,6 @@ FreeTokens ( >=20 >=20 > RemoveEntryList (&ProcToken->Link); >=20 >=20 >=20 > - FreePool ((VOID *)ProcToken->ProcedureToken); >=20 > FreePool (ProcToken); >=20 > } >=20 > } >=20 > @@ -1115,13 +1132,37 @@ CreateToken ( > VOID >=20 > ) >=20 > { >=20 > - PROCEDURE_TOKEN *ProcToken; >=20 > + PROCEDURE_TOKEN *ProcToken; >=20 > SPIN_LOCK *CpuToken; >=20 > UINTN SpinLockSize; >=20 > + TOKEN_BUFFER *TokenBuf; >=20 > + UINT32 TokenCountPerChunk; >=20 >=20 >=20 > SpinLockSize =3D GetSpinLockProperties (); >=20 > - CpuToken =3D AllocatePool (SpinLockSize); >=20 > - ASSERT (CpuToken !=3D NULL); >=20 > + TokenCountPerChunk =3D FixedPcdGet32 > (PcdCpuSmmMpTokenCountPerChunk); >=20 > + >=20 > + if (gSmmCpuPrivate->UsedTokenNum =3D=3D TokenCountPerChunk) { >=20 > + DEBUG ((DEBUG_VERBOSE, "CpuSmm: No free token buffer, allocate > new buffer!\n")); >=20 > + >=20 > + // >=20 > + // Record current token buffer for later free action usage. >=20 > + // Current used token buffer not in this list. >=20 > + // >=20 > + TokenBuf =3D AllocatePool (sizeof (TOKEN_BUFFER)); >=20 > + ASSERT (TokenBuf !=3D NULL); >=20 > + TokenBuf->Signature =3D TOKEN_BUFFER_SIGNATURE; >=20 > + TokenBuf->Buffer =3D gSmmCpuPrivate->CurrentTokenBuf; >=20 > + >=20 > + InsertTailList (&gSmmCpuPrivate->OldTokenBufList, &TokenBuf->Link); >=20 > + >=20 > + gSmmCpuPrivate->CurrentTokenBuf =3D AllocatePool (SpinLockSize * > TokenCountPerChunk); >=20 > + ASSERT (gSmmCpuPrivate->CurrentTokenBuf !=3D NULL); >=20 > + gSmmCpuPrivate->UsedTokenNum =3D 0; >=20 > + } >=20 > + >=20 > + CpuToken =3D (SPIN_LOCK *)(gSmmCpuPrivate->CurrentTokenBuf + > SpinLockSize * gSmmCpuPrivate->UsedTokenNum); >=20 > + gSmmCpuPrivate->UsedTokenNum++; >=20 > + >=20 > InitializeSpinLock (CpuToken); >=20 > AcquireSpinLock (CpuToken); >=20 >=20 >=20 > @@ -1732,10 +1773,28 @@ InitializeDataForMmMp ( > VOID >=20 > ) >=20 > { >=20 > + UINTN SpinLockSize; >=20 > + UINT32 TokenCountPerChunk; >=20 > + >=20 > + SpinLockSize =3D GetSpinLockProperties (); >=20 > + TokenCountPerChunk =3D FixedPcdGet32 > (PcdCpuSmmMpTokenCountPerChunk); >=20 > + ASSERT (TokenCountPerChunk !=3D 0); >=20 > + if (TokenCountPerChunk =3D=3D 0) { >=20 > + DEBUG ((DEBUG_ERROR, "PcdCpuSmmMpTokenCountPerChunk should > not be Zero!\n")); >=20 > + CpuDeadLoop (); >=20 > + } >=20 > + DEBUG ((DEBUG_INFO, "CpuSmm: SpinLock Size =3D 0x%x, > PcdCpuSmmMpTokenCountPerChunk =3D 0x%x\n", SpinLockSize, > TokenCountPerChunk)); >=20 > + >=20 > + gSmmCpuPrivate->CurrentTokenBuf =3D AllocatePool (SpinLockSize * > TokenCountPerChunk); >=20 > + ASSERT (gSmmCpuPrivate->CurrentTokenBuf !=3D NULL); >=20 > + >=20 > + gSmmCpuPrivate->UsedTokenNum =3D 0; >=20 > + >=20 > gSmmCpuPrivate->ApWrapperFunc =3D AllocatePool (sizeof > (PROCEDURE_WRAPPER) * gSmmCpuPrivate- > >SmmCoreEntryContext.NumberOfCpus); >=20 > ASSERT (gSmmCpuPrivate->ApWrapperFunc !=3D NULL); >=20 >=20 >=20 > InitializeListHead (&gSmmCpuPrivate->TokenList); >=20 > + InitializeListHead (&gSmmCpuPrivate->OldTokenBufList); >=20 > } >=20 >=20 >=20 > /** >=20 > diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h > b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h > index 7e7c73f27f..5c1a01e42b 100644 > --- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h > +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.h > @@ -217,6 +217,17 @@ typedef struct { >=20 >=20 > #define PROCEDURE_TOKEN_FROM_LINK(a) CR (a, PROCEDURE_TOKEN, > Link, PROCEDURE_TOKEN_SIGNATURE) >=20 >=20 >=20 > +#define TOKEN_BUFFER_SIGNATURE SIGNATURE_32 ('T', 'K', 'B', 'S') >=20 > + >=20 > +typedef struct { >=20 > + UINTN Signature; >=20 > + LIST_ENTRY Link; >=20 > + >=20 > + UINT8 *Buffer; >=20 > +} TOKEN_BUFFER; >=20 > + >=20 > +#define TOKEN_BUFFER_FROM_LINK(a) CR (a, TOKEN_BUFFER, Link, > TOKEN_BUFFER_SIGNATURE) >=20 > + >=20 > // >=20 > // Private structure for the SMM CPU module that is stored in DXE Runtim= e > memory >=20 > // Contains the SMM Configuration Protocols that is produced. >=20 > @@ -243,6 +254,10 @@ typedef struct { > PROCEDURE_WRAPPER *ApWrapperFunc; >=20 > LIST_ENTRY TokenList; >=20 >=20 >=20 > + LIST_ENTRY OldTokenBufList; >=20 > + >=20 > + UINT8 *CurrentTokenBuf; >=20 > + UINT32 UsedTokenNum; // Only record token= s used in > CurrentTokenBuf. >=20 > } SMM_CPU_PRIVATE_DATA; >=20 >=20 >=20 > extern SMM_CPU_PRIVATE_DATA *gSmmCpuPrivate; >=20 > diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf > b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf > index b12b2691f8..76b1462996 100644 > --- a/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf > +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/PiSmmCpuDxeSmm.inf > @@ -140,6 +140,9 @@ > gEfiMdeModulePkgTokenSpaceGuid.PcdHeapGuardPropertyMask > ## CONSUMES >=20 > gEfiMdePkgTokenSpaceGuid.PcdControlFlowEnforcementPropertyMask > ## CONSUMES >=20 >=20 >=20 > +[FixedPcd] >=20 > + gUefiCpuPkgTokenSpaceGuid.PcdCpuSmmMpTokenCountPerChunk > ## CONSUMES >=20 > + >=20 > [Pcd.X64] >=20 > gUefiCpuPkgTokenSpaceGuid.PcdCpuSmmRestrictedMemoryAccess ## > CONSUMES >=20 >=20 >=20 > diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec > index 12f4413ea5..797f948631 100644 > --- a/UefiCpuPkg/UefiCpuPkg.dec > +++ b/UefiCpuPkg/UefiCpuPkg.dec > @@ -148,6 +148,10 @@ > # @Prompt Specify size of good stack of exception which need switching > stack. >=20 >=20 > gUefiCpuPkgTokenSpaceGuid.PcdCpuKnownGoodStackSize|2048|UINT32|0 > x30002001 >=20 >=20 >=20 > + ## Count of pre allocated SMM MP tokens per chunk. >=20 > + # @Prompt Specify the count of pre allocated SMM MP tokens per chunk. >=20 > + > gUefiCpuPkgTokenSpaceGuid.PcdCpuSmmMpTokenCountPerChunk|64|UI > NT32|0x30002002 >=20 > + >=20 > [PcdsFixedAtBuild, PcdsPatchableInModule] >=20 > ## This value is the CPU Local APIC base address, which aligns the add= ress > on a 4-KByte boundary. >=20 > # @Prompt Configure base address of CPU Local APIC >=20 > diff --git a/UefiCpuPkg/UefiCpuPkg.uni b/UefiCpuPkg/UefiCpuPkg.uni > index bfd696f48c..442ce5cb85 100644 > --- a/UefiCpuPkg/UefiCpuPkg.uni > +++ b/UefiCpuPkg/UefiCpuPkg.uni > @@ -272,3 +272,6 @@ > = "24000000 - 6th and 7th > generation Intel Core processors and Intel Xeon W Processor > Family(24MHz).
\n" >=20 > = "19200000 - Intel Atom > processors based on Goldmont Microarchitecture with CPUID signature > 06_5CH(19.2MHz).
\n" >=20 >=20 >=20 > +#string > STR_gUefiCpuPkgTokenSpaceGuid_PcdCpuSmmMpTokenCountPerChunk_P > ROMPT #language en-US "Specify the count of pre allocated SMM MP > tokens per chunk.\n" >=20 > + >=20 > +#string > STR_gUefiCpuPkgTokenSpaceGuid_PcdCpuSmmMpTokenCountPerChunk_H > ELP #language en-US "This value used to specify the count of pre alloc= ated > SMM MP tokens per chunk.\n" > \ No newline at end of file > -- > 2.23.0.windows.1