From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id DA57120958985 for ; Wed, 5 Jul 2017 18:26:26 -0700 (PDT) Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga105.jf.intel.com with ESMTP; 05 Jul 2017 18:28:06 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.40,314,1496127600"; d="scan'208";a="121333797" Received: from fmsmsx106.amr.corp.intel.com ([10.18.124.204]) by orsmga005.jf.intel.com with ESMTP; 05 Jul 2017 18:28:02 -0700 Received: from fmsmsx121.amr.corp.intel.com (10.18.125.36) by FMSMSX106.amr.corp.intel.com (10.18.124.204) with Microsoft SMTP Server (TLS) id 14.3.319.2; Wed, 5 Jul 2017 18:27:35 -0700 Received: from shsmsx151.ccr.corp.intel.com (10.239.6.50) by fmsmsx121.amr.corp.intel.com (10.18.125.36) with Microsoft SMTP Server (TLS) id 14.3.319.2; Wed, 5 Jul 2017 18:27:35 -0700 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.146]) by SHSMSX151.ccr.corp.intel.com ([169.254.3.56]) with mapi id 14.03.0319.002; Thu, 6 Jul 2017 09:27:32 +0800 From: "Fan, Jeff" To: Leo Duran , "edk2-devel@lists.01.org" CC: "Justen, Jordan L" , "Gao, Liming" , Brijesh Singh , "Dong, Eric" Thread-Topic: [PATCH v2] UefiCpuPkg: ApicLib Thread-Index: AQHS8ru8yrlSGlvnzEqejJ1+JmOwF6JGBlLw Date: Thu, 6 Jul 2017 01:27:32 +0000 Message-ID: <542CF652F8836A4AB8DBFAAD40ED192A4C6207FF@shsmsx102.ccr.corp.intel.com> References: <1498949104-11986-1-git-send-email-leo.duran@amd.com> <1498949104-11986-2-git-send-email-leo.duran@amd.com> In-Reply-To: <1498949104-11986-2-git-send-email-leo.duran@amd.com> Accept-Language: zh-CN, en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiODNhMDZkN2QtNTU1OS00ZDNkLWJhOTAtMWQxMWE5YjQ1Mzg3IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE2LjUuOS4zIiwiVHJ1c3RlZExhYmVsSGFzaCI6IkF6UDg5Z3FPY0kxYTJ4SVh5U0lJMTdNNzk5RHNMZ21rK0l1VW5FVGhFaDg9In0= x-ctpclassification: CTP_IC x-originating-ip: [10.239.127.40] MIME-Version: 1.0 Subject: Re: [PATCH v2] UefiCpuPkg: ApicLib X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Jul 2017 01:26:27 -0000 Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Leo, How AMD public spec dos define the manner of sending startup IPI to Aps? Wh= ich chapter is it defined in AMD public spec? Does AMD public spec indicate the second startup IPI is not required? Thanks! Jeff -----Original Message----- From: Leo Duran [mailto:leo.duran@amd.com]=20 Sent: Sunday, July 02, 2017 6:45 AM To: edk2-devel@lists.01.org Cc: Leo Duran; Justen, Jordan L; Fan, Jeff; Gao, Liming; Brijesh Singh Subject: [PATCH v2] UefiCpuPkg: ApicLib 1) SendInitSipiSipi () Skip repeating SendIpi () on AMD processor. 2) SendInitSipiSipiAllExcludingSelf () Skip repeating SendIpi () on AMD processor. 3) GetProcessorLocationByApicId () Adjust InitialApicId to properly concatenate Package on AMD processor. Clean-ups on C Coding standards. Cc: Jordan Justen Cc: Jeff Fan Cc: Liming Gao Cc: Brijesh Singh Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Leo Duran --- UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c | 52 +++++++++++++-----= ---- .../BaseXApicX2ApicLib/BaseXApicX2ApicLib.c | 50 +++++++++++++-----= --- 2 files changed, 63 insertions(+), 39 deletions(-) diff --git a/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c b/UefiCpuPkg/Li= brary/BaseXApicLib/BaseXApicLib.c index 2091e5e..a6e4e2e 100644 --- a/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c +++ b/UefiCpuPkg/Library/BaseXApicLib/BaseXApicLib.c @@ -338,7 +338,7 @@ GetInitialApicId ( AsmCpuid (CPUID_SIGNATURE, &MaxCpuIdIndex, NULL, NULL, NULL); =20 // - // If CPUID Leaf B is supported,=20 + // If CPUID Leaf B is supported, // And CPUID.0BH:EBX[15:0] reports a non-zero value, // Then the initial 32-bit APIC ID =3D CPUID.0BH:EDX // Else the initial 8-bit APIC ID =3D CPUID.1:EBX[31:24] @@ -554,8 +554,= 10 @@ SendInitSipiSipi ( IcrLow.Bits.DeliveryMode =3D LOCAL_APIC_DELIVERY_MODE_STARTUP; IcrLow.Bits.Level =3D 1; SendIpi (IcrLow.Uint32, ApicId); - MicroSecondDelay (200); - SendIpi (IcrLow.Uint32, ApicId); + if (!StandardSignatureIsAuthenticAMD ()) { + MicroSecondDelay (200); + SendIpi (IcrLow.Uint32, ApicId); + } } =20 /** @@ -588,8 +590,10 @@ SendInitSipiSipiAllExcludingSelf ( IcrLow.Bits.Level =3D 1; IcrLow.Bits.DestinationShorthand =3D LOCAL_APIC_DESTINATION_SHORTHAND_AL= L_EXCLUDING_SELF; SendIpi (IcrLow.Uint32, 0); - MicroSecondDelay (200); - SendIpi (IcrLow.Uint32, 0); + if (!StandardSignatureIsAuthenticAMD ()) { + MicroSecondDelay (200); + SendIpi (IcrLow.Uint32, 0); + } } =20 /** @@ -1013,13 +1017,14 @@ GetProcessorLocationByApicId ( UINT32 MaxCoresPerNode; UINT32 CorePerNodeMask; UINT32 ApicIdShift; + UINT32 ApicIdMask; UINTN ThreadBits; UINTN CoreBits; =20 // // Check if the processor is capable of supporting more than one logical= processor. // - AsmCpuid(CPUID_VERSION_INFO, NULL, NULL, NULL, &VersionInfoEdx.Uint32); + AsmCpuid (CPUID_VERSION_INFO, NULL, NULL, NULL,=20 + &VersionInfoEdx.Uint32); if (VersionInfoEdx.Bits.HTT =3D=3D 0) { if (Thread !=3D NULL) { *Thread =3D 0; @@ -1042,8 +1047,8 @@ GetProcessorLocationByApicId ( // // Get max index of CPUID // - AsmCpuid(CPUID_SIGNATURE, &MaxStandardCpuIdIndex, NULL, NULL, NULL); - AsmCpuid(CPUID_EXTENDED_FUNCTION, &MaxExtendedCpuIdIndex, NULL, NULL, NU= LL); + AsmCpuid (CPUID_SIGNATURE, &MaxStandardCpuIdIndex, NULL, NULL, NULL); =20 + AsmCpuid (CPUID_EXTENDED_FUNCTION, &MaxExtendedCpuIdIndex, NULL, NULL,=20 + NULL); =20 // // If the extended topology enumeration leaf is available, it @@ -1051,7= +1056,7 @@ GetProcessorLocationByApicId ( // TopologyLeafSupported =3D FALSE; if (MaxStandardCpuIdIndex >=3D CPUID_EXTENDED_TOPOLOGY) { - AsmCpuidEx( + AsmCpuidEx ( CPUID_EXTENDED_TOPOLOGY, 0, &ExtendedTopologyEax.Uint32, @@ -1072,7 +1077,7 @@ GetProcessorLocationByApicId ( // the SMT sub-field of x2APIC ID. // LevelType =3D ExtendedTopologyEcx.Bits.LevelType; - ASSERT(LevelType =3D=3D CPUID_EXTENDED_TOPOLOGY_LEVEL_TYPE_SMT); + ASSERT (LevelType =3D=3D CPUID_EXTENDED_TOPOLOGY_LEVEL_TYPE_SMT); ThreadBits =3D ExtendedTopologyEax.Bits.ApicIdShift; =20 // @@ -1081,7 +1086,7 @@ GetProcessorLocationByApicId ( // SubIndex =3D 1; do { - AsmCpuidEx( + AsmCpuidEx ( CPUID_EXTENDED_TOPOLOGY, SubIndex, &ExtendedTopologyEax.Uint32, @@ -1103,7 +1108,7 @@ GetProcessorLocationByApicId ( // // Get logical processor count // - AsmCpuid(CPUID_VERSION_INFO, NULL, &VersionInfoEbx.Uint32, NULL, NULL)= ; + AsmCpuid (CPUID_VERSION_INFO, NULL, &VersionInfoEbx.Uint32, NULL,=20 + NULL); MaxLogicProcessorsPerPackage =3D VersionInfoEbx.Bits.MaximumAddressabl= eIdsForLogicalProcessors; =20 // @@ -1114,11 +1119,11 @@ GetProcessorLocationByApicId ( // // Check for topology extensions on AMD processor // - if (StandardSignatureIsAuthenticAMD()) { + if (StandardSignatureIsAuthenticAMD ()) { if (MaxExtendedCpuIdIndex >=3D CPUID_AMD_PROCESSOR_TOPOLOGY) { - AsmCpuid(CPUID_EXTENDED_CPU_SIG, NULL, NULL, &AmdExtendedCpuSigEcx= .Uint32, NULL); + AsmCpuid (CPUID_EXTENDED_CPU_SIG, NULL, NULL,=20 + &AmdExtendedCpuSigEcx.Uint32, NULL); if (AmdExtendedCpuSigEcx.Bits.TopologyExtensions !=3D 0) { - AsmCpuid(CPUID_AMD_PROCESSOR_TOPOLOGY, NULL, &AmdProcessorTopolo= gyEbx.Uint32, + AsmCpuid (CPUID_AMD_PROCESSOR_TOPOLOGY, NULL,=20 + &AmdProcessorTopologyEbx.Uint32, &AmdProcessorTopologyEcx.Uint32, NULL); // // Get cores per processor package @@ -1128,7 +1133,7 @@ GetProc= essorLocationByApicId ( // // Account for actual thread count (e.g., SMT disabled) // - AsmCpuid(CPUID_VIR_PHY_ADDRESS_SIZE, NULL, NULL, &AmdVirPhyAddre= ssSizeEcx.Uint32, NULL); + AsmCpuid (CPUID_VIR_PHY_ADDRESS_SIZE, NULL, NULL,=20 + &AmdVirPhyAddressSizeEcx.Uint32, NULL); MaxThreadPerPackageMask =3D 1 << AmdVirPhyAddressSizeEcx.Bits.Ap= icIdCoreIdSize; ActualThreadPerPackageMask =3D 1; while (ActualThreadPerPackageMask < MaxLogicProcessorsPerPackage= ) { @@ -1136,7 +1141,7 @@ GetProcessorLocationByApicId ( } =20 // - // Adjust APIC Id to report concatenation of Package|Core|Thread= . + // Adjust APIC Id to report concatenation of Core|Thread. // if (ActualThreadPerPackageMask < MaxThreadPerPackageMask) { MaxCoresPerNode =3D MaxCoresPerPackage / (AmdProcessorTopology= Ecx.Bits.NodesPerProcessor + 1); @@ -1148,13 +1153,20 @@ GetProcessorLocati= onByApicId ( CorePerNodeMask -=3D 1; =20 ApicIdShift =3D 0; + ApicIdMask =3D ActualThreadPerPackageMask; do { ApicIdShift +=3D 1; - ActualThreadPerPackageMask <<=3D 1; - } while (ActualThreadPerPackageMask < MaxThreadPerPackageMask)= ; + ApicIdMask <<=3D 1; + } while (ApicIdMask < MaxThreadPerPackageMask); =20 InitialApicId =3D ((InitialApicId & ~CorePerNodeMask) >> ApicI= dShift) | (InitialApicId & CorePerNodeMask); } + // + // Adjust APIC Id to report concatenation of Package|Core|Thread= . + // + if ((InitialApicId & ~(MaxThreadPerPackageMask - 1)) !=3D 0) { + InitialApicId =3D (InitialApicId & (ActualThreadPerPackageMask= - 1)) | ActualThreadPerPackageMask; + } } } } @@ -1163,7 +1175,7 @@ GetProcessorLocationByApicId ( // Extract core count based on CACHE information // if (MaxStandardCpuIdIndex >=3D CPUID_CACHE_PARAMS) { - AsmCpuidEx(CPUID_CACHE_PARAMS, 0, &CacheParamsEax.Uint32, NULL, NU= LL, NULL); + AsmCpuidEx (CPUID_CACHE_PARAMS, 0, &CacheParamsEax.Uint32,=20 + NULL, NULL, NULL); if (CacheParamsEax.Uint32 !=3D 0) { MaxCoresPerPackage =3D CacheParamsEax.Bits.MaximumAddressableIds= ForLogicalProcessors + 1; } diff --git a/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c b/U= efiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c index d5d4efa..5945b55 100644 --- a/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c +++ b/UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c @@ -649,8 +649,10 @@ SendInitSipiSipi ( IcrLow.Bits.DeliveryMode =3D LOCAL_APIC_DELIVERY_MODE_STARTUP; IcrLow.Bits.Level =3D 1; SendIpi (IcrLow.Uint32, ApicId); - MicroSecondDelay (200); - SendIpi (IcrLow.Uint32, ApicId); + if (!StandardSignatureIsAuthenticAMD ()) { + MicroSecondDelay (200); + SendIpi (IcrLow.Uint32, ApicId); + } } =20 /** @@ -683,8 +685,10 @@ SendInitSipiSipiAllExcludingSelf ( IcrLow.Bits.Level =3D 1; IcrLow.Bits.DestinationShorthand =3D LOCAL_APIC_DESTINATION_SHORTHAND_AL= L_EXCLUDING_SELF; SendIpi (IcrLow.Uint32, 0); - MicroSecondDelay (200); - SendIpi (IcrLow.Uint32, 0); + if (!StandardSignatureIsAuthenticAMD ()) { + MicroSecondDelay (200); + SendIpi (IcrLow.Uint32, 0); + } } =20 /** @@ -1108,13 +1112,14 @@ GetProcessorLocationByApicId ( UINT32 MaxCoresPerNode; UINT32 CorePerNodeMask; UINT32 ApicIdShift; + UINT32 ApicIdMask; UINTN ThreadBits; UINTN CoreBits; =20 // // Check if the processor is capable of supporting more than one logical= processor. // - AsmCpuid(CPUID_VERSION_INFO, NULL, NULL, NULL, &VersionInfoEdx.Uint32); + AsmCpuid (CPUID_VERSION_INFO, NULL, NULL, NULL,=20 + &VersionInfoEdx.Uint32); if (VersionInfoEdx.Bits.HTT =3D=3D 0) { if (Thread !=3D NULL) { *Thread =3D 0; @@ -1137,8 +1142,8 @@ GetProcessorLocationByApicId ( // // Get max index of CPUID // - AsmCpuid(CPUID_SIGNATURE, &MaxStandardCpuIdIndex, NULL, NULL, NULL); - AsmCpuid(CPUID_EXTENDED_FUNCTION, &MaxExtendedCpuIdIndex, NULL, NULL, NU= LL); + AsmCpuid (CPUID_SIGNATURE, &MaxStandardCpuIdIndex, NULL, NULL, NULL); =20 + AsmCpuid (CPUID_EXTENDED_FUNCTION, &MaxExtendedCpuIdIndex, NULL, NULL,=20 + NULL); =20 // // If the extended topology enumeration leaf is available, it @@ -1146,7= +1151,7 @@ GetProcessorLocationByApicId ( // TopologyLeafSupported =3D FALSE; if (MaxStandardCpuIdIndex >=3D CPUID_EXTENDED_TOPOLOGY) { - AsmCpuidEx( + AsmCpuidEx ( CPUID_EXTENDED_TOPOLOGY, 0, &ExtendedTopologyEax.Uint32, @@ -1167,7 +1172,7 @@ GetProcessorLocationByApicId ( // the SMT sub-field of x2APIC ID. // LevelType =3D ExtendedTopologyEcx.Bits.LevelType; - ASSERT(LevelType =3D=3D CPUID_EXTENDED_TOPOLOGY_LEVEL_TYPE_SMT); + ASSERT (LevelType =3D=3D CPUID_EXTENDED_TOPOLOGY_LEVEL_TYPE_SMT); ThreadBits =3D ExtendedTopologyEax.Bits.ApicIdShift; =20 // @@ -1176,7 +1181,7 @@ GetProcessorLocationByApicId ( // SubIndex =3D 1; do { - AsmCpuidEx( + AsmCpuidEx ( CPUID_EXTENDED_TOPOLOGY, SubIndex, &ExtendedTopologyEax.Uint32, @@ -1198,7 +1203,7 @@ GetProcessorLocationByApicId ( // // Get logical processor count // - AsmCpuid(CPUID_VERSION_INFO, NULL, &VersionInfoEbx.Uint32, NULL, NULL)= ; + AsmCpuid (CPUID_VERSION_INFO, NULL, &VersionInfoEbx.Uint32, NULL,=20 + NULL); MaxLogicProcessorsPerPackage =3D VersionInfoEbx.Bits.MaximumAddressabl= eIdsForLogicalProcessors; =20 // @@ -1209,11 +1214,11 @@ GetProcessorLocationByApicId ( // // Check for topology extensions on AMD processor // - if (StandardSignatureIsAuthenticAMD()) { + if (StandardSignatureIsAuthenticAMD ()) { if (MaxExtendedCpuIdIndex >=3D CPUID_AMD_PROCESSOR_TOPOLOGY) { - AsmCpuid(CPUID_EXTENDED_CPU_SIG, NULL, NULL, &AmdExtendedCpuSigEcx= .Uint32, NULL); + AsmCpuid (CPUID_EXTENDED_CPU_SIG, NULL, NULL,=20 + &AmdExtendedCpuSigEcx.Uint32, NULL); if (AmdExtendedCpuSigEcx.Bits.TopologyExtensions !=3D 0) { - AsmCpuid(CPUID_AMD_PROCESSOR_TOPOLOGY, NULL, &AmdProcessorTopolo= gyEbx.Uint32, + AsmCpuid (CPUID_AMD_PROCESSOR_TOPOLOGY, NULL,=20 + &AmdProcessorTopologyEbx.Uint32, &AmdProcessorTopologyEcx.Uint32, NULL); // // Get cores per processor package @@ -1223,7 +1228,7 @@ GetProc= essorLocationByApicId ( // // Account for actual thread count (e.g., SMT disabled) // - AsmCpuid(CPUID_VIR_PHY_ADDRESS_SIZE, NULL, NULL, &AmdVirPhyAddre= ssSizeEcx.Uint32, NULL); + AsmCpuid (CPUID_VIR_PHY_ADDRESS_SIZE, NULL, NULL,=20 + &AmdVirPhyAddressSizeEcx.Uint32, NULL); MaxThreadPerPackageMask =3D 1 << AmdVirPhyAddressSizeEcx.Bits.Ap= icIdCoreIdSize; ActualThreadPerPackageMask =3D 1; while (ActualThreadPerPackageMask < MaxLogicProcessorsPerPackage= ) { @@ -1231,7 +1236,7 @@ GetProcessorLocationByApicId ( } =20 // - // Adjust APIC Id to report concatenation of Package|Core|Thread= . + // Adjust APIC Id to report concatenation of Core|Thread. // if (ActualThreadPerPackageMask < MaxThreadPerPackageMask) { MaxCoresPerNode =3D MaxCoresPerPackage / (AmdProcessorTopology= Ecx.Bits.NodesPerProcessor + 1); @@ -1243,13 +1248,20 @@ GetProcessorLocati= onByApicId ( CorePerNodeMask -=3D 1; =20 ApicIdShift =3D 0; + ApicIdMask =3D ActualThreadPerPackageMask; do { ApicIdShift +=3D 1; - ActualThreadPerPackageMask <<=3D 1; - } while (ActualThreadPerPackageMask < MaxThreadPerPackageMask)= ; + ApicIdMask <<=3D 1; + } while (ApicIdMask < MaxThreadPerPackageMask); =20 InitialApicId =3D ((InitialApicId & ~CorePerNodeMask) >> ApicI= dShift) | (InitialApicId & CorePerNodeMask); } + // + // Adjust APIC Id to report concatenation of Package|Core|Thread= . + // + if ((InitialApicId & ~(MaxThreadPerPackageMask - 1)) !=3D 0) { + InitialApicId =3D (InitialApicId & (ActualThreadPerPackageMask= - 1)) | ActualThreadPerPackageMask; + } } } } @@ -1258,7 +1270,7 @@ GetProcessorLocationByApicId ( // Extract core count based on CACHE information // if (MaxStandardCpuIdIndex >=3D CPUID_CACHE_PARAMS) { - AsmCpuidEx(CPUID_CACHE_PARAMS, 0, &CacheParamsEax.Uint32, NULL, NU= LL, NULL); + AsmCpuidEx (CPUID_CACHE_PARAMS, 0, &CacheParamsEax.Uint32,=20 + NULL, NULL, NULL); if (CacheParamsEax.Uint32 !=3D 0) { MaxCoresPerPackage =3D CacheParamsEax.Bits.MaximumAddressableIds= ForLogicalProcessors + 1; } -- 2.7.4