From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ni, Ray"
To: "Kinney, Michael D"
CC: "Dong, Eric", Laszlo Ersek, "Kumar, Rahul1", "devel@edk2.groups.io", "Ni, Ray"
Subject: Re: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to avoid lock acquire/release
Date: Tue, 23 Feb 2021 02:22:29 +0000
References: <20210209141634.1999-1-ray.ni@intel.com> <166219FF4C25D9C5.16853@groups.io>
In-Reply-To: <166219FF4C25D9C5.16853@groups.io>

Mike,
This patch follows your suggestion to fix the performance issue first and
clean up the code next. Can you check this patch specifically?

Thanks,
Ray

> -----Original Message-----
> From: devel@edk2.groups.io On Behalf Of Ni, Ray
> Sent: Tuesday, February 9, 2021 10:17 PM
> To: devel@edk2.groups.io
> Cc: Dong, Eric; Laszlo Ersek; Kumar, Rahul1
> Subject: [edk2-devel] [PATCH v3 1/4] UefiCpuPkg/MpInitLib: Use XADD to
> avoid lock acquire/release
>
> When an AP first wakes up, MpFuncs.nasm uses the logic below to assign
> a unique ApIndex to each AP in order of arrival:
> ---ASM---
> TestLock:
>     xchg [edi], eax
>     cmp eax, NotVacantFlag
>     jz TestLock
>
>     mov ecx, esi
>     add ecx, ApIndexLocation
>     inc dword [ecx]
>     mov ebx, [ecx]
>
> Releaselock:
>     mov eax, VacantFlag
>     xchg [edi], eax
> ---ASM END---
>
> "lock inc" cannot be used to increment ApIndex because the global
> ApIndex must be incremented and the result must also be stored in
> the local general-purpose register EBX.
>
> This patch follows the NASM implementation of
> InternalSyncIncrement() and uses the "XADD" instruction, which
> increments the global ApIndex and stores the original ApIndex in EBX
> in a single instruction.
>
> With this patch, OVMF running in a 255-thread QEMU takes about one
> second to wake up all APs. The original implementation needs more
> than 10 seconds.
>
> Signed-off-by: Ray Ni
> Cc: Eric Dong
> Cc: Laszlo Ersek
> Cc: Rahul Kumar
> ---
>  .../Library/MpInitLib/Ia32/MpFuncs.nasm       | 20 ++++++-------------
>  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 18 ++++++-----------
>  2 files changed, 12 insertions(+), 26 deletions(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> index 7e81d24aa6..2eaddc93bc 100644
> --- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
> @@ -1,5 +1,5 @@
>  ;------------------------------------------------------------------------------ ;
> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.
> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.
>  ; SPDX-License-Identifier: BSD-2-Clause-Patent
>  ;
>  ; Module Name:
> @@ -125,19 +125,11 @@ SkipEnableExecuteDisable:
>      add edi, LockLocation
>      mov eax, NotVacantFlag
>
> -TestLock:
> -    xchg [edi], eax
> -    cmp eax, NotVacantFlag
> -    jz TestLock
> -
> -    mov ecx, esi
> -    add ecx, ApIndexLocation
> -    inc dword [ecx]
> -    mov ebx, [ecx]
> -
> -Releaselock:
> -    mov eax, VacantFlag
> -    xchg [edi], eax
> +    mov edi, esi
> +    add edi, ApIndexLocation
> +    mov ebx, 1
> +    lock xadd dword [edi], ebx ; EBX = ApIndex++
> +    inc ebx ; EBX is CpuNumber
>
>      mov edi, esi
>      add edi, StackSizeLocation
> diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> index aecfd07bc0..5b588f2dcb 100644
> --- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> +++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
> @@ -1,5 +1,5 @@
>  ;------------------------------------------------------------------------------ ;
> -; Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved.
> +; Copyright (c) 2015 - 2021, Intel Corporation. All rights reserved.
>  ; SPDX-License-Identifier: BSD-2-Clause-Patent
>  ;
>  ; Module Name:
> @@ -161,18 +161,12 @@ LongModeStart:
>      add edi, LockLocation
>      mov rax, NotVacantFlag
>
> -TestLock:
> -    xchg qword [edi], rax
> -    cmp rax, NotVacantFlag
> -    jz TestLock
> -
> -    lea ecx, [esi + ApIndexLocation]
> -    inc dword [ecx]
> -    mov ebx, [ecx]
> +    mov edi, esi
> +    add edi, ApIndexLocation
> +    mov ebx, 1
> +    lock xadd dword [edi], ebx ; EBX = ApIndex++
> +    inc ebx ; EBX is CpuNumber
>
> -Releaselock:
> -    mov rax, VacantFlag
> -    xchg qword [edi], rax
>      ; program stack
>      mov edi, esi
>      add edi, StackSizeLocation
> --
> 2.27.0.windows.1
>
>
>
> -=-=-=-=-=-=
> Groups.io Links: You receive all messages sent to this group.
> View/Reply Online (#71517): https://edk2.groups.io/g/devel/message/71517
> Mute This Topic: https://groups.io/mt/80504936/1712937
> Group Owner: devel+owner@edk2.groups.io
> Unsubscribe: https://edk2.groups.io/g/devel/unsub [ray.ni@intel.com]
> -=-=-=-=-=-=
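
For readers who prefer C to NASM, the two numbering schemes in the quoted patch can be sketched roughly as follows. This is only an illustration using C11 atomics, not code from MpInitLib: the struct ap_exchange_t, the constants VACANT and NOT_VACANT, and both function names are made up for this sketch. It assumes that atomic_fetch_add() is a fair stand-in for "lock xadd" (x86 compilers typically emit that instruction when the returned value is used).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lock word and the shared ApIndex. */
enum { VACANT = 0, NOT_VACANT = 1 };

typedef struct {
    atomic_uint lock;     /* VACANT or NOT_VACANT */
    atomic_uint ap_index; /* number of APs that have taken a number */
} ap_exchange_t;

/* Old scheme: spin on an exchange, then increment and read back
 * ApIndex while holding the lock, then release the lock. */
unsigned assign_with_lock(ap_exchange_t *x)
{
    assert(x != NULL);
    while (atomic_exchange(&x->lock, NOT_VACANT) == NOT_VACANT) {
        /* Another AP holds the lock; keep retrying. */
    }
    unsigned n = atomic_load(&x->ap_index) + 1u;
    atomic_store(&x->ap_index, n);
    atomic_exchange(&x->lock, VACANT); /* release */
    return n;
}

/* New scheme: one atomic fetch-and-add. It returns the value ApIndex
 * had before the add, so adding 1 yields this AP's CPU number, which
 * mirrors "lock xadd" followed by "inc ebx". */
unsigned assign_with_xadd(ap_exchange_t *x)
{
    assert(x != NULL);
    return atomic_fetch_add(&x->ap_index, 1u) + 1u;
}
```

Both functions hand out 1, 2, 3, ... in order of arrival. The second one has no retry loop, so an AP never has to wait for another AP to release a lock before it gets its number.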