From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) (using TLSv1 with cipher CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id D34771A1E98 for ; Tue, 6 Sep 2016 20:03:15 -0700 (PDT) Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP; 06 Sep 2016 20:03:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.30,294,1470726000"; d="scan'208";a="5789615" Received: from fmsmsx105.amr.corp.intel.com ([10.18.124.203]) by orsmga004.jf.intel.com with ESMTP; 06 Sep 2016 20:03:15 -0700 Received: from fmsmsx119.amr.corp.intel.com (10.18.124.207) by FMSMSX105.amr.corp.intel.com (10.18.124.203) with Microsoft SMTP Server (TLS) id 14.3.248.2; Tue, 6 Sep 2016 20:03:15 -0700 Received: from shsmsx152.ccr.corp.intel.com (10.239.6.52) by FMSMSX119.amr.corp.intel.com (10.18.124.207) with Microsoft SMTP Server (TLS) id 14.3.248.2; Tue, 6 Sep 2016 20:03:14 -0700 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.109]) by SHSMSX152.ccr.corp.intel.com ([169.254.6.166]) with mapi id 14.03.0248.002; Wed, 7 Sep 2016 11:03:12 +0800 From: "Gao, Liming" To: Ard Biesheuvel , "edk2-devel@lists.01.org" , "leif.lindholm@linaro.org" CC: "Kinney, Michael D" Thread-Topic: [PATCH v2 1/3] MdePkg/BaseMemoryLib: widen aligned accesses to 32 or 64 bits Thread-Index: AQHSCEpU1wiVr6qyTEyZFOAjXr2nC6BtV6FQ Date: Wed, 7 Sep 2016 03:03:11 +0000 Message-ID: <4A89E2EF3DFEDB4C8BFDE51014F606A14B3E2F42@shsmsx102.ccr.corp.intel.com> References: <1473171813-24595-1-git-send-email-ard.biesheuvel@linaro.org> <1473171813-24595-2-git-send-email-ard.biesheuvel@linaro.org> In-Reply-To: <1473171813-24595-2-git-send-email-ard.biesheuvel@linaro.org> Accept-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] MIME-Version: 1.0 Subject: Re: [PATCH v2 1/3] MdePkg/BaseMemoryLib: widen aligned accesses to 32 or 64 bits X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Sep 2016 03:03:16 -0000 Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Ard: In InternalMemSetMem, Value64 =3D (((UINT64)Value32) << 32) | Value32; may = cause the below link error with VS IA32 build. It (<<) should be replaced b= y BaseLib LShift API BaseMemoryLib.lib(SetMem.obj) : error LNK2001: unresolved external symbol _= _allshl Thanks Liming > -----Original Message----- > From: Ard Biesheuvel [mailto:ard.biesheuvel@linaro.org] > Sent: Tuesday, September 06, 2016 10:24 PM > To: edk2-devel@lists.01.org; leif.lindholm@linaro.org; Gao, Liming > > Cc: Kinney, Michael D ; Ard Biesheuvel > > Subject: [PATCH v2 1/3] MdePkg/BaseMemoryLib: widen aligned accesses to > 32 or 64 bits >=20 > Since the default BaseMemoryLib should be callable from any context, > including ones where unaligned accesses are not allowed, it implements > InternalCopyMem() and InternalSetMem() using byte accesses only. > However, especially in a context where the MMU is off, such narrow > accesses may be disproportionately costly, and so if the size and > alignment of the access allow it, use 32-bit or even 64-bit loads and > stores (the latter may be beneficial even on a 32-bit architectures like > ARM, which has load pair/store pair instructions) >=20 > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Ard Biesheuvel > --- > MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf | 2 +- > MdePkg/Library/BaseMemoryLib/CopyMem.c | 112 > ++++++++++++++++++-- > MdePkg/Library/BaseMemoryLib/SetMem.c | 40 ++++++- > 3 files changed, 140 insertions(+), 14 deletions(-) >=20 > diff --git a/MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf > b/MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf > index 6d906e93faf3..358eeed4f449 100644 > --- a/MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf > +++ b/MdePkg/Library/BaseMemoryLib/BaseMemoryLib.inf > @@ -26,7 +26,7 @@ [Defines] >=20 >=20 > # > -# VALID_ARCHITECTURES =3D IA32 X64 IPF EBC > +# VALID_ARCHITECTURES =3D IA32 X64 IPF EBC ARM AARCH64 > # >=20 > [Sources] > diff --git a/MdePkg/Library/BaseMemoryLib/CopyMem.c > b/MdePkg/Library/BaseMemoryLib/CopyMem.c > index 37f03660df5f..6f4fd900df5d 100644 > --- a/MdePkg/Library/BaseMemoryLib/CopyMem.c > +++ b/MdePkg/Library/BaseMemoryLib/CopyMem.c > @@ -4,6 +4,9 @@ > particular platform easily if an optimized version is desired. >=20 > Copyright (c) 2006 - 2010, Intel Corporation. All rights reserved.
> + Copyright (c) 2012 - 2013, ARM Ltd. All rights reserved.
> + Copyright (c) 2016, Linaro Ltd. All rights reserved.
> + > This program and the accompanying materials > are licensed and made available under the terms and conditions of the = BSD > License > which accompanies this distribution. The full text of the license may= be > found at > @@ -44,18 +47,107 @@ InternalMemCopyMem ( > // > volatile UINT8 *Destination8; > CONST UINT8 *Source8; > + volatile UINT32 *Destination32; > + CONST UINT32 *Source32; > + volatile UINT64 *Destination64; > + CONST UINT64 *Source64; > + UINTN Alignment; > + > + if ((((UINTN)DestinationBuffer & 0x7) =3D=3D 0) && (((UINTN)SourceBuff= er & > 0x7) =3D=3D 0) && (Length >=3D 8)) { > + if (SourceBuffer > DestinationBuffer) { > + Destination64 =3D (UINT64*)DestinationBuffer; > + Source64 =3D (CONST UINT64*)SourceBuffer; > + while (Length >=3D 8) { > + *(Destination64++) =3D *(Source64++); > + Length -=3D 8; > + } > + > + // Finish if there are still some bytes to copy > + Destination8 =3D (UINT8*)Destination64; > + Source8 =3D (CONST UINT8*)Source64; > + while (Length-- !=3D 0) { > + *(Destination8++) =3D *(Source8++); > + } > + } else if (SourceBuffer < DestinationBuffer) { > + Destination64 =3D (UINT64*)((UINTN)DestinationBuffer + Length); > + Source64 =3D (CONST UINT64*)((UINTN)SourceBuffer + Length); > + > + // Destination64 and Source64 were aligned on a 64-bit boundary > + // but if length is not a multiple of 8 bytes then they won't be > + // anymore. > + > + Alignment =3D Length & 0x7; > + if (Alignment !=3D 0) { > + Destination8 =3D (UINT8*)Destination64; > + Source8 =3D (CONST UINT8*)Source64; > + > + while (Alignment-- !=3D 0) { > + *(--Destination8) =3D *(--Source8); > + --Length; > + } > + Destination64 =3D (UINT64*)Destination8; > + Source64 =3D (CONST UINT64*)Source8; > + } > + > + while (Length > 0) { > + *(--Destination64) =3D *(--Source64); > + Length -=3D 8; > + } > + } > + } else if ((((UINTN)DestinationBuffer & 0x3) =3D=3D 0) && > (((UINTN)SourceBuffer & 0x3) =3D=3D 0) && (Length >=3D 4)) { > + if (SourceBuffer > DestinationBuffer) { > + Destination32 =3D (UINT32*)DestinationBuffer; > + Source32 =3D (CONST UINT32*)SourceBuffer; > + while (Length >=3D 4) { > + *(Destination32++) =3D *(Source32++); > + Length -=3D 4; > + } > + > + // Finish if there are still some bytes to copy > + Destination8 =3D (UINT8*)Destination32; > + Source8 =3D (CONST UINT8*)Source32; > + while (Length-- !=3D 0) { > + *(Destination8++) =3D *(Source8++); > + } > + } else if (SourceBuffer < DestinationBuffer) { > + Destination32 =3D (UINT32*)((UINTN)DestinationBuffer + Length); > + Source32 =3D (CONST UINT32*)((UINTN)SourceBuffer + Length); > + > + // Destination32 and Source32 were aligned on a 32-bit boundary > + // but if length is not a multiple of 4 bytes then they won't be > + // anymore. > + > + Alignment =3D Length & 0x3; > + if (Alignment !=3D 0) { > + Destination8 =3D (UINT8*)Destination32; > + Source8 =3D (CONST UINT8*)Source32; > + > + while (Alignment-- !=3D 0) { > + *(--Destination8) =3D *(--Source8); > + --Length; > + } > + Destination32 =3D (UINT32*)Destination8; > + Source32 =3D (CONST UINT32*)Source8; > + } >=20 > - if (SourceBuffer > DestinationBuffer) { > - Destination8 =3D (UINT8*)DestinationBuffer; > - Source8 =3D (CONST UINT8*)SourceBuffer; > - while (Length-- !=3D 0) { > - *(Destination8++) =3D *(Source8++); > + while (Length > 0) { > + *(--Destination32) =3D *(--Source32); > + Length -=3D 4; > + } > } > - } else if (SourceBuffer < DestinationBuffer) { > - Destination8 =3D (UINT8*)DestinationBuffer + Length; > - Source8 =3D (CONST UINT8*)SourceBuffer + Length; > - while (Length-- !=3D 0) { > - *(--Destination8) =3D *(--Source8); > + } else { > + if (SourceBuffer > DestinationBuffer) { > + Destination8 =3D (UINT8*)DestinationBuffer; > + Source8 =3D (CONST UINT8*)SourceBuffer; > + while (Length-- !=3D 0) { > + *(Destination8++) =3D *(Source8++); > + } > + } else if (SourceBuffer < DestinationBuffer) { > + Destination8 =3D (UINT8*)DestinationBuffer + Length; > + Source8 =3D (CONST UINT8*)SourceBuffer + Length; > + while (Length-- !=3D 0) { > + *(--Destination8) =3D *(--Source8); > + } > } > } > return DestinationBuffer; > diff --git a/MdePkg/Library/BaseMemoryLib/SetMem.c > b/MdePkg/Library/BaseMemoryLib/SetMem.c > index 5e74085c56f0..932c61e959f3 100644 > --- a/MdePkg/Library/BaseMemoryLib/SetMem.c > +++ b/MdePkg/Library/BaseMemoryLib/SetMem.c > @@ -5,6 +5,9 @@ > is desired. >=20 > Copyright (c) 2006 - 2010, Intel Corporation. All rights reserved.
> + Copyright (c) 2012 - 2013, ARM Ltd. All rights reserved.
> + Copyright (c) 2016, Linaro Ltd. All rights reserved.
> + > This program and the accompanying materials > are licensed and made available under the terms and conditions of the = BSD > License > which accompanies this distribution. The full text of the license may= be > found at > @@ -43,11 +46,42 @@ InternalMemSetMem ( > // volatile to prevent the optimizer from replacing this function with > // the intrinsic memset() > // > - volatile UINT8 *Pointer; > + volatile UINT8 *Pointer8; > + volatile UINT32 *Pointer32; > + volatile UINT64 *Pointer64; > + UINT32 Value32; > + UINT64 Value64; > + > + if ((((UINTN)Buffer & 0x7) =3D=3D 0) && (Length >=3D 8)) { > + // Generate the 64bit value > + Value32 =3D (Value << 24) | (Value << 16) | (Value << 8) | Value; > + Value64 =3D (((UINT64)Value32) << 32) | Value32; > + > + Pointer64 =3D (UINT64*)Buffer; > + while (Length >=3D 8) { > + *(Pointer64++) =3D Value64; > + Length -=3D 8; > + } >=20 > - Pointer =3D (UINT8*)Buffer; > + // Finish with bytes if needed > + Pointer8 =3D (UINT8*)Pointer64; > + } else if ((((UINTN)Buffer & 0x3) =3D=3D 0) && (Length >=3D 4)) { > + // Generate the 32bit value > + Value32 =3D (Value << 24) | (Value << 16) | (Value << 8) | Value; > + > + Pointer32 =3D (UINT32*)Buffer; > + while (Length >=3D 4) { > + *(Pointer32++) =3D Value32; > + Length -=3D 4; > + } > + > + // Finish with bytes if needed > + Pointer8 =3D (UINT8*)Pointer32; > + } else { > + Pointer8 =3D (UINT8*)Buffer; > + } > while (Length-- > 0) { > - *(Pointer++) =3D Value; > + *(Pointer8++) =3D Value; > } > return Buffer; > } > -- > 2.7.4