From: "Feng, Bob C"
To: Leif Lindholm
CC: "edk2-devel@lists.01.org", "Carsey, Jaben", "Gao, Liming"
Subject: Re: [Patch] BaseTools: Optimize string concatenation
Date: Fri, 9 Nov 2018 03:25:04 +0000
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D15FFFBED5@SHSMSX101.ccr.corp.intel.com>
References: <20181108101625.41364-1-bob.c.feng@intel.com> <20181108165238.vhdbx2mc42kefvag@bivouac.eciton.net>
In-Reply-To: <20181108165238.vhdbx2mc42kefvag@bivouac.eciton.net>
List-Id: EDK II Development

Hi Leif,

Yes, I should have shown the data. My unit-test script is below. The parameter lines is a list of 43395 strings. The test results are:

''.join(String list) time: 0.042262
String += String time: 3.822699

import time

def TestPlus(lines):
    # Build the result with repeated string +=
    str_target = ""

    for line in lines:
        str_target += line

    return str_target

def TestJoin(lines):
    # Collect the pieces in a list and join them once at the end
    str_target = []

    for line in lines:
        str_target.append(line)

    return "".join(str_target)

def CompareStrCat():
    # GetStrings() returns the 43395-entry string list (definition not shown)
    lines = GetStrings()
    print (len(lines))

    begin = time.perf_counter()
    for _ in range(10):
        TestJoin(lines)
    end = time.perf_counter() - begin
    print ("''.join(String list) time: %f" % end)

    begin = time.perf_counter()
    for _ in range(10):
        TestPlus(lines)
    end = time.perf_counter() - begin
    print ("String += String time: %f" % end)

For an OvmfX64 build the gain is small: the patch saves 2~3 seconds in the Parse/AutoGen phase, because OvmfX64 is relatively simple. It does not enable many features such as Multiple SKU and structure PCD by default, and no large Autogen.c/Autogen.h/Makefile is generated either. For a complex platform, this patch will be much more effective. The unit test above simulates a real case in which a 43395-line Autogen.c is generated.

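For anyone who wants to repeat the comparison above without the real input, a self-contained sketch might look like the following. It is not the exact script that produced the numbers in this mail: make_lines() is a hypothetical stand-in for GetStrings() and generates synthetic lines, and the other function names are illustrative too. Timings depend heavily on the interpreter and platform, so no particular ratio should be expected.

```python
import timeit


def make_lines(count=43395):
    # Hypothetical stand-in for GetStrings(): synthetic lines,
    # not real Autogen.c content.
    return ["line %d of a generated file\n" % i for i in range(count)]


def concat_plus(lines):
    # Repeated += may have to copy the accumulated string as it grows.
    target = ""
    for line in lines:
        target += line
    return target


def concat_join(lines):
    # Collect the pieces in a list and build the final string once.
    parts = []
    for line in lines:
        parts.append(line)
    return "".join(parts)


def compare(lines, repeat=10):
    # Total seconds for `repeat` runs of each approach: (join, plus).
    join_time = timeit.timeit(lambda: concat_join(lines), number=repeat)
    plus_time = timeit.timeit(lambda: concat_plus(lines), number=repeat)
    return join_time, plus_time


if __name__ == "__main__":
    sample = make_lines()
    join_time, plus_time = compare(sample)
    print("''.join(list) time: %f" % join_time)
    print("str += str time:    %f" % plus_time)
```

Both functions return the same string, so the only difference being measured is how the result is assembled.
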
Since this patch mainly affects the Parse/AutoGen phase, I used "build genmake" to show the improvement.

The final result for a clean build is:
Current code: 17 seconds
After patch: 15 seconds

Details:

Current data:

d:\edk2 (master -> origin)
λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -t VS2015x86
Build environment: Windows-10-10.0.10240
Build start time: 10:12:32, Nov.09 2018

WORKSPACE        = d:\edk2
ECP_SOURCE       = d:\edk2\edkcompatibilitypkg
EDK_SOURCE       = d:\edk2\edkcompatibilitypkg
EFI_SOURCE       = d:\edk2\edkcompatibilitypkg
EDK_TOOLS_PATH   = d:\edk2\basetools
EDK_TOOLS_BIN    = d:\edk2\basetools\bin\win32
CONF_PATH        = d:\edk2\conf

Architecture(s)  = IA32 X64
Build target     = DEBUG
Toolchain        = VS2015x86

Active Platform          = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc
Flash Image Definition   = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf

Processing meta-data ....... done!
Generating code . done!
Generating makefile . done!
Generating code .. done!
Generating makefile ...... done!

- Done -
Build end time: 10:12:49, Nov.09 2018
Build total time: 00:00:17

After applying this patch:

d:\edk2 (master -> origin)
λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -
Build environment: Windows-10-10.0.10240
Build start time: 10:11:41, Nov.09 2018

WORKSPACE        = d:\edk2
ECP_SOURCE       = d:\edk2\edkcompatibilitypkg
EDK_SOURCE       = d:\edk2\edkcompatibilitypkg
EFI_SOURCE       = d:\edk2\edkcompatibilitypkg
EDK_TOOLS_PATH   = d:\edk2\basetools
EDK_TOOLS_BIN    = d:\edk2\basetools\bin\win32
CONF_PATH        = d:\edk2\conf

Architecture(s)  = IA32 X64
Build target     = DEBUG
Toolchain        = VS2015x86

Active Platform          = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc
Flash Image Definition   = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf

Processing meta-data ..... done!
Generating code . done!
Generating makefile . done!
Generating code .. done!
Generating makefile ...... done!

- Done -
Build end time: 10:11:56, Nov.09 2018
Build total time: 00:00:15

Thanks,
Bob

-----Original Message-----
From: Leif Lindholm [mailto:leif.lindholm@linaro.org]
Sent: Friday, November 9, 2018 12:53 AM
To: Feng, Bob C
Cc: edk2-devel@lists.01.org; Carsey, Jaben; Gao, Liming
Subject: Re: [edk2] [Patch] BaseTools: Optimize string concatenation

On Thu, Nov 08, 2018 at 06:16:25PM +0800, BobCF wrote:
> https://bugzilla.tianocore.org/show_bug.cgi?id=1288
>
> This patch is one of build tool performance improvement series
> patches.
>
> This patch is going to use join function instead of string += string2
> statement.
>
> Current code use string += string2 in a loop to combine a string.
> while creating a string list in a loop and using
> "".join(stringlist) after the loop will be much faster.

Do you have any numbers on the level of improvement seen?

Either for the individual scripts when called identically, or (if
measurable) on the build of an entire platform (say OvmfX64?).

Regards,

Leif

> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: BobCF
> Cc: Liming Gao
> Cc: Jaben Carsey
> ---
>  BaseTools/Source/Python/AutoGen/StrGather.py  | 39 +++++++++++++------
>  BaseTools/Source/Python/Common/Misc.py        | 21 +++++-----
>  .../Source/Python/Workspace/InfBuildData.py   |  4 +-
>  .../Python/Workspace/WorkspaceCommon.py       | 11 ++----
>  4 files changed, 44 insertions(+), 31 deletions(-)
>
> diff --git a/BaseTools/Source/Python/AutoGen/StrGather.py b/BaseTools/Source/Python/AutoGen/StrGather.py
> index 361d499076..d34a9e9447 100644
> --- a/BaseTools/Source/Python/AutoGen/StrGather.py
> +++ b/BaseTools/Source/Python/AutoGen/StrGather.py
> @@ -135,11 +135,11 @@ def AscToHexList(Ascii):
>  # @param UniGenCFlag      UniString is generated into AutoGen C file when it is set to True
>  #
>  # @retval Str:           A string of .h file content
>  #
>  def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
> -    Str = ''
> +    Str = []
>      ValueStartPtr = 60
>      Line = COMMENT_DEFINE_STR + ' ' + LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(0, 4) + COMMENT_NOT_REFERENCED
>      Str = WriteLine(Str, Line)
>      Line = COMMENT_DEFINE_STR + ' ' + PRINTABLE_LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + PRINTABLE_LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(1, 4) + COMMENT_NOT_REFERENCED
>      Str = WriteLine(Str, Line)
> @@ -164,16 +164,16 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
>                      Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED
>                  else:
>                      Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' * (ValueStartPtr - len(DEFINE_STR + Name)) + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED
>                  UnusedStr = WriteLine(UnusedStr, Line)
>
> -    Str = ''.join([Str, UnusedStr])
> +    Str.extend( UnusedStr)
>
>      Str = WriteLine(Str, '')
>      if IsCompatibleMode or UniGenCFlag:
>          Str = WriteLine(Str, 'extern unsigned char ' + BaseName + 'Strings[];')
> -    return Str
> +    return "".join(Str)
>
>  ## Create a complete .h file
>  #
>  # Create a complet .h file with file header and file content
>  #
> @@ -185,11 +185,11 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
>  # @retval Str:           A string of complete .h file
>  #
>  def CreateHFile(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
>      HFile = WriteLine('', CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag))
>
> -    return HFile
> +    return "".join(HFile)
>
>  ## Create a buffer to store all items in an array
>  #
>  # @param BinBuffer   Buffer to contain Binary data.
>  # @param Array:      The array need to be formatted
> @@ -209,11 +209,11 @@ def CreateBinBuffer(BinBuffer, Array):
>  #
>  def CreateArrayItem(Array, Width = 16):
>      MaxLength = Width
>      Index = 0
>      Line = ' '
> -    ArrayItem = ''
> +    ArrayItem = []
>
>      for Item in Array:
>          if Index < MaxLength:
>              Line = Line + Item + ', '
>              Index = Index + 1
> @@ -221,11 +221,11 @@ def CreateArrayItem(Array, Width = 16):
>              ArrayItem = WriteLine(ArrayItem, Line)
>              Line = ' ' + Item + ', '
>              Index = 1
>      ArrayItem = Write(ArrayItem, Line.rstrip())
>
> -    return ArrayItem
> +    return "".join(ArrayItem)
>
>  ## CreateCFileStringValue
>  #
>  # Create a line with string value
>  #
> @@ -236,11 +236,11 @@ def CreateArrayItem(Array, Width = 16):
>
>  def CreateCFileStringValue(Value):
>      Value = [StringBlockType] + Value
>      Str = WriteLine('', CreateArrayItem(Value))
>
> -    return Str
> +    return "".join(Str)
>
>  ## GetFilteredLanguage
>  #
>  # apply get best language rules to the UNI language code list
>  #
> @@ -438,11 +438,11 @@ def CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniBinBuffer,
>          #
>          # Join package data
>          #
>          AllStr = Write(AllStr, Str)
>
> -    return AllStr
> +    return "".join(AllStr)
>
>  ## Create end of .c file
>  #
>  # Create end of .c file
>  #
> @@ -465,11 +465,11 @@ def CreateCFileEnd():
>  #
>  def CreateCFile(BaseName, UniObjectClass, IsCompatibleMode, FilterInfo):
>      CFile = ''
>      CFile = WriteLine(CFile, CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, None, FilterInfo))
>      CFile = WriteLine(CFile, CreateCFileEnd())
> -    return CFile
> +    return "".join(CFile)
>
>  ## GetFileList
>  #
>  # Get a list for all files
>  #
> @@ -572,17 +572,34 @@ def GetStringFiles(UniFilList, SourceFileList, IncludeList, IncludePathList, Ski
>
>  #
>  # Write an item
>  #
>  def Write(Target, Item):
> -    return ''.join([Target, Item])
> +    if isinstance(Target,str):
> +        Target = [Target]
> +    if not Target:
> +        Target = []
> +    if isinstance(Item,list):
> +        Target.extend(Item)
> +    else:
> +        Target.append(Item)
> +    return Target
>
>  #
>  # Write an item with a break line
>  #
>  def WriteLine(Target, Item):
> -    return ''.join([Target, Item, '\n'])
> +    if isinstance(Target,str):
> +        Target = [Target]
> +    if not Target:
> +        Target = []
> +    if isinstance(Item, list):
> +        Target.extend(Item)
> +    else:
> +        Target.append(Item)
> +    Target.append('\n')
> +    return Target
>
>  # This acts like the main() function for the script, unless it is 'import'ed into another
>  # script.
>  if __name__ == '__main__':
>      EdkLogger.info('start')
> diff --git a/BaseTools/Source/Python/Common/Misc.py b/BaseTools/Source/Python/Common/Misc.py
> index 80236db160..8dcbe141ae 100644
> --- a/BaseTools/Source/Python/Common/Misc.py
> +++ b/BaseTools/Source/Python/Common/Misc.py
> @@ -777,21 +777,21 @@ class TemplateString(object):
>
>              return "".join(StringList)
>
>      ## Constructor
>      def __init__(self, Template=None):
> -        self.String = ''
> +        self.String = []
>          self.IsBinary = False
>          self._Template = Template
>          self._TemplateSectionList = self._Parse(Template)
>
>      ## str() operator
>      #
>      #   @retval     string  The string replaced
>      #
>      def __str__(self):
> -        return self.String
> +        return "".join(self.String)
>
>      ## Split the template string into fragments per the ${BEGIN} and ${END} flags
>      #
>      #   @retval     list    A list of TemplateString.Section objects
>      #
> @@ -835,13 +835,16 @@ class TemplateString(object):
>      #   @param      Dictionary      The placeholder dictionaries
>      #
>      def Append(self, AppendString, Dictionary=None):
>          if Dictionary:
>              SectionList = self._Parse(AppendString)
> -            self.String += "".join(S.Instantiate(Dictionary) for S in SectionList)
> +            self.String.append( "".join(S.Instantiate(Dictionary) for S in SectionList))
>          else:
> -            self.String += AppendString
> +            if isinstance(AppendString,list):
> +                self.String.extend(AppendString)
> +            else:
> +                self.String.append(AppendString)
>
>      ## Replace the string template with dictionary of placeholders
>      #
>      #   @param      Dictionary      The placeholder dictionaries
>      #
> @@ -1741,27 +1744,21 @@ class PathClass(object):
>      #
>      # @retval False The two PathClass are different
>      # @retval True  The two PathClass are the same
>      #
>      def __eq__(self, Other):
> -        if isinstance(Other, type(self)):
> -            return self.Path == Other.Path
> -        else:
> -            return self.Path == str(Other)
> +        return self.Path == str(Other)
>
>      ## Override __cmp__ function
>      #
>      # Customize the comparsion operation of two PathClass
>      #
>      # @retval 0     The two PathClass are different
>      # @retval -1    The first PathClass is less than the second PathClass
>      # @retval 1     The first PathClass is Bigger than the second PathClass
>      def __cmp__(self, Other):
> -        if isinstance(Other, type(self)):
> -            OtherKey = Other.Path
> -        else:
> -            OtherKey = str(Other)
> +        OtherKey = str(Other)
>
>          SelfKey = self.Path
>          if SelfKey == OtherKey:
>              return 0
>          elif SelfKey > OtherKey:
> diff --git a/BaseTools/Source/Python/Workspace/InfBuildData.py b/BaseTools/Source/Python/Workspace/InfBuildData.py
> index 44d44d24eb..d615cccdf7 100644
> --- a/BaseTools/Source/Python/Workspace/InfBuildData.py
> +++ b/BaseTools/Source/Python/Workspace/InfBuildData.py
> @@ -612,11 +612,13 @@ class InfBuildData(ModuleBuildClassObject):
>          for Record in RecordList:
>              Lib = Record[0]
>              Instance = Record[1]
>              if Instance:
>                  Instance = NormPath(Instance, self._Macros)
> -            RetVal[Lib] = Instance
> +                RetVal[Lib] = Instance
> +            else:
> +                RetVal[Lib] = None
>          return RetVal
>
>      ## Retrieve library names (for Edk.x style of modules)
>      @cached_property
>      def Libraries(self):
> diff --git a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> index 8d8a3e2789..55d01fa4b2 100644
> --- a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> +++ b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> @@ -126,17 +126,14 @@ def GetModuleLibInstances(Module, Platform, BuildDatabase, Arch, Target, Toolcha
>      while len(LibraryConsumerList) > 0:
>          M = LibraryConsumerList.pop()
>          for LibraryClassName in M.LibraryClasses:
>              if LibraryClassName not in LibraryInstance:
>                  # override library instance for this module
> -                if LibraryClassName in Platform.Modules[str(Module)].LibraryClasses:
> -                    LibraryPath = Platform.Modules[str(Module)].LibraryClasses[LibraryClassName]
> -                else:
> -                    LibraryPath = Platform.LibraryClasses[LibraryClassName, ModuleType]
> -                if LibraryPath is None or LibraryPath == "":
> -                    LibraryPath = M.LibraryClasses[LibraryClassName]
> -                    if LibraryPath is None or LibraryPath == "":
> +                LibraryPath = Platform.Modules[str(Module)].LibraryClasses.get(LibraryClassName,Platform.LibraryClasses[LibraryClassName, ModuleType])
> +                if LibraryPath is None:
> +                    LibraryPath = M.LibraryClasses.get(LibraryClassName)
> +                    if LibraryPath is None:
>                          if FileName:
>                              EdkLogger.error("build", RESOURCE_NOT_AVAILABLE,
>                                              "Instance of library class [%s] is not found" % LibraryClassName,
>                                              File=FileName,
>                                              ExtraData="in [%s] [%s]\n\tconsumed by module [%s]" % (str(M), Arch, str(Module)))
> --
> 2.19.1.windows.1
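
For readers skimming the quoted diff, the idea behind the reworked Write/WriteLine helpers can be sketched in isolation. This is a simplified illustration under stated assumptions, not the patched BaseTools code: write, write_line and build_header are hypothetical names, and the handling of an empty string accumulator is condensed compared with the patch.

```python
def write(target, item):
    # Accept a plain string as the accumulator for backward compatibility,
    # but always hand back a list of pieces.
    if isinstance(target, str):
        target = [target] if target else []
    if isinstance(item, list):
        target.extend(item)
    else:
        target.append(item)
    return target


def write_line(target, item):
    # Same as write(), followed by a newline piece.
    target = write(target, item)
    target.append("\n")
    return target


def build_header(names):
    # Accumulate pieces in a list and join exactly once at the end.
    parts = []
    for index, name in enumerate(names):
        parts = write_line(parts, "#define %s 0x%04X" % (name, index))
    return "".join(parts)
```

The callers keep the same call shape as before (target in, target out); only the final return site needs a single "".join().
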