From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=2a00:1450:4864:20::342; helo=mail-wm1-x342.google.com; envelope-from=leif.lindholm@linaro.org; receiver=edk2-devel@lists.01.org Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 7C4AF21A07A92 for ; Fri, 9 Nov 2018 03:48:45 -0800 (PST) Received: by mail-wm1-x342.google.com with SMTP id v70-v6so1238226wmd.1 for ; Fri, 09 Nov 2018 03:48:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=qF/3IT9NXZhTrmhP65TjFbQyOXkguTEY2bd2KY7q51M=; b=I2IZpZpv2Fwwir2cxI2KhX2tfjZHME60bu4ZxGU2UpVbtJ2jXya6JVj39oEN4Q0WyN 0CseV7vtbyTNxz7leIN5EBLr6g3KIsk3pTA8N/TnnE8TmqjXUf1EGcKMvlt6DClrGzb1 SqrN6NJXcS0t4e3iSIN/l6gSWHlaBFA/YVOTU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=qF/3IT9NXZhTrmhP65TjFbQyOXkguTEY2bd2KY7q51M=; b=UcmlSHT/a+bNEHBFyIDkhZgInVMQE0Sv80zfAePrvwZE5SswOB9l7B1j8qA2q++az2 w3SxbYeYB4Ke7ataqAEGHi16mpaxkLM6Y4EQxuCmRDjKZsnMN0zC67wNK6HLSXIoDWj+ D41hhReRgFg6PoVH6i+fAM3ySPRiFLY7GxWM5P6AeMRqJlGZQr3fAiPFsAduyneO3GZx UVLwVBQG2TZygt23BWbzLpen8UA3D0nyNC4OZSWAelSvc1MyvkOf22WreTvUyltUyZP9 P4g+A3gwZ0Nxnz/1oGVxBZJo9KQRfr8hMMwhoRI64zliubSS6e8QTVPb21aIhCu7dxut yHNw== X-Gm-Message-State: AGRZ1gIll7QmpLaWfnUEG3DcpFfcupFddCWlGt052h8hzOUpaHWH59ZA gwEgOW+obZ/jMMrVPYzH3s0qudLIVnc= X-Google-Smtp-Source: AJdET5cZWrmd0d/YKJqie5Rrp2vnIItfc0HkjAg5Z1azbQep3SSZp67KdOFdbxC2c2Ul96Jgz8blbw== X-Received: by 2002:a1c:1792:: with SMTP id 140-v6mr4381520wmx.117.1541764123489; Fri, 09 Nov 2018 03:48:43 -0800 (PST) Received: from bivouac.eciton.net (bivouac.eciton.net. [2a00:1098:0:86:1000:23:0:2]) by smtp.gmail.com with ESMTPSA id k3-v6sm9604914wro.9.2018.11.09.03.48.42 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 09 Nov 2018 03:48:42 -0800 (PST) Date: Fri, 9 Nov 2018 11:48:40 +0000 From: Leif Lindholm To: "Feng, Bob C" Cc: "edk2-devel@lists.01.org" , "Carsey, Jaben" , "Gao, Liming" Message-ID: <20181109114840.txogb6tz5u23o4ng@bivouac.eciton.net> References: <20181108101625.41364-1-bob.c.feng@intel.com> <20181108165238.vhdbx2mc42kefvag@bivouac.eciton.net> <08650203BA1BD64D8AD9B6D5D74A85D15FFFBED5@SHSMSX101.ccr.corp.intel.com> MIME-Version: 1.0 In-Reply-To: <08650203BA1BD64D8AD9B6D5D74A85D15FFFBED5@SHSMSX101.ccr.corp.intel.com> User-Agent: NeoMutt/20170113 (1.7.2) Subject: Re: [Patch] BaseTools: Optimize string concatenation X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Nov 2018 11:48:46 -0000 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit Hi Bob, On Fri, Nov 09, 2018 at 03:25:04AM +0000, Feng, Bob C wrote: > Yes. I should show the data. > > My unites scripts is as below. The parameter lines is a string list which size is 43395. The test result is > > ''.join(String list) time: 0.042262 > String += String time : 3.822699 > > def TestPlus(lines): > str_target = "" > > for line in lines: > str_target += line > > return str_target > > def TestJoin(lines): > str_target = [] > > for line in lines: > str_target.append(line) > > return "".join(str_target) > > def CompareStrCat(): > lines = GetStrings() > print (len(lines)) > > begin = time.perf_counter() > for _ in range(10): > TestJoin(lines) > end = time.perf_counter() - begin > print ("''.join(String list) time: %f" % end) > > begin = time.perf_counter() > for _ in range(10): > TestPlus(lines) > end = time.perf_counter() - begin > print ("String += String time: %f" % end) > > For build OvmfX64, it's not very effective, it saves 2~3 second in > Parse/AutoGen phase, because OvmfX64 is relatively simple. It does > not enable much features such as Multiple SKU and structure PCD by > default and there is no big size Autogen.c/Autogen.h/Makefile > generated either. but for the complex platform, this patch will be > much effective. The unites above simulates a real case that there is > a 43395 lines of Autogen.c generated. I beg to differ. Shaving 2-3 seconds off the autogen phase is a substantial improvement. However, on my wimpy 24-core Cortex-A53 system, the effect is not noticeable (fluctuates between 1:56-1:58 whether with or without this patch). And even on my x86 workstation, I see no measurable difference (12-13s). What is the hardware you are benchmarking on? It does not appear to be detrimental to performance on any of my platforms, but I would like to see some more measurements, and I would like to see that logged with some more detail in bugzilla. Regards, Leif > Since this patch mostly effect the Parser/AutoGen phase, I just use > "build genmake" to show the improvement data. > The final result for clean build is: > Current code: 17 seconds > After patch: 15 seconds > > Details: > Current data: > > d:\edk2 (master -> origin) > λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -t VS2015x86 > Build environment: Windows-10-10.0.10240 > Build start time: 10:12:32, Nov.09 2018 > > WORKSPACE = d:\edk2 > ECP_SOURCE = d:\edk2\edkcompatibilitypkg > EDK_SOURCE = d:\edk2\edkcompatibilitypkg > EFI_SOURCE = d:\edk2\edkcompatibilitypkg > EDK_TOOLS_PATH = d:\edk2\basetools > EDK_TOOLS_BIN = d:\edk2\basetools\bin\win32 > CONF_PATH = d:\edk2\conf > > Architecture(s) = IA32 X64 > Build target = DEBUG > Toolchain = VS2015x86 > > Active Platform = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc > Flash Image Definition = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf > > Processing meta-data ....... done! > Generating code . done! > Generating makefile . done! > Generating code .. done! > Generating makefile ...... done! > > - Done - > Build end time: 10:12:49, Nov.09 2018 > Build total time: 00:00:17 > > After applying this patch: > > d:\edk2 (master -> origin) > λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 - > Build environment: Windows-10-10.0.10240 > Build start time: 10:11:41, Nov.09 2018 > > WORKSPACE = d:\edk2 > ECP_SOURCE = d:\edk2\edkcompatibilitypkg > EDK_SOURCE = d:\edk2\edkcompatibilitypkg > EFI_SOURCE = d:\edk2\edkcompatibilitypkg > EDK_TOOLS_PATH = d:\edk2\basetools > EDK_TOOLS_BIN = d:\edk2\basetools\bin\win32 > CONF_PATH = d:\edk2\conf > > > Architecture(s) = IA32 X64 > Build target = DEBUG > Toolchain = VS2015x86 > > Active Platform = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc > Flash Image Definition = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf > > Processing meta-data ..... done! > Generating code . done! > Generating makefile . done! > Generating code .. done! > Generating makefile ...... done! > > - Done - > Build end time: 10:11:56, Nov.09 2018 > Build total time: 00:00:15 > > > Thanks, > Bob > > > -----Original Message----- > From: Leif Lindholm [mailto:leif.lindholm@linaro.org] > Sent: Friday, November 9, 2018 12:53 AM > To: Feng, Bob C > Cc: edk2-devel@lists.01.org; Carsey, Jaben ; Gao, Liming > Subject: Re: [edk2] [Patch] BaseTools: Optimize string concatenation > > On Thu, Nov 08, 2018 at 06:16:25PM +0800, BobCF wrote: > > https://bugzilla.tianocore.org/show_bug.cgi?id=1288 > > > > This patch is one of build tool performance improvement series > > patches. > > > > This patch is going to use join function instead of string += string2 > > statement. > > > > Current code use string += string2 in a loop to combine a string. > > while creating a string list in a loop and using > > "".join(stringlist) after the loop will be much faster. > > Do you have any numbers on the level of improvement seen? > > Either for the individual scripts when called identically, or (if > measurable) on the build of an entire platform (say OvmfX64?). > > Regards, > > Leif > > > Contributed-under: TianoCore Contribution Agreement 1.1 > > Signed-off-by: BobCF > > Cc: Liming Gao > > Cc: Jaben Carsey > > --- > > BaseTools/Source/Python/AutoGen/StrGather.py | 39 +++++++++++++------ > > BaseTools/Source/Python/Common/Misc.py | 21 +++++----- > > .../Source/Python/Workspace/InfBuildData.py | 4 +- > > .../Python/Workspace/WorkspaceCommon.py | 11 ++---- > > 4 files changed, 44 insertions(+), 31 deletions(-) > > > > diff --git a/BaseTools/Source/Python/AutoGen/StrGather.py > > b/BaseTools/Source/Python/AutoGen/StrGather.py > > index 361d499076..d34a9e9447 100644 > > --- a/BaseTools/Source/Python/AutoGen/StrGather.py > > +++ b/BaseTools/Source/Python/AutoGen/StrGather.py > > @@ -135,11 +135,11 @@ def AscToHexList(Ascii): > > # @param UniGenCFlag UniString is generated into AutoGen C file when it is set to True > > # > > # @retval Str: A string of .h file content > > # > > def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag): > > - Str = '' > > + Str = [] > > ValueStartPtr = 60 > > Line = COMMENT_DEFINE_STR + ' ' + LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(0, 4) + COMMENT_NOT_REFERENCED > > Str = WriteLine(Str, Line) > > Line = COMMENT_DEFINE_STR + ' ' + PRINTABLE_LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + PRINTABLE_LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(1, 4) + COMMENT_NOT_REFERENCED > > Str = WriteLine(Str, Line) > > @@ -164,16 +164,16 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag): > > Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED > > else: > > Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' * (ValueStartPtr - len(DEFINE_STR + Name)) + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED > > UnusedStr = WriteLine(UnusedStr, Line) > > > > - Str = ''.join([Str, UnusedStr]) > > + Str.extend( UnusedStr) > > > > Str = WriteLine(Str, '') > > if IsCompatibleMode or UniGenCFlag: > > Str = WriteLine(Str, 'extern unsigned char ' + BaseName + 'Strings[];') > > - return Str > > + return "".join(Str) > > > > ## Create a complete .h file > > # > > # Create a complet .h file with file header and file content # @@ > > -185,11 +185,11 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag): > > # @retval Str: A string of complete .h file > > # > > def CreateHFile(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag): > > HFile = WriteLine('', CreateHFileContent(BaseName, > > UniObjectClass, IsCompatibleMode, UniGenCFlag)) > > > > - return HFile > > + return "".join(HFile) > > > > ## Create a buffer to store all items in an array # > > # @param BinBuffer Buffer to contain Binary data. > > # @param Array: The array need to be formatted > > @@ -209,11 +209,11 @@ def CreateBinBuffer(BinBuffer, Array): > > # > > def CreateArrayItem(Array, Width = 16): > > MaxLength = Width > > Index = 0 > > Line = ' ' > > - ArrayItem = '' > > + ArrayItem = [] > > > > for Item in Array: > > if Index < MaxLength: > > Line = Line + Item + ', ' > > Index = Index + 1 > > @@ -221,11 +221,11 @@ def CreateArrayItem(Array, Width = 16): > > ArrayItem = WriteLine(ArrayItem, Line) > > Line = ' ' + Item + ', ' > > Index = 1 > > ArrayItem = Write(ArrayItem, Line.rstrip()) > > > > - return ArrayItem > > + return "".join(ArrayItem) > > > > ## CreateCFileStringValue > > # > > # Create a line with string value > > # > > @@ -236,11 +236,11 @@ def CreateArrayItem(Array, Width = 16): > > > > def CreateCFileStringValue(Value): > > Value = [StringBlockType] + Value > > Str = WriteLine('', CreateArrayItem(Value)) > > > > - return Str > > + return "".join(Str) > > > > ## GetFilteredLanguage > > # > > # apply get best language rules to the UNI language code list # @@ > > -438,11 +438,11 @@ def CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniBinBuffer, > > # > > # Join package data > > # > > AllStr = Write(AllStr, Str) > > > > - return AllStr > > + return "".join(AllStr) > > > > ## Create end of .c file > > # > > # Create end of .c file > > # > > @@ -465,11 +465,11 @@ def CreateCFileEnd(): > > # > > def CreateCFile(BaseName, UniObjectClass, IsCompatibleMode, FilterInfo): > > CFile = '' > > CFile = WriteLine(CFile, CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, None, FilterInfo)) > > CFile = WriteLine(CFile, CreateCFileEnd()) > > - return CFile > > + return "".join(CFile) > > > > ## GetFileList > > # > > # Get a list for all files > > # > > @@ -572,17 +572,34 @@ def GetStringFiles(UniFilList, SourceFileList, > > IncludeList, IncludePathList, Ski > > > > # > > # Write an item > > # > > def Write(Target, Item): > > - return ''.join([Target, Item]) > > + if isinstance(Target,str): > > + Target = [Target] > > + if not Target: > > + Target = [] > > + if isinstance(Item,list): > > + Target.extend(Item) > > + else: > > + Target.append(Item) > > + return Target > > > > # > > # Write an item with a break line > > # > > def WriteLine(Target, Item): > > - return ''.join([Target, Item, '\n']) > > + if isinstance(Target,str): > > + Target = [Target] > > + if not Target: > > + Target = [] > > + if isinstance(Item, list): > > + Target.extend(Item) > > + else: > > + Target.append(Item) > > + Target.append('\n') > > + return Target > > > > # This acts like the main() function for the script, unless it is > > 'import'ed into another # script. > > if __name__ == '__main__': > > EdkLogger.info('start') > > diff --git a/BaseTools/Source/Python/Common/Misc.py > > b/BaseTools/Source/Python/Common/Misc.py > > index 80236db160..8dcbe141ae 100644 > > --- a/BaseTools/Source/Python/Common/Misc.py > > +++ b/BaseTools/Source/Python/Common/Misc.py > > @@ -777,21 +777,21 @@ class TemplateString(object): > > > > return "".join(StringList) > > > > ## Constructor > > def __init__(self, Template=None): > > - self.String = '' > > + self.String = [] > > self.IsBinary = False > > self._Template = Template > > self._TemplateSectionList = self._Parse(Template) > > > > ## str() operator > > # > > # @retval string The string replaced > > # > > def __str__(self): > > - return self.String > > + return "".join(self.String) > > > > ## Split the template string into fragments per the ${BEGIN} and ${END} flags > > # > > # @retval list A list of TemplateString.Section objects > > # > > @@ -835,13 +835,16 @@ class TemplateString(object): > > # @param Dictionary The placeholder dictionaries > > # > > def Append(self, AppendString, Dictionary=None): > > if Dictionary: > > SectionList = self._Parse(AppendString) > > - self.String += "".join(S.Instantiate(Dictionary) for S in SectionList) > > + self.String.append( "".join(S.Instantiate(Dictionary) for > > + S in SectionList)) > > else: > > - self.String += AppendString > > + if isinstance(AppendString,list): > > + self.String.extend(AppendString) > > + else: > > + self.String.append(AppendString) > > > > ## Replace the string template with dictionary of placeholders > > # > > # @param Dictionary The placeholder dictionaries > > # > > @@ -1741,27 +1744,21 @@ class PathClass(object): > > # > > # @retval False The two PathClass are different > > # @retval True The two PathClass are the same > > # > > def __eq__(self, Other): > > - if isinstance(Other, type(self)): > > - return self.Path == Other.Path > > - else: > > - return self.Path == str(Other) > > + return self.Path == str(Other) > > > > ## Override __cmp__ function > > # > > # Customize the comparsion operation of two PathClass > > # > > # @retval 0 The two PathClass are different > > # @retval -1 The first PathClass is less than the second PathClass > > # @retval 1 The first PathClass is Bigger than the second PathClass > > def __cmp__(self, Other): > > - if isinstance(Other, type(self)): > > - OtherKey = Other.Path > > - else: > > - OtherKey = str(Other) > > + OtherKey = str(Other) > > > > SelfKey = self.Path > > if SelfKey == OtherKey: > > return 0 > > elif SelfKey > OtherKey: > > diff --git a/BaseTools/Source/Python/Workspace/InfBuildData.py > > b/BaseTools/Source/Python/Workspace/InfBuildData.py > > index 44d44d24eb..d615cccdf7 100644 > > --- a/BaseTools/Source/Python/Workspace/InfBuildData.py > > +++ b/BaseTools/Source/Python/Workspace/InfBuildData.py > > @@ -612,11 +612,13 @@ class InfBuildData(ModuleBuildClassObject): > > for Record in RecordList: > > Lib = Record[0] > > Instance = Record[1] > > if Instance: > > Instance = NormPath(Instance, self._Macros) > > - RetVal[Lib] = Instance > > + RetVal[Lib] = Instance > > + else: > > + RetVal[Lib] = None > > return RetVal > > > > ## Retrieve library names (for Edk.x style of modules) > > @cached_property > > def Libraries(self): > > diff --git a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py > > b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py > > index 8d8a3e2789..55d01fa4b2 100644 > > --- a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py > > +++ b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py > > @@ -126,17 +126,14 @@ def GetModuleLibInstances(Module, Platform, BuildDatabase, Arch, Target, Toolcha > > while len(LibraryConsumerList) > 0: > > M = LibraryConsumerList.pop() > > for LibraryClassName in M.LibraryClasses: > > if LibraryClassName not in LibraryInstance: > > # override library instance for this module > > - if LibraryClassName in Platform.Modules[str(Module)].LibraryClasses: > > - LibraryPath = Platform.Modules[str(Module)].LibraryClasses[LibraryClassName] > > - else: > > - LibraryPath = Platform.LibraryClasses[LibraryClassName, ModuleType] > > - if LibraryPath is None or LibraryPath == "": > > - LibraryPath = M.LibraryClasses[LibraryClassName] > > - if LibraryPath is None or LibraryPath == "": > > + LibraryPath = Platform.Modules[str(Module)].LibraryClasses.get(LibraryClassName,Platform.LibraryClasses[LibraryClassName, ModuleType]) > > + if LibraryPath is None: > > + LibraryPath = M.LibraryClasses.get(LibraryClassName) > > + if LibraryPath is None: > > if FileName: > > EdkLogger.error("build", RESOURCE_NOT_AVAILABLE, > > "Instance of library class [%s] is not found" % LibraryClassName, > > File=FileName, > > ExtraData="in [%s] > > [%s]\n\tconsumed by module [%s]" % (str(M), Arch, str(Module))) > > -- > > 2.19.1.windows.1 > > > > _______________________________________________ > > edk2-devel mailing list > > edk2-devel@lists.01.org > > https://lists.01.org/mailman/listinfo/edk2-devel