From: "Feng, Bob C" <bob.c.feng@intel.com>
To: Leif Lindholm <leif.lindholm@linaro.org>
Cc: "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
"Carsey, Jaben" <jaben.carsey@intel.com>,
"Gao, Liming" <liming.gao@intel.com>
Subject: Re: [Patch] BaseTools: Optimize string concatenation
Date: Sun, 11 Nov 2018 00:41:07 +0000
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D15FFFCFF1@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <20181109114840.txogb6tz5u23o4ng@bivouac.eciton.net>
Hi Leif,
I ran the benchmark on my desktop:
CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Memory: 16G
OS: Win10
I'll add the detailed performance data to the BZ.
-Bob
-----Original Message-----
From: Leif Lindholm [mailto:leif.lindholm@linaro.org]
Sent: Friday, November 9, 2018 7:49 PM
To: Feng, Bob C <bob.c.feng@intel.com>
Cc: edk2-devel@lists.01.org; Carsey, Jaben <jaben.carsey@intel.com>; Gao, Liming <liming.gao@intel.com>
Subject: Re: [edk2] [Patch] BaseTools: Optimize string concatenation
Hi Bob,
On Fri, Nov 09, 2018 at 03:25:04AM +0000, Feng, Bob C wrote:
> Yes. I should show the data.
>
> My unit test script is below. The parameter lines is a list of 43395
> strings. The test results are:
>
> ''.join(String list) time: 0.042262
> String += String time : 3.822699
>
> import time
>
> def TestPlus(lines):
>     # Build the result by repeated += concatenation.
>     str_target = ""
>
>     for line in lines:
>         str_target += line
>
>     return str_target
>
> def TestJoin(lines):
>     # Collect the pieces in a list and join once at the end.
>     str_target = []
>
>     for line in lines:
>         str_target.append(line)
>
>     return "".join(str_target)
>
> def CompareStrCat():
>     # GetStrings() is not shown in this mail; it supplies the string list.
>     lines = GetStrings()
>     print (len(lines))
>
>     begin = time.perf_counter()
>     for _ in range(10):
>         TestJoin(lines)
>     end = time.perf_counter() - begin
>     print ("''.join(String list) time: %f" % end)
>
>     begin = time.perf_counter()
>     for _ in range(10):
>         TestPlus(lines)
>     end = time.perf_counter() - begin
>     print ("String += String time: %f" % end)
>
> For the OvmfX64 build the gain is small: the patch saves 2~3 seconds in
> the Parse/AutoGen phase, because OvmfX64 is relatively simple. It does
> not enable many features such as Multiple SKU and structure PCD by
> default, and no large Autogen.c/Autogen.h/Makefile is generated
> either. For a complex platform, this patch will be much more
> effective. The unit test above simulates a real case in which a
> 43395-line Autogen.c is generated.
I beg to differ. Shaving 2-3 seconds off the autogen phase is a substantial improvement.
However, on my wimpy 24-core Cortex-A53 system, the effect is not noticeable (fluctuates between 1:56-1:58 whether with or without this patch).
And even on my x86 workstation, I see no measurable difference (12-13s).
What is the hardware you are benchmarking on?
It does not appear to be detrimental to performance on any of my platforms, but I would like to see some more measurements, and I would like to see that logged with some more detail in bugzilla.
Regards,
Leif
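The micro-benchmark quoted above depends on GetStrings(), which is not shown in this thread. A minimal sketch that can be run as-is follows, assuming synthetic input: make_lines(), concat_plus(), concat_join() and compare() are hypothetical names introduced here, and the generated lines are only a stand-in for the real Autogen.c content. It uses timeit.repeat and reports the best run, which is less sensitive to background noise than a single timed loop. Results are expected to vary by interpreter and platform, so absolute numbers should not be compared against the figures in this thread.

```python
import timeit

def concat_plus(lines):
    """Build one string with repeated += (the pattern the patch removes)."""
    result = ""
    for line in lines:
        result += line
    return result

def concat_join(lines):
    """Collect the pieces in a list and join once (the pattern the patch adds)."""
    parts = []
    for line in lines:
        parts.append(line)
    return "".join(parts)

def make_lines(count=43395):
    """Synthetic stand-in for GetStrings(); the real input is not in the thread."""
    return ["line %d of a generated AutoGen.c\n" % i for i in range(count)]

def compare(lines, repeat=5, number=10):
    """Return the best-of-`repeat` time in seconds for `number` calls of each approach."""
    timings = {}
    for func in (concat_join, concat_plus):
        runs = timeit.repeat(lambda: func(lines), repeat=repeat, number=number)
        timings[func.__name__] = min(runs)
    return timings

if __name__ == "__main__":
    for name, seconds in compare(make_lines()).items():
        print("%-12s %f s per 10 calls" % (name, seconds))
```

Running the same script on each machine under discussion, and attaching the output together with the Python version and OS, would give the kind of comparable data requested for the Bugzilla entry.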
> Since this patch mostly affects the Parser/AutoGen phase, I just used
> "build genmake" to show the improvement data.
> The final result for clean build is:
> Current code: 17 seconds
> After patch: 15 seconds
>
> Details:
> Current data:
>
> d:\edk2 (master -> origin)
> λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -t VS2015x86
> Build environment: Windows-10-10.0.10240
> Build start time: 10:12:32, Nov.09 2018
>
> WORKSPACE = d:\edk2
> ECP_SOURCE = d:\edk2\edkcompatibilitypkg
> EDK_SOURCE = d:\edk2\edkcompatibilitypkg
> EFI_SOURCE = d:\edk2\edkcompatibilitypkg
> EDK_TOOLS_PATH = d:\edk2\basetools
> EDK_TOOLS_BIN = d:\edk2\basetools\bin\win32
> CONF_PATH = d:\edk2\conf
>
> Architecture(s) = IA32 X64
> Build target = DEBUG
> Toolchain = VS2015x86
>
> Active Platform = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc
> Flash Image Definition = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf
>
> Processing meta-data ....... done!
> Generating code . done!
> Generating makefile . done!
> Generating code .. done!
> Generating makefile ...... done!
>
> - Done -
> Build end time: 10:12:49, Nov.09 2018
> Build total time: 00:00:17
>
> After applying this patch:
>
> d:\edk2 (master -> origin)
> λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -
> Build environment: Windows-10-10.0.10240
> Build start time: 10:11:41, Nov.09 2018
>
> WORKSPACE = d:\edk2
> ECP_SOURCE = d:\edk2\edkcompatibilitypkg
> EDK_SOURCE = d:\edk2\edkcompatibilitypkg
> EFI_SOURCE = d:\edk2\edkcompatibilitypkg
> EDK_TOOLS_PATH = d:\edk2\basetools
> EDK_TOOLS_BIN = d:\edk2\basetools\bin\win32
> CONF_PATH = d:\edk2\conf
>
>
> Architecture(s) = IA32 X64
> Build target = DEBUG
> Toolchain = VS2015x86
>
> Active Platform = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc
> Flash Image Definition = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf
>
> Processing meta-data ..... done!
> Generating code . done!
> Generating makefile . done!
> Generating code .. done!
> Generating makefile ...... done!
>
> - Done -
> Build end time: 10:11:56, Nov.09 2018
> Build total time: 00:00:15
>
>
> Thanks,
> Bob
>
>
> -----Original Message-----
> From: Leif Lindholm [mailto:leif.lindholm@linaro.org]
> Sent: Friday, November 9, 2018 12:53 AM
> To: Feng, Bob C <bob.c.feng@intel.com>
> Cc: edk2-devel@lists.01.org; Carsey, Jaben <jaben.carsey@intel.com>;
> Gao, Liming <liming.gao@intel.com>
> Subject: Re: [edk2] [Patch] BaseTools: Optimize string concatenation
>
> On Thu, Nov 08, 2018 at 06:16:25PM +0800, BobCF wrote:
> > https://bugzilla.tianocore.org/show_bug.cgi?id=1288
> >
> > This patch is one of a series of build tool performance improvement
> > patches.
> >
> > This patch uses the join function instead of the string +=
> > string2 statement.
> >
> > The current code uses string += string2 in a loop to build a string,
> > while creating a string list in the loop and calling
> > "".join(stringlist) after the loop is much faster.
>
> Do you have any numbers on the level of improvement seen?
>
> Either for the individual scripts when called identically, or (if
> measurable) on the build of an entire platform (say OvmfX64?).
>
> Regards,
>
> Leif
>
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: BobCF <bob.c.feng@intel.com>
> > Cc: Liming Gao <liming.gao@intel.com>
> > Cc: Jaben Carsey <jaben.carsey@intel.com>
> > ---
> > BaseTools/Source/Python/AutoGen/StrGather.py | 39 +++++++++++++------
> > BaseTools/Source/Python/Common/Misc.py | 21 +++++-----
> > .../Source/Python/Workspace/InfBuildData.py | 4 +-
> > .../Python/Workspace/WorkspaceCommon.py | 11 ++----
> > 4 files changed, 44 insertions(+), 31 deletions(-)
> >
> > diff --git a/BaseTools/Source/Python/AutoGen/StrGather.py b/BaseTools/Source/Python/AutoGen/StrGather.py
> > index 361d499076..d34a9e9447 100644
> > --- a/BaseTools/Source/Python/AutoGen/StrGather.py
> > +++ b/BaseTools/Source/Python/AutoGen/StrGather.py
> > @@ -135,11 +135,11 @@ def AscToHexList(Ascii):
> > # @param UniGenCFlag UniString is generated into AutoGen C file when it is set to True
> > #
> > # @retval Str: A string of .h file content
> > #
> > def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
> > - Str = ''
> > + Str = []
> > ValueStartPtr = 60
> > Line = COMMENT_DEFINE_STR + ' ' + LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(0, 4) + COMMENT_NOT_REFERENCED
> > Str = WriteLine(Str, Line)
> > Line = COMMENT_DEFINE_STR + ' ' + PRINTABLE_LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + PRINTABLE_LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(1, 4) + COMMENT_NOT_REFERENCED
> > Str = WriteLine(Str, Line)
> > @@ -164,16 +164,16 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
> > Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED
> > else:
> > Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' * (ValueStartPtr - len(DEFINE_STR + Name)) + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED
> > UnusedStr = WriteLine(UnusedStr, Line)
> >
> > - Str = ''.join([Str, UnusedStr])
> > + Str.extend( UnusedStr)
> >
> > Str = WriteLine(Str, '')
> > if IsCompatibleMode or UniGenCFlag:
> > Str = WriteLine(Str, 'extern unsigned char ' + BaseName + 'Strings[];')
> > - return Str
> > + return "".join(Str)
> >
> > ## Create a complete .h file
> > #
> > # Create a complet .h file with file header and file content
> > #
> > @@ -185,11 +185,11 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
> > # @retval Str: A string of complete .h file
> > #
> > def CreateHFile(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
> > HFile = WriteLine('', CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag))
> >
> > - return HFile
> > + return "".join(HFile)
> >
> > ## Create a buffer to store all items in an array
> > #
> > # @param BinBuffer Buffer to contain Binary data.
> > # @param Array: The array need to be formatted
> > @@ -209,11 +209,11 @@ def CreateBinBuffer(BinBuffer, Array):
> > #
> > def CreateArrayItem(Array, Width = 16):
> > MaxLength = Width
> > Index = 0
> > Line = ' '
> > - ArrayItem = ''
> > + ArrayItem = []
> >
> > for Item in Array:
> > if Index < MaxLength:
> > Line = Line + Item + ', '
> > Index = Index + 1
> > @@ -221,11 +221,11 @@ def CreateArrayItem(Array, Width = 16):
> > ArrayItem = WriteLine(ArrayItem, Line)
> > Line = ' ' + Item + ', '
> > Index = 1
> > ArrayItem = Write(ArrayItem, Line.rstrip())
> >
> > - return ArrayItem
> > + return "".join(ArrayItem)
> >
> > ## CreateCFileStringValue
> > #
> > # Create a line with string value
> > #
> > @@ -236,11 +236,11 @@ def CreateArrayItem(Array, Width = 16):
> >
> > def CreateCFileStringValue(Value):
> > Value = [StringBlockType] + Value
> > Str = WriteLine('', CreateArrayItem(Value))
> >
> > - return Str
> > + return "".join(Str)
> >
> > ## GetFilteredLanguage
> > #
> > # apply get best language rules to the UNI language code list
> > #
> > @@ -438,11 +438,11 @@ def CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniBinBuffer,
> > #
> > # Join package data
> > #
> > AllStr = Write(AllStr, Str)
> >
> > - return AllStr
> > + return "".join(AllStr)
> >
> > ## Create end of .c file
> > #
> > # Create end of .c file
> > #
> > @@ -465,11 +465,11 @@ def CreateCFileEnd():
> > #
> > def CreateCFile(BaseName, UniObjectClass, IsCompatibleMode, FilterInfo):
> > CFile = ''
> > CFile = WriteLine(CFile, CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, None, FilterInfo))
> > CFile = WriteLine(CFile, CreateCFileEnd())
> > - return CFile
> > + return "".join(CFile)
> >
> > ## GetFileList
> > #
> > # Get a list for all files
> > #
> > @@ -572,17 +572,34 @@ def GetStringFiles(UniFilList, SourceFileList, IncludeList, IncludePathList, Ski
> >
> > #
> > # Write an item
> > #
> > def Write(Target, Item):
> > - return ''.join([Target, Item])
> > + if isinstance(Target,str):
> > + Target = [Target]
> > + if not Target:
> > + Target = []
> > + if isinstance(Item,list):
> > + Target.extend(Item)
> > + else:
> > + Target.append(Item)
> > + return Target
> >
> > #
> > # Write an item with a break line
> > #
> > def WriteLine(Target, Item):
> > - return ''.join([Target, Item, '\n'])
> > + if isinstance(Target,str):
> > + Target = [Target]
> > + if not Target:
> > + Target = []
> > + if isinstance(Item, list):
> > + Target.extend(Item)
> > + else:
> > + Target.append(Item)
> > + Target.append('\n')
> > + return Target
> >
> > # This acts like the main() function for the script, unless it is 'import'ed into another
> > # script.
> > if __name__ == '__main__':
> > EdkLogger.info('start')
> > diff --git a/BaseTools/Source/Python/Common/Misc.py b/BaseTools/Source/Python/Common/Misc.py
> > index 80236db160..8dcbe141ae 100644
> > --- a/BaseTools/Source/Python/Common/Misc.py
> > +++ b/BaseTools/Source/Python/Common/Misc.py
> > @@ -777,21 +777,21 @@ class TemplateString(object):
> >
> > return "".join(StringList)
> >
> > ## Constructor
> > def __init__(self, Template=None):
> > - self.String = ''
> > + self.String = []
> > self.IsBinary = False
> > self._Template = Template
> > self._TemplateSectionList = self._Parse(Template)
> >
> > ## str() operator
> > #
> > # @retval string The string replaced
> > #
> > def __str__(self):
> > - return self.String
> > + return "".join(self.String)
> >
> > ## Split the template string into fragments per the ${BEGIN} and ${END} flags
> > #
> > # @retval list A list of TemplateString.Section objects
> > #
> > @@ -835,13 +835,16 @@ class TemplateString(object):
> > # @param Dictionary The placeholder dictionaries
> > #
> > def Append(self, AppendString, Dictionary=None):
> > if Dictionary:
> > SectionList = self._Parse(AppendString)
> > - self.String += "".join(S.Instantiate(Dictionary) for S in SectionList)
> > + self.String.append( "".join(S.Instantiate(Dictionary)
> > + for S in SectionList))
> > else:
> > - self.String += AppendString
> > + if isinstance(AppendString,list):
> > + self.String.extend(AppendString)
> > + else:
> > + self.String.append(AppendString)
> >
> > ## Replace the string template with dictionary of placeholders
> > #
> > # @param Dictionary The placeholder dictionaries
> > #
> > @@ -1741,27 +1744,21 @@ class PathClass(object):
> > #
> > # @retval False The two PathClass are different
> > # @retval True The two PathClass are the same
> > #
> > def __eq__(self, Other):
> > - if isinstance(Other, type(self)):
> > - return self.Path == Other.Path
> > - else:
> > - return self.Path == str(Other)
> > + return self.Path == str(Other)
> >
> > ## Override __cmp__ function
> > #
> > # Customize the comparsion operation of two PathClass
> > #
> > # @retval 0 The two PathClass are different
> > # @retval -1 The first PathClass is less than the second PathClass
> > # @retval 1 The first PathClass is Bigger than the second PathClass
> > def __cmp__(self, Other):
> > - if isinstance(Other, type(self)):
> > - OtherKey = Other.Path
> > - else:
> > - OtherKey = str(Other)
> > + OtherKey = str(Other)
> >
> > SelfKey = self.Path
> > if SelfKey == OtherKey:
> > return 0
> > elif SelfKey > OtherKey:
> > diff --git a/BaseTools/Source/Python/Workspace/InfBuildData.py b/BaseTools/Source/Python/Workspace/InfBuildData.py
> > index 44d44d24eb..d615cccdf7 100644
> > --- a/BaseTools/Source/Python/Workspace/InfBuildData.py
> > +++ b/BaseTools/Source/Python/Workspace/InfBuildData.py
> > @@ -612,11 +612,13 @@ class InfBuildData(ModuleBuildClassObject):
> > for Record in RecordList:
> > Lib = Record[0]
> > Instance = Record[1]
> > if Instance:
> > Instance = NormPath(Instance, self._Macros)
> > - RetVal[Lib] = Instance
> > + RetVal[Lib] = Instance
> > + else:
> > + RetVal[Lib] = None
> > return RetVal
> >
> > ## Retrieve library names (for Edk.x style of modules)
> > @cached_property
> > def Libraries(self):
> > diff --git a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> > index 8d8a3e2789..55d01fa4b2 100644
> > --- a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> > +++ b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> > @@ -126,17 +126,14 @@ def GetModuleLibInstances(Module, Platform, BuildDatabase, Arch, Target, Toolcha
> > while len(LibraryConsumerList) > 0:
> > M = LibraryConsumerList.pop()
> > for LibraryClassName in M.LibraryClasses:
> > if LibraryClassName not in LibraryInstance:
> > # override library instance for this module
> > - if LibraryClassName in Platform.Modules[str(Module)].LibraryClasses:
> > - LibraryPath = Platform.Modules[str(Module)].LibraryClasses[LibraryClassName]
> > - else:
> > - LibraryPath = Platform.LibraryClasses[LibraryClassName, ModuleType]
> > - if LibraryPath is None or LibraryPath == "":
> > - LibraryPath = M.LibraryClasses[LibraryClassName]
> > - if LibraryPath is None or LibraryPath == "":
> > + LibraryPath = Platform.Modules[str(Module)].LibraryClasses.get(LibraryClassName,Platform.LibraryClasses[LibraryClassName, ModuleType])
> > + if LibraryPath is None:
> > + LibraryPath = M.LibraryClasses.get(LibraryClassName)
> > + if LibraryPath is None:
> > if FileName:
> > EdkLogger.error("build", RESOURCE_NOT_AVAILABLE,
> > "Instance of library class [%s] is not found" % LibraryClassName,
> > File=FileName,
> > ExtraData="in [%s] [%s]\n\tconsumed by module [%s]" % (str(M), Arch, str(Module)))
> > --
> > 2.19.1.windows.1
> >
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel
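The list-accumulating Write/WriteLine helpers from the StrGather.py hunk of the quoted patch can be tried in isolation. The following is a minimal standalone sketch rather than the patch itself: the body of Write is taken from the diff, while WriteLine is written here in terms of Write to avoid repeating it (an assumption made for brevity; the patch duplicates the body). The sample #define lines are made up for illustration.

```python
def Write(Target, Item):
    """Append Item (a string or a list of strings) to Target and return the list."""
    if isinstance(Target, str):
        # A plain string target is promoted to a one-element list.
        Target = [Target]
    if not Target:
        Target = []
    if isinstance(Item, list):
        Target.extend(Item)
    else:
        Target.append(Item)
    return Target

def WriteLine(Target, Item):
    """Same as Write, followed by a line break."""
    Target = Write(Target, Item)
    Target.append('\n')
    return Target

if __name__ == "__main__":
    Str = []
    Str = WriteLine(Str, '#define A 0x0000')
    Str = Write(Str, ['#define B', ' 0x0001'])
    Str = WriteLine(Str, '')
    # The pieces are joined exactly once, when the final text is needed.
    print("".join(Str), end='')
```

Because an empty or string Target is replaced by a new list, callers should always use the return value, as the patch does with Str = WriteLine(Str, Line), instead of relying on in-place mutation.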
Thread overview: 18+ messages
2018-11-08 10:16 [Patch] BaseTools: Optimize string concatenation BobCF
2018-11-08 15:26 ` Carsey, Jaben
2018-11-08 16:40 ` Kinney, Michael D
2018-11-09 14:16 ` Gao, Liming
2018-11-09 17:01 ` Kinney, Michael D
2018-11-08 16:52 ` Leif Lindholm
2018-11-09 3:25 ` Feng, Bob C
2018-11-09 11:48 ` Leif Lindholm
2018-11-11 0:41 ` Feng, Bob C [this message]
2018-12-05 2:51 ` Feng, Bob C
2018-12-10 10:47 ` Leif Lindholm
2018-12-10 12:09 ` Feng, Bob C
2018-12-10 12:35 ` Leif Lindholm
2018-12-11 8:48 ` Feng, Bob C
2018-12-12 18:37 ` Leif Lindholm
2018-12-13 1:50 ` Gao, Liming
2018-12-13 10:08 ` Leif Lindholm
2018-12-13 14:01 ` Gao, Liming