public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Feng, Bob C" <bob.c.feng@intel.com>
To: Leif Lindholm <leif.lindholm@linaro.org>
Cc: "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
	"Carsey, Jaben" <jaben.carsey@intel.com>,
	"Gao, Liming" <liming.gao@intel.com>
Subject: Re: [Patch] BaseTools: Optimize string concatenation
Date: Fri, 9 Nov 2018 03:25:04 +0000	[thread overview]
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D15FFFBED5@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <20181108165238.vhdbx2mc42kefvag@bivouac.eciton.net>

Hi Leif,

Yes. I should show the data.

My unites scripts is as below. The parameter lines is a string list which size is 43395. The test result is

''.join(String list) time:  0.042262
String += String time  :  3.822699

def TestPlus(lines):
    str_target = ""
    
    for line in lines:
        str_target += line
        
    return str_target

def TestJoin(lines):
    str_target = []
    
    for line in lines:
        str_target.append(line)
        
    return "".join(str_target)

def CompareStrCat():
    lines = GetStrings()
    print (len(lines))
    
    begin = time.perf_counter()
    for _ in range(10):
        TestJoin(lines)
    end = time.perf_counter() - begin
    print ("''.join(String list) time: %f" % end)
    
    begin = time.perf_counter()
    for _ in range(10):
        TestPlus(lines)
    end = time.perf_counter() - begin
    print ("String += String time: %f" % end)

For build OvmfX64, it's not very effective, it saves 2~3 second in Parse/AutoGen phase, because OvmfX64 is relatively simple. It does not enable much features such as Multiple SKU and structure PCD by default and there is no big size Autogen.c/Autogen.h/Makefile generated either. but for the complex platform, this patch will be much effective. The unites above simulates a real case that there is a 43395 lines of Autogen.c generated.

Since this patch mostly effect the Parser/AutoGen phase, I just use "build genmake" to show the improvement data. 
The final result for clean build is:
Current code:  17 seconds
After patch:      15 seconds

Details:
Current data:

d:\edk2 (master -> origin)
λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -t VS2015x86
Build environment: Windows-10-10.0.10240
Build start time: 10:12:32, Nov.09 2018

WORKSPACE        = d:\edk2
ECP_SOURCE       = d:\edk2\edkcompatibilitypkg
EDK_SOURCE       = d:\edk2\edkcompatibilitypkg
EFI_SOURCE       = d:\edk2\edkcompatibilitypkg
EDK_TOOLS_PATH   = d:\edk2\basetools
EDK_TOOLS_BIN    = d:\edk2\basetools\bin\win32
CONF_PATH        = d:\edk2\conf

Architecture(s)  = IA32 X64
Build target     = DEBUG
Toolchain        = VS2015x86

Active Platform          = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc
Flash Image Definition   = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf

Processing meta-data ....... done!
Generating code . done!
Generating makefile . done!
Generating code .. done!
Generating makefile ...... done!

- Done -
Build end time: 10:12:49, Nov.09 2018
Build total time: 00:00:17

After applying this patch:

d:\edk2 (master -> origin)                                    
λ build genmake -p OvmfPkg\OvmfPkgIa32X64.dsc -a IA32 -a X64 -
Build environment: Windows-10-10.0.10240                      
Build start time: 10:11:41, Nov.09 2018                       
                                                              
WORKSPACE        = d:\edk2                                    
ECP_SOURCE       = d:\edk2\edkcompatibilitypkg                
EDK_SOURCE       = d:\edk2\edkcompatibilitypkg                
EFI_SOURCE       = d:\edk2\edkcompatibilitypkg                
EDK_TOOLS_PATH   = d:\edk2\basetools                          
EDK_TOOLS_BIN    = d:\edk2\basetools\bin\win32                
CONF_PATH        = d:\edk2\conf                               
                                                              
                                                              
Architecture(s)  = IA32 X64                                   
Build target     = DEBUG                                      
Toolchain        = VS2015x86                                  
                                                              
Active Platform          = d:\edk2\OvmfPkg\OvmfPkgIa32X64.dsc 
Flash Image Definition   = d:\edk2\OvmfPkg\OvmfPkgIa32X64.fdf 
                                                              
Processing meta-data ..... done!                              
Generating code . done!                                       
Generating makefile . done!                                   
Generating code .. done!                                      
Generating makefile ...... done!                              
                                                              
- Done -                                                      
Build end time: 10:11:56, Nov.09 2018                         
Build total time: 00:00:15                                    


Thanks,
Bob


-----Original Message-----
From: Leif Lindholm [mailto:leif.lindholm@linaro.org] 
Sent: Friday, November 9, 2018 12:53 AM
To: Feng, Bob C <bob.c.feng@intel.com>
Cc: edk2-devel@lists.01.org; Carsey, Jaben <jaben.carsey@intel.com>; Gao, Liming <liming.gao@intel.com>
Subject: Re: [edk2] [Patch] BaseTools: Optimize string concatenation

On Thu, Nov 08, 2018 at 06:16:25PM +0800, BobCF wrote:
> https://bugzilla.tianocore.org/show_bug.cgi?id=1288
> 
> This patch is one of build tool performance improvement series 
> patches.
> 
> This patch is going to use join function instead of string += string2 
> statement.
> 
> Current code use string += string2 in a loop to combine a string. 
> while creating a string list in a loop and using
> "".join(stringlist) after the loop will be much faster.

Do you have any numbers on the level of improvement seen?

Either for the individual scripts when called identically, or (if
measurable) on the build of an entire platform (say OvmfX64?).

Regards,

Leif

> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: BobCF <bob.c.feng@intel.com>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Jaben Carsey <jaben.carsey@intel.com>
> ---
>  BaseTools/Source/Python/AutoGen/StrGather.py  | 39 +++++++++++++------
>  BaseTools/Source/Python/Common/Misc.py        | 21 +++++-----
>  .../Source/Python/Workspace/InfBuildData.py   |  4 +-
>  .../Python/Workspace/WorkspaceCommon.py       | 11 ++----
>  4 files changed, 44 insertions(+), 31 deletions(-)
> 
> diff --git a/BaseTools/Source/Python/AutoGen/StrGather.py 
> b/BaseTools/Source/Python/AutoGen/StrGather.py
> index 361d499076..d34a9e9447 100644
> --- a/BaseTools/Source/Python/AutoGen/StrGather.py
> +++ b/BaseTools/Source/Python/AutoGen/StrGather.py
> @@ -135,11 +135,11 @@ def AscToHexList(Ascii):
>  # @param UniGenCFlag      UniString is generated into AutoGen C file when it is set to True
>  #
>  # @retval Str:           A string of .h file content
>  #
>  def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
> -    Str = ''
> +    Str = []
>      ValueStartPtr = 60
>      Line = COMMENT_DEFINE_STR + ' ' + LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(0, 4) + COMMENT_NOT_REFERENCED
>      Str = WriteLine(Str, Line)
>      Line = COMMENT_DEFINE_STR + ' ' + PRINTABLE_LANGUAGE_NAME_STRING_NAME + ' ' * (ValueStartPtr - len(DEFINE_STR + PRINTABLE_LANGUAGE_NAME_STRING_NAME)) + DecToHexStr(1, 4) + COMMENT_NOT_REFERENCED
>      Str = WriteLine(Str, Line)
> @@ -164,16 +164,16 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
>                      Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED
>                  else:
>                      Line = COMMENT_DEFINE_STR + ' ' + Name + ' ' * (ValueStartPtr - len(DEFINE_STR + Name)) + DecToHexStr(Token, 4) + COMMENT_NOT_REFERENCED
>                  UnusedStr = WriteLine(UnusedStr, Line)
>  
> -    Str = ''.join([Str, UnusedStr])
> +    Str.extend( UnusedStr)
>  
>      Str = WriteLine(Str, '')
>      if IsCompatibleMode or UniGenCFlag:
>          Str = WriteLine(Str, 'extern unsigned char ' + BaseName + 'Strings[];')
> -    return Str
> +    return "".join(Str)
>  
>  ## Create a complete .h file
>  #
>  # Create a complet .h file with file header and file content  # @@ 
> -185,11 +185,11 @@ def CreateHFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
>  # @retval Str:           A string of complete .h file
>  #
>  def CreateHFile(BaseName, UniObjectClass, IsCompatibleMode, UniGenCFlag):
>      HFile = WriteLine('', CreateHFileContent(BaseName, 
> UniObjectClass, IsCompatibleMode, UniGenCFlag))
>  
> -    return HFile
> +    return "".join(HFile)
>  
>  ## Create a buffer to store all items in an array  #
>  # @param BinBuffer   Buffer to contain Binary data.
>  # @param Array:      The array need to be formatted
> @@ -209,11 +209,11 @@ def CreateBinBuffer(BinBuffer, Array):
>  #
>  def CreateArrayItem(Array, Width = 16):
>      MaxLength = Width
>      Index = 0
>      Line = '  '
> -    ArrayItem = ''
> +    ArrayItem = []
>  
>      for Item in Array:
>          if Index < MaxLength:
>              Line = Line + Item + ',  '
>              Index = Index + 1
> @@ -221,11 +221,11 @@ def CreateArrayItem(Array, Width = 16):
>              ArrayItem = WriteLine(ArrayItem, Line)
>              Line = '  ' + Item + ',  '
>              Index = 1
>      ArrayItem = Write(ArrayItem, Line.rstrip())
>  
> -    return ArrayItem
> +    return "".join(ArrayItem)
>  
>  ## CreateCFileStringValue
>  #
>  # Create a line with string value
>  #
> @@ -236,11 +236,11 @@ def CreateArrayItem(Array, Width = 16):
>  
>  def CreateCFileStringValue(Value):
>      Value = [StringBlockType] + Value
>      Str = WriteLine('', CreateArrayItem(Value))
>  
> -    return Str
> +    return "".join(Str)
>  
>  ## GetFilteredLanguage
>  #
>  # apply get best language rules to the UNI language code list  # @@ 
> -438,11 +438,11 @@ def CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, UniBinBuffer,
>      #
>      # Join package data
>      #
>      AllStr = Write(AllStr, Str)
>  
> -    return AllStr
> +    return "".join(AllStr)
>  
>  ## Create end of .c file
>  #
>  # Create end of .c file
>  #
> @@ -465,11 +465,11 @@ def CreateCFileEnd():
>  #
>  def CreateCFile(BaseName, UniObjectClass, IsCompatibleMode, FilterInfo):
>      CFile = ''
>      CFile = WriteLine(CFile, CreateCFileContent(BaseName, UniObjectClass, IsCompatibleMode, None, FilterInfo))
>      CFile = WriteLine(CFile, CreateCFileEnd())
> -    return CFile
> +    return "".join(CFile)
>  
>  ## GetFileList
>  #
>  # Get a list for all files
>  #
> @@ -572,17 +572,34 @@ def GetStringFiles(UniFilList, SourceFileList, 
> IncludeList, IncludePathList, Ski
>  
>  #
>  # Write an item
>  #
>  def Write(Target, Item):
> -    return ''.join([Target, Item])
> +    if isinstance(Target,str):
> +        Target = [Target]
> +    if not Target:
> +        Target = []
> +    if isinstance(Item,list):
> +        Target.extend(Item)
> +    else:
> +        Target.append(Item)
> +    return Target
>  
>  #
>  # Write an item with a break line
>  #
>  def WriteLine(Target, Item):
> -    return ''.join([Target, Item, '\n'])
> +    if isinstance(Target,str):
> +        Target = [Target]
> +    if not Target:
> +        Target = []
> +    if isinstance(Item, list):
> +        Target.extend(Item)
> +    else:
> +        Target.append(Item)
> +    Target.append('\n')
> +    return Target
>  
>  # This acts like the main() function for the script, unless it is 
> 'import'ed into another  # script.
>  if __name__ == '__main__':
>      EdkLogger.info('start')
> diff --git a/BaseTools/Source/Python/Common/Misc.py 
> b/BaseTools/Source/Python/Common/Misc.py
> index 80236db160..8dcbe141ae 100644
> --- a/BaseTools/Source/Python/Common/Misc.py
> +++ b/BaseTools/Source/Python/Common/Misc.py
> @@ -777,21 +777,21 @@ class TemplateString(object):
>  
>              return "".join(StringList)
>  
>      ## Constructor
>      def __init__(self, Template=None):
> -        self.String = ''
> +        self.String = []
>          self.IsBinary = False
>          self._Template = Template
>          self._TemplateSectionList = self._Parse(Template)
>  
>      ## str() operator
>      #
>      #   @retval     string  The string replaced
>      #
>      def __str__(self):
> -        return self.String
> +        return "".join(self.String)
>  
>      ## Split the template string into fragments per the ${BEGIN} and ${END} flags
>      #
>      #   @retval     list    A list of TemplateString.Section objects
>      #
> @@ -835,13 +835,16 @@ class TemplateString(object):
>      #   @param      Dictionary      The placeholder dictionaries
>      #
>      def Append(self, AppendString, Dictionary=None):
>          if Dictionary:
>              SectionList = self._Parse(AppendString)
> -            self.String += "".join(S.Instantiate(Dictionary) for S in SectionList)
> +            self.String.append( "".join(S.Instantiate(Dictionary) for 
> + S in SectionList))
>          else:
> -            self.String += AppendString
> +            if isinstance(AppendString,list):
> +                self.String.extend(AppendString)
> +            else:
> +                self.String.append(AppendString)
>  
>      ## Replace the string template with dictionary of placeholders
>      #
>      #   @param      Dictionary      The placeholder dictionaries
>      #
> @@ -1741,27 +1744,21 @@ class PathClass(object):
>      #
>      # @retval False The two PathClass are different
>      # @retval True  The two PathClass are the same
>      #
>      def __eq__(self, Other):
> -        if isinstance(Other, type(self)):
> -            return self.Path == Other.Path
> -        else:
> -            return self.Path == str(Other)
> +        return self.Path == str(Other)
>  
>      ## Override __cmp__ function
>      #
>      # Customize the comparsion operation of two PathClass
>      #
>      # @retval 0     The two PathClass are different
>      # @retval -1    The first PathClass is less than the second PathClass
>      # @retval 1     The first PathClass is Bigger than the second PathClass
>      def __cmp__(self, Other):
> -        if isinstance(Other, type(self)):
> -            OtherKey = Other.Path
> -        else:
> -            OtherKey = str(Other)
> +        OtherKey = str(Other)
>  
>          SelfKey = self.Path
>          if SelfKey == OtherKey:
>              return 0
>          elif SelfKey > OtherKey:
> diff --git a/BaseTools/Source/Python/Workspace/InfBuildData.py 
> b/BaseTools/Source/Python/Workspace/InfBuildData.py
> index 44d44d24eb..d615cccdf7 100644
> --- a/BaseTools/Source/Python/Workspace/InfBuildData.py
> +++ b/BaseTools/Source/Python/Workspace/InfBuildData.py
> @@ -612,11 +612,13 @@ class InfBuildData(ModuleBuildClassObject):
>          for Record in RecordList:
>              Lib = Record[0]
>              Instance = Record[1]
>              if Instance:
>                  Instance = NormPath(Instance, self._Macros)
> -            RetVal[Lib] = Instance
> +                RetVal[Lib] = Instance
> +            else:
> +                RetVal[Lib] = None
>          return RetVal
>  
>      ## Retrieve library names (for Edk.x style of modules)
>      @cached_property
>      def Libraries(self):
> diff --git a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py 
> b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> index 8d8a3e2789..55d01fa4b2 100644
> --- a/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> +++ b/BaseTools/Source/Python/Workspace/WorkspaceCommon.py
> @@ -126,17 +126,14 @@ def GetModuleLibInstances(Module, Platform, BuildDatabase, Arch, Target, Toolcha
>      while len(LibraryConsumerList) > 0:
>          M = LibraryConsumerList.pop()
>          for LibraryClassName in M.LibraryClasses:
>              if LibraryClassName not in LibraryInstance:
>                  # override library instance for this module
> -                if LibraryClassName in Platform.Modules[str(Module)].LibraryClasses:
> -                    LibraryPath = Platform.Modules[str(Module)].LibraryClasses[LibraryClassName]
> -                else:
> -                    LibraryPath = Platform.LibraryClasses[LibraryClassName, ModuleType]
> -                if LibraryPath is None or LibraryPath == "":
> -                    LibraryPath = M.LibraryClasses[LibraryClassName]
> -                    if LibraryPath is None or LibraryPath == "":
> +                LibraryPath = Platform.Modules[str(Module)].LibraryClasses.get(LibraryClassName,Platform.LibraryClasses[LibraryClassName, ModuleType])
> +                if LibraryPath is None:
> +                    LibraryPath = M.LibraryClasses.get(LibraryClassName)
> +                    if LibraryPath is None:
>                          if FileName:
>                              EdkLogger.error("build", RESOURCE_NOT_AVAILABLE,
>                                              "Instance of library class [%s] is not found" % LibraryClassName,
>                                              File=FileName,
>                                              ExtraData="in [%s] 
> [%s]\n\tconsumed by module [%s]" % (str(M), Arch, str(Module)))
> --
> 2.19.1.windows.1
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel


  reply	other threads:[~2018-11-09  3:25 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-08 10:16 [Patch] BaseTools: Optimize string concatenation BobCF
2018-11-08 15:26 ` Carsey, Jaben
2018-11-08 16:40 ` Kinney, Michael D
2018-11-09 14:16   ` Gao, Liming
2018-11-09 17:01     ` Kinney, Michael D
2018-11-08 16:52 ` Leif Lindholm
2018-11-09  3:25   ` Feng, Bob C [this message]
2018-11-09 11:48     ` Leif Lindholm
2018-11-11  0:41       ` Feng, Bob C
2018-12-05  2:51         ` Feng, Bob C
2018-12-10 10:47           ` Leif Lindholm
2018-12-10 12:09             ` Feng, Bob C
2018-12-10 12:35               ` Leif Lindholm
2018-12-11  8:48                 ` Feng, Bob C
2018-12-12 18:37                   ` Leif Lindholm
2018-12-13  1:50                     ` Gao, Liming
2018-12-13 10:08                       ` Leif Lindholm
2018-12-13 14:01                         ` Gao, Liming

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=08650203BA1BD64D8AD9B6D5D74A85D15FFFBED5@SHSMSX101.ccr.corp.intel.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox