public inbox for devel@edk2.groups.io
From: "Feng, Bob C" <bob.c.feng@intel.com>
To: Felix Polyudov <Felixp@ami.com>, "Ni, Ruiyu" <ruiyu.ni@intel.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: "Carsey, Jaben" <jaben.carsey@intel.com>,
	"Gao, Liming" <liming.gao@intel.com>
Subject: Re: [Patch] BaseTools: Replace the sqlite database with list
Date: Fri, 9 Nov 2018 06:29:29 +0000
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D15FFFC060@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <9333E191E0D52B4999CE63A99BA663A00302C3534D@atlms1.us.megatrends.com>

Hi Felix,

Yes. There will be no build.db under the Conf/.cache folder after this patch.

First, I'll use the parsing of a dsc file as an example to explain how build.db works in the current code.
For the first build, getting from a .dsc file to the dscbuild data takes several steps:
1. Read the .dsc file and create a table in build.db.
2. Parse each line of the dsc file and translate the line string into a data item in the dsc table.
     The data has the structure:
     (ID, Model, Value1, Value2, Value3, Scope1, Scope2, Scope3, BelongsToItem, FromItem, StartLine, StartColumn, EndLine, EndColumn, Enabled)
     At this point, macros and the conditions in !IF are not evaluated, and !INCLUDE is not expanded.
3. Do the post_process().
     Read all the content from that table and process macros, conditions, !INCLUDE, etc., then save the processed data into a temporary table.
4. Generate the build data from the temporary table. The temporary table is in memory; it is not saved in build.db.
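
The two passes described in the steps above can be sketched roughly as follows. This is only an illustration, not the BaseTools code: the field names come from the tuple layout quoted in step 2, while first_pass(), the model constants, the simplified "NAME = VALUE" line syntax and this stand-alone post_process() are hypothetical and handle only $(MACRO) expansion (no !IF, no !INCLUDE).

```python
from collections import namedtuple
import re

# Field names follow the tuple layout quoted in step 2.
Record = namedtuple("Record", [
    "ID", "Model", "Value1", "Value2", "Value3",
    "Scope1", "Scope2", "Scope3", "BelongsToItem", "FromItem",
    "StartLine", "StartColumn", "EndLine", "EndColumn", "Enabled",
])

MODEL_DEFINE = 1   # hypothetical model codes, not the BaseTools constants
MODEL_SETTING = 2


def first_pass(lines):
    """Steps 1-2: store each 'NAME = VALUE' line as a raw, unevaluated record."""
    table = []
    for line_no, text in enumerate(lines, 1):
        text = text.strip()
        if not text or text.startswith("#"):
            continue
        is_define = text.startswith("DEFINE ")
        if is_define:
            text = text[len("DEFINE "):]
        name, _, value = [part.strip() for part in text.partition("=")]
        model = MODEL_DEFINE if is_define else MODEL_SETTING
        table.append(Record(len(table) + 1, model, name, value, "",
                            "COMMON", "COMMON", "COMMON", -1, -1,
                            line_no, 0, line_no, len(text), 1))
    return table


def post_process(table):
    """Step 3: expand $(MACRO) references into a temporary in-memory table."""
    macros = {}
    processed = []
    for rec in table:
        # Unknown macros are left untouched.
        value = re.sub(r"\$\((\w+)\)",
                       lambda m: macros.get(m.group(1), m.group(0)),
                       rec.Value2)
        if rec.Model == MODEL_DEFINE:
            macros[rec.Value1] = value
        processed.append(rec._replace(Value2=value))
    return processed
```

For example, post_process(first_pass(["DEFINE ARCH_DIR = X64", "OUTPUT_DIRECTORY = Build/$(ARCH_DIR)"])) leaves the raw table unchanged and returns a second table in which the macro has been expanded.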

For the second build, the tool checks each file's timestamp and compares it with the timestamp saved in build.db. If a metadata file has not changed, steps 1 and 2 are skipped.
Steps 3 and 4 always execute, so the benefit of build.db for build performance is limited.
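
The timestamp check can be pictured with a small sketch. It is an assumption-laden illustration, not the actual tool logic: needs_reparse() and the saved_timestamps dictionary are hypothetical stand-ins for whatever build.db stores.

```python
import os


def needs_reparse(path, saved_timestamps):
    """Return True when `path` changed since its timestamp was last recorded.

    `saved_timestamps` is a plain dict standing in for the timestamps
    kept in build.db (hypothetical; the real storage differs).
    """
    current = os.stat(path).st_mtime
    if saved_timestamps.get(path) == current:
        return False          # unchanged: steps 1 and 2 can be skipped
    saved_timestamps[path] = current
    return True               # new or modified: parse the file again
```

Note that every such check costs one stat call per metadata file, which is consistent with the nt.stat entries in the profiles below.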

Here is the performance data. I used cProfile to profile each function's time for OvmfX64. You can see that after applying the patch, the time cost of the method 'execute' of 'sqlite3.Cursor' objects is gone, and the nt.stat calls are also reduced. This patch improves performance for both clean and incremental builds.

Note: Since the cProfile tool adds some overhead, the times here are larger than in the real case.
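
For anyone who wants to collect a similar report, here is a minimal sketch using the standard cProfile and pstats modules. It is written for Python 3 (the numbers in this mail were taken with Python 2.7), and profile_top() is a hypothetical helper, not part of BaseTools. Sorting by "time" gives the "Ordered by: internal time" listing, and the limit corresponds to the "restriction <50>" shown in the reports.

```python
import cProfile
import io
import pstats


def profile_top(func, limit=50):
    """Run func() under cProfile and return a report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "time" sorts by tottime, i.e. time spent inside each function itself.
    stats.sort_stats("time").print_stats(limit)
    return out.getvalue()
```

Usage would be along the lines of print(profile_top(run_build)), where run_build is whatever callable drives the build in your environment.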

Current code:
Clean build:
22972010 function calls (22055610 primitive calls) in 16.308 seconds

   Ordered by: internal time
   List reduced from 766 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   784804    5.003    0.000    5.003    0.000 {nt.stat}
    49583    1.361    0.000    1.361    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
    46057    1.241    0.000    7.644    0.000 C:\Python27\lib\traceback.py:281(extract_stack)
   772326    0.650    0.000    5.536    0.000 C:\Python27\lib\linecache.py:47(checkcache)
675949/1441    0.532    0.000    1.056    0.001 C:\Python27\lib\copy.py:145(deepcopy)
   772326    0.335    0.000    0.610    0.000 C:\Python27\lib\linecache.py:13(getline)
   689358    0.276    0.000    0.471    0.000 D:\edk2\BaseTools\Source\Python\Common\Misc.py:1744(__eq__)
    59520    0.238    0.000    0.279    0.000 D:\edk2\BaseTools\Source\Python\Common\StringUtils.py:402(CleanString2)

Incremental build:

15253810 function calls (14517810 primitive calls) in 8.392 seconds

   Ordered by: internal time
   List reduced from 732 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   202227    1.341    0.000    1.341    0.000 {nt.stat}
    16595    0.972    0.000    0.972    0.000 {method 'execute' of 'sqlite3.Cursor' objects}
499562/974    0.388    0.000    0.774    0.001 C:\Python27\lib\copy.py:145(deepcopy)
    13069    0.321    0.000    1.942    0.000 C:\Python27\lib\traceback.py:281(extract_stack)
   689358    0.278    0.000    0.473    0.000 D:\edk2\BaseTools\Source\Python\Common\Misc.py:1744(__eq__)
     2136    0.180    0.000    0.486    0.000 D:\edk2\BaseTools\Source\Python\Workspace\MetaFileTable.py:231(GetValidExpression)
    51545    0.178    0.000    0.336    0.000 C:\Python27\lib\ntpath.py:415(normpath)
        1    0.175    0.175    0.255    0.255 D:\edk2\BaseTools\Source\Python\Common\ToolDefClassObject.py:69(LoadToolDefFile)
   189857    0.166    0.000    1.393    0.000 C:\Python27\lib\linecache.py:47(checkcache)
    10950    0.164    0.000    0.245    0.000 D:\edk2\BaseTools\Source\Python\GenFds\FdfParser.py:285(_SkipWhiteSpace)

After patch:
Clean build is the same as Incremental build

16557449 function calls (15641049 primitive calls) in 6.662 seconds

   Ordered by: internal time
   List reduced from 746 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
675949/1441    0.538    0.000    1.064    0.001 C:\Python27\lib\copy.py:145(deepcopy)
   689358    0.275    0.000    0.468    0.000 D:\edk2\BaseTools\Source\Python\Common\Misc.py:1744(__eq__)
     2136    0.236    0.000    0.251    0.000 D:\edk2\BaseTools\Source\Python\ParserDb\MetaFileTable.py:269(GetValidExpression)
    59520    0.232    0.000    0.268    0.000 D:\edk2\BaseTools\Source\Python\Common\StringUtils.py:402(CleanString2)
    27612    0.213    0.000    0.213    0.000 {nt.stat}
   666283    0.196    0.000    0.257    0.000 C:\Python27\lib\copy.py:267(_keep_alive)
    51667    0.178    0.000    0.315    0.000 C:\Python27\lib\ntpath.py:415(normpath)

Thanks,
Bob

-----Original Message-----
From: Felix Polyudov [mailto:Felixp@ami.com] 
Sent: Friday, November 9, 2018 12:48 AM
To: Feng, Bob C <bob.c.feng@intel.com>; Ni, Ruiyu <ruiyu.ni@intel.com>; edk2-devel@lists.01.org
Cc: Carsey, Jaben <jaben.carsey@intel.com>; Gao, Liming <liming.gao@intel.com>
Subject: RE: [edk2] [Patch] BaseTools: Replace the sqlite database with list

Bob,

Does it mean that after this patch the build data is no longer saved to a file and is recreated on every build?
Do you have any data regarding build process performance improvements after applying the patch?
Does this patch improve full build time and incremental build time?

Thanks
Felix

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Feng, Bob C
Sent: Thursday, November 08, 2018 12:39 AM
To: Ni, Ruiyu; edk2-devel@lists.01.org
Cc: Carsey, Jaben; Gao, Liming
Subject: Re: [edk2] [Patch] BaseTools: Replace the sqlite database with list

Hi Ray,

Right. No SQL dependency any more after this patch.

Thanks,
Bob

-----Original Message-----
From: Ni, Ruiyu
Sent: Thursday, November 8, 2018 1:37 PM
To: Feng, Bob C <bob.c.feng@intel.com>; edk2-devel@lists.01.org
Cc: Carsey, Jaben <jaben.carsey@intel.com>; Gao, Liming <liming.gao@intel.com>
Subject: Re: [edk2] [Patch] BaseTools: Replace the sqlite database with list

On 11/8/2018 11:15 AM, BobCF wrote:
> https://bugzilla.tianocore.org/show_bug.cgi?id=1288
> 
> This patch is one of build tool performance improvement series 
> patches.
> 
> This patch is going to use python list to store the parser data 
> instead of using sqlite database.
> 
> The replacement solution is as below:
> 
> SQL insert: list.append()
> SQL select: list comprehension. for example:
> Select * from table where field = "something"
> ->
> [ item for item in table if item[3] == "something"]
> 
> SQL update: python map function. for example:
> Update table set field1=newvalue where field2 = "something".
> -> map(lambda x: x.__setitem__(1, newvalue),
>     [item for item in table if item[2] == "something"])
> 
> SQL delete: list comprehension.
> 
> With this change, we can save the time of interpreting SQL statements
> and the time of writing the database to the file system.
> 
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: BobCF <bob.c.feng@intel.com>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Jaben Carsey <jaben.carsey@intel.com>
> ---
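
The replacement scheme quoted above can be tried out with a small self-contained sketch. The table contents and column layout are made up for illustration and are not the parser's real records. A plain loop is used for the update because assignment is a statement in Python and cannot appear inside a lambda.

```python
# Rows are lists (not tuples) so an "update" can modify them in place.
# Column layout here is illustrative: [ID, Name, Value, Arch].
table = []

# SQL insert -> list.append()
table.append([1, "PcdDebug", "TRUE", "X64"])
table.append([2, "PcdDebug", "TRUE", "IA32"])
table.append([3, "PcdTrace", "FALSE", "X64"])

# SQL select: SELECT * FROM table WHERE Arch = 'X64'
selected = [row for row in table if row[3] == "X64"]

# SQL update: UPDATE table SET Value = 'FALSE' WHERE Arch = 'X64'
for row in table:
    if row[3] == "X64":
        row[2] = "FALSE"

# SQL delete: DELETE FROM table WHERE Name = 'PcdTrace'
# Slice assignment keeps the same list object for other references.
table[:] = [row for row in table if row[1] != "PcdTrace"]
```

Each of these operations is a linear scan over the list, so the saving comes from avoiding SQL interpretation and file writes rather than from faster lookups.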

Bob,
I am curious. After this patch, there is no SQL dependency in build tool?

--
Thanks,
Ray
