public inbox for devel@edk2.groups.io
From: "davidr via groups.io" <davidr=ghs.com@groups.io>
To: devel@edk2.groups.io
Subject: [edk2-devel] FDF parser performance degrades rapidly on non-trivially sized inputs
Date: Wed, 27 Nov 2024 23:55:04 -0800
Message-ID: <wbQR.1732780504597534029.y2AR@groups.io>


Hi,

I was testing out dumping a raw OVMF_VARS.fd into an FDF data section and noticed that my rebuilds of OVMF with no code changes went from 15 seconds to over 1 minute. The only change was the data in the NV_VARIABLE_STORE data section in OvmfPkg/Include/Fdf/VarStore.fdf.inc, which grew the file from about 4 KiB to about 256 KiB. I was curious why my build times changed so much, so I profiled the build process with cProfile. Specifically, the line at https://github.com/tianocore/edk2/blob/master/BaseTools/Source/Python/GenFds/FdfParser.py#L279 takes the vast majority of the time.
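For anyone who wants to collect a similar report, here is a minimal sketch of profiling a single call with cProfile and sorting by internal time. The helper `profile_call` and the `busy_work` function are hypothetical names made up for this illustration; this is not how the build itself was instrumented, just one way to get output in the same format.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, report text).

    Hypothetical helper for illustration only.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Capture the report in memory instead of printing to stdout.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "tottime" sorts by time spent inside each function itself,
    # which pstats labels "internal time" in the report header.
    stats.sort_stats("tottime").print_stats(top)
    return result, out.getvalue()


def busy_work(n):
    # Stand-in workload; replace with the code you want to profile.
    return sum(i * i for i in range(n))


result, report = profile_call(busy_work, 1000)
print(report)
```

Sorting by internal time rather than cumulative time makes a single hot function stand out even when it is called from many places.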

You can reproduce this by adding 256 KiB of "#" characters to the end of OvmfPkg/Include/Fdf/VarStore.fdf.inc and building OVMF, which produced this result for me:
Build total time: 00:01:31
34883442 function calls (34368868 primitive calls) in 91.191 seconds
Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  18769   73.501    0.004   75.942    0.004 FdfParser.py:276(_SkipWhiteSpace)
    728    5.738    0.008    5.738    0.008 {method 'acquire' of '_thread.lock' objects}
     12    1.569    0.131    1.569    0.131 {method 'poll' of 'select.poll' objects}
2849255    1.248    0.000    1.439    0.000 FdfParser.py:354(_GetOneChar)
2305959    0.873    0.000    1.131    0.000 FdfParser.py:293(_EndOfFile)
5374641    0.831    0.000    0.831    0.000 FdfParser.py:368(_CurrentChar)
      2    0.526    0.263    1.278    0.639 FdfParser.py:497(PreprocessFile)
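The reproduction step above can be scripted. The following is a rough sketch, assuming the padding is appended as a single line of "#" characters followed by a newline (the layout of the padding is my assumption). `append_padding` is a hypothetical helper, and the demo runs against a scratch file rather than the real OvmfPkg/Include/Fdf/VarStore.fdf.inc.

```python
import tempfile
from pathlib import Path


def append_padding(path, size=256 * 1024, char="#"):
    """Append `size` copies of `char` plus a newline to the file at `path`.

    Hypothetical helper; the single-line layout is an assumption.
    """
    # newline="" disables newline translation so the byte count is exact.
    with open(path, "a", encoding="utf-8", newline="") as f:
        f.write(char * size + "\n")


# Demonstrate on a scratch copy instead of touching a real source tree.
with tempfile.TemporaryDirectory() as tmp:
    scratch = Path(tmp) / "VarStore.fdf.inc"
    scratch.write_text("# stand-in contents\n", encoding="utf-8")
    before = scratch.stat().st_size
    append_padding(scratch)
    grown_by = scratch.stat().st_size - before
```

To try it against a real checkout, point `append_padding` at the .fdf.inc file and revert the change afterwards.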

Changing _SkippedChars from a string to StringIO reduced my build time to 19 seconds:
Build total time: 00:00:19
36552029 function calls (36037563 primitive calls) in 18.618 seconds
Ordered by: internal time

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    728    5.391    0.007    5.391    0.007 {method 'acquire' of '_thread.lock' objects}
  18769    1.593    0.000    3.452    0.000 FdfParser.py:277(_SkipWhiteSpace)
     12    1.551    0.129    1.551    0.129 {method 'poll' of 'select.poll' objects}
2849255    0.850    0.000    0.979    0.000 FdfParser.py:355(_GetOneChar)
5374641    0.742    0.000    0.742    0.000 FdfParser.py:369(_CurrentChar)
2305959    0.741    0.000    0.951    0.000 FdfParser.py:294(_EndOfFile)
      2    0.511    0.256    1.237    0.618 FdfParser.py:498(PreprocessFile)
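To illustrate the kind of change described above, here is a simplified sketch and not the actual FdfParser.py code. The two classes are hypothetical stand-ins: one accumulates skipped characters by appending to a str attribute, the other writes them to an io.StringIO. As far as I know, repeatedly appending to a string held in an instance attribute can copy the whole string on each append, so the total cost grows roughly quadratically with the number of skipped characters, while StringIO appends in amortized constant time.

```python
import io


class StrAccumulator:
    """Hypothetical stand-in: collects skipped characters in a str."""

    def __init__(self):
        self._SkippedChars = ""

    def skip(self, text):
        for ch in text:
            # Each += may build a new string containing everything so far.
            self._SkippedChars += ch

    def value(self):
        return self._SkippedChars


class StringIOAccumulator:
    """Hypothetical stand-in: collects skipped characters in a StringIO."""

    def __init__(self):
        self._SkippedChars = io.StringIO()

    def skip(self, text):
        for ch in text:
            # write() appends to an internal buffer without copying
            # the previously written contents.
            self._SkippedChars.write(ch)

    def value(self):
        # Callers that need the text read it out with getvalue().
        return self._SkippedChars.getvalue()


whitespace = " \t\n" * 100
slow = StrAccumulator()
fast = StringIOAccumulator()
slow.skip(whitespace)
fast.skip(whitespace)
```

Both accumulators end up holding the same text; the difference is only in how the cost scales. Any code that reads the accumulated value has to switch to getvalue() when the attribute becomes a StringIO.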

All of these tests were run using the python3 binary provided in the docker container created by https://github.com/tianocore/containers/blob/main/Ubuntu-22/Dockerfile.

This seems like an easy change to make builds a tiny bit faster.

Thanks,
David


View/Reply Online (#120865): https://edk2.groups.io/g/devel/message/120865
Thread overview: 6+ messages
2024-11-28  7:55 davidr via groups.io [this message]
2024-12-04 10:22 ` [edk2-devel] FDF parser performance degrades rapidly on non-trivially sized inputs Ard Biesheuvel via groups.io
2024-12-04 18:00   ` davidr via groups.io
2024-12-05  0:50     ` Reply: " gaoliming via groups.io
2024-12-06  0:54       ` [edk2-devel] " davidr via groups.io
2024-12-06  1:41         ` Reply: " gaoliming via groups.io