From: "Bob Feng" <bob.c.feng@intel.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: "devel@edk2.groups.io" <devel@edk2.groups.io>,
"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [edk2-devel] [Patch 0/9] Enable multiple process AutoGen
Date: Mon, 22 Jul 2019 09:11:09 +0000 [thread overview]
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D160B454F1@SHSMSX105.ccr.corp.intel.com> (raw)
In-Reply-To: <fb27fbd7-df15-872e-e02c-2fbf8aca28a0@redhat.com>
Hi Laszlo,
Thanks for your detailed testing and comments.
I send out patch serial V2 to resolve comment 1#. V2 also fixed some issues found by others and was generated based on master latest version.
For 3#, I can reproduce the issue, I'll fix it in patch serial V3. My understanding is that when I fix 3#, 2# will be resolved accordingly. Right?
For 5#-2, When using "-n 1", Autogen should start one worker subprocess to generate autogen files that will be the same as run in serial.
I'll make the change in patches V3
For the keyboard interrupt handling mentioned in 4# and 5#-1, I'll do more testing on it. And do the fix in V3.
Before sending the patches, I tested Ctrl+C to stop the build but not the "pick up".
Thanks,
Bob
-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Sunday, July 21, 2019 8:26 PM
To: Feng, Bob C <bob.c.feng@intel.com>
Cc: devel@edk2.groups.io; Philippe Mathieu-Daudé <philmd@redhat.com>
Subject: Re: [edk2-devel] [Patch 0/9] Enable multiple process AutoGen
Hi Bob,
On 07/18/19 08:14, Bob Feng wrote:
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1875
>
> In order to improve the build performance, we implemented
> multiple-processes AutoGen. This change will reduce 20% time for
> AutoGen phase.
>
> The design document can be got from:
> https://edk2.groups.io/g/devel/files/Designs/2019/0627/Multiple-thread
> -AutoGen.pdf
>
> This patch serial pass the build of Ovmf, MinKabylake, MinPurley,
> packages under Edk2 repository and intel client and server platforms.
I've done some basic regression-testing with this set, applied on top of current master (= commit 8ff68cd5e4c9, "ShellPkg: acpiview: DBG2: Remove redundant forward declarations", 2019-07-19).
(1) The build process now seems to produce files called
GlobalVar_37d7e986-f7e9-45c2-8067-e371421a626c_AARCH64.bin
GlobalVar_37d7e986-f7e9-45c2-8067-e371421a626c_ARM.bin
GlobalVar_5a9e7754-d81b-49ea-85ad-69eaa7b1539b_IA32.bin
GlobalVar_5a9e7754-d81b-49ea-85ad-69eaa7b1539b_X64.bin
in the project root directory. (The GUIDs match the PLATFORM_GUID
values from ArmVirtQemu.dsc and OvmfPkg*.dsc.)
They seem to come from patch "BaseTools: Enable Multiple Process
AutoGen".
Can we place these files under the Build/ directory somewhere,
instead?
(2) The QEMU project now bundles edk2 (as a git submodule for the edk2
tree, and some firmware binaries built from that). The build scripts
carried by QEMU set PYTHON_COMMAND to python2, in order to work
around TianoCore BZ#1607.
For regression-testing this set, I ran
$ make -C roms efi
with QEMU's edk2 submodule advanced to commit 8ff68cd5e4c9, plus
this set applied (see above), and with QEMU itself being at
e2b47666fe15.
The build itself passes fine, so that's good.
But, some (not all) of the firmware binaries are *misbuilt*. I'll
elaborate below (independently of QEMU).
(3) Consider the following test. (Note that this is independent of the
QEMU project entirely.)
(3a) Set up a pristine build environment:
# Open a new shell, then:
$ git clean -ffdx
$ git reset --hard
$ git submodule deinit --force --all
$ git checkout master
$ git submodule update --init
$ export PYTHON_COMMAND=python2
$ source edksetup.sh
$ nice make -C "$EDK_TOOLS_PATH" \
-j $(getconf _NPROCESSORS_ONLN)
(3b) Build OvmfPkgIA32 with one set of features, and capture a
checksum of the resultant firmware binary:
$ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
-t GCC48
$ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
8d56575dd3b5c89ac6f1829ae4f41b55
(3c) *Re*build the same platform with a different set of features,
and capture a checksum again:
$ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
-t GCC48 -D SECURE_BOOT_ENABLE -D SMM_REQUIRE
$ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
ed79052e9add242174ccefc78a5bddb3
Okay, now repeat all three steps from zero, but with the present
feature patch set applied:
(3d) Repeat (3a), but rather than checking out the master branch
with "git checkout", check out the topic branch that is the
same "master" commit, *plus* the present series applied on top.
Don't forget to start a new shell first!
(3e) Repeat (3b):
$ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
-t GCC48
$ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
8d56575dd3b5c89ac6f1829ae4f41b55
Notice that the checksum is identical to that from (3b), so
all's good.
(3f) Repeat (3c):
$ nice build -a IA32 -b NOOPT -p OvmfPkg/OvmfPkgIA32.dsc \
-t GCC48 -D SECURE_BOOT_ENABLE -D SMM_REQUIRE
$ md5sum Build/OvmfIa32/NOOPT_GCC48/FV/OVMF_CODE.fd
baf08c5a9ecb0adc754f5b48410b6476
This is the problem: the checksum does not match the one from
(3c). And, the binary from (3f) is in fact mis-built, it
crashes immediately when launched on QEMU/KVM, with "emulation
failure".
Thus:
- Without the patch set, if I build a platform, then *rebuild* it
with different -D flags, then both builds produce proper outputs.
- With the patch series applied however, only the initial build
produces good results. The *rebuild*, with different -D flags, is
"poisoned" by the artifacts still laying around from the first
build.
In other words, this is a "stale cache" problem.
(4) Not a show stopper, but a divergence from previous behavior:
When one presses Ctrl-S (the "stop" character) in a terminal, the
terminal output / autoscrolling is suspended, until Ctrl-Q (the
"start" character) is pressed. This is generally used for two
things: scrolling back the terminal history manually, for reading
something that scrolled off the screen, and for briefly pausing
whatever processing the program is doing.
The idea being, if the program is blocked in a terminal write
operation, it cannot do anything else.
The present patch set changes the lastly mentioned behavior, because
the logging is now decoupled to a separate thread ("BaseTools: Add
LogAgent to support multiple process Autogen"). As a result, if the
*output* of LogAgent is blocked, that does not block the *input* of
LogAgent, and so the queue / buffer in LogAgent grows without bound,
and the build continues even though the terminal output is stopped.
This is a common initial symptom of programs with internal queues --
it's usually good to implement internal flow control, under which
any given queue has a limit, and if that limit is reached, the queue
blocks the producer threads feeding it.
Again, this is not a show stopper, just a divergence from previous
behavior which we should be aware of, and possibly document
somewhere.
Importantly, Ctrl-Z (the "suspend" character) works just fine, for
*both* purposes described above. (And then people can resume the
build with "fg", like always.)
(5) Two questions out of interest:
- Has this series been tested for restarting builds that were
interrupted with Ctrl-C? Can the new build "pick up" where the
previous build was interrupted?
- Do we intend to offer a flag for disabling this feature? I'm not
implying that the old code should be preserved as an alternative,
but perhaps some internal constants related to parallelization can
be tweaked such that the ultimate behavior becomes serial. Is the
"-n 1" option supposed to have that effect?
The second question is relevant because people might want to disable
the new feature until it stabilizes (see issue (3) for example).
Thanks,
Laszlo
next prev parent reply other threads:[~2019-07-22 9:11 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-18 6:14 [Patch 0/9] Enable multiple process AutoGen Bob Feng
2019-07-18 6:14 ` [Patch 1/9] BaseTools: Singleton the object to handle build conf file Bob Feng
2019-07-18 6:14 ` [Patch 2/9] BaseTools: Split WorkspaceAutoGen._InitWorker into multiple functions Bob Feng
2019-07-18 6:14 ` [Patch 3/9] BaseTools: Add functions to get platform scope build options Bob Feng
2019-07-18 6:14 ` [Patch 4/9] BaseTools: Decouple AutoGen Objects Bob Feng
2019-07-18 6:14 ` [Patch 5/9] BaseTools: Enable Multiple Process AutoGen Bob Feng
2019-07-18 6:14 ` [Patch 6/9] BaseTools: Add shared data for processes Bob Feng
2019-07-18 6:14 ` [Patch 7/9] BaseTools: Add LogAgent to support multiple process Autogen Bob Feng
2019-07-18 6:14 ` [Patch 8/9] BaseTools: Move BuildOption parser out of build.py Bob Feng
2019-07-18 6:14 ` [Patch 9/9] BaseTools: Add the support for python 2 Bob Feng
2019-07-21 12:26 ` [edk2-devel] [Patch 0/9] Enable multiple process AutoGen Laszlo Ersek
2019-07-21 15:55 ` Steven Shi
2019-07-22 9:11 ` Bob Feng [this message]
2019-07-22 19:50 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=08650203BA1BD64D8AD9B6D5D74A85D160B454F1@SHSMSX105.ccr.corp.intel.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox