From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: lersek@redhat.com) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by groups.io with SMTP; Thu, 08 Aug 2019 06:08:25 -0700 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E641269097; Thu, 8 Aug 2019 13:08:24 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-117-122.ams2.redhat.com [10.36.117.122]) by smtp.corp.redhat.com (Postfix) with ESMTP id 536FD5C226; Thu, 8 Aug 2019 13:08:23 +0000 (UTC) Subject: Re: [edk2-devel] [Patch 00/10 V8] Enable multiple process AutoGen To: bob.c.feng@intel.com References: <20190807042537.11928-1-bob.c.feng@intel.com> From: "Laszlo Ersek" Cc: devel@edk2.groups.io, Andrew Fish , Leif Lindholm , Michael Kinney , Liming Gao Message-ID: Date: Thu, 8 Aug 2019 15:08:22 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <20190807042537.11928-1-bob.c.feng@intel.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Thu, 08 Aug 2019 13:08:25 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit (+ Andrew, Leif, Mike; Liming) On 08/07/19 06:25, Bob Feng wrote: > BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1875 > > In order to improve the build performance, we implemented > multiple-processes AutoGen. This change will reduce 20% time > for AutoGen phase. > > The design document can be got from: > https://edk2.groups.io/g/devel/files/Designs/2019/0627/Multiple-thread-AutoGen.pdf > > This patch serial pass the build of Ovmf, MinKabylake, MinPurley, > packages under Edk2 repository and intel client and server platforms. > > V8: > 1. Fixed a regression issue that AsbuildInf files are not generated. > That was introduced by V7. > > V7: > 1. Fixed regression issue for build -m, build modules, build > libraries. implement this fix in patch 05/10. > > 2. Fixed the regression issue for duplicat items in PcdTokenNumber > file. implement this fix in patch 02/10. > > V6: > > This version fixed a regression issue on incremental build. > The change is in patch 04/10 > > In the autogen sub-process, we need to use > PlatformInfo.Pcds rather than PlatformInfo.Platform.Pcds > because we don't want to re-evaluate the Pcds in > the autogen sub-process. > > V5: > Restructure the patches of V4. Squash 11/11 patch of V4 into V5 4/10, > 5/10 and 10/10. > > The details of the changes are as following: > > 1. Update Log queue maxsize as the value of thread number * 10. > This change is implemented in patch 10/10 > > 2. Fixed the regression issue that the PCDs from build command line > are not passed into autogen sub-process correctly. > > 3. Fixed the regression issue that the exception raised inside > autogen sub-process is not well handled, including code exception > and Ctrl+C keyboard interrupt. > > The above 2 changes are implemented in patch 5/10 > > 4. Fixed the regression issue that multiple builds of the same > driver/application with FILE_GUID overrides is not handle correctly. > > 5. Fixed the regression issue that shared fixed PCD between lib > and module is not handle correctly, which cause the autogen.c and > autogen.h are different from orignal. > > The above 2 changes are implemented in patch 4/10 > Thank you for the above explanation! (1) In my clone of the QEMU repository (master @ v4.1.0-rc4), I advanced the roms/edk2 submodule to commit 96603b4f02b9 ("BaseTools/PatchCheck: Disable text conversion in 'git show'", 2019-08-07), and applied your patches on top. Then I ran $ nice make -j4 -C roms efi in the QEMU project root. This build forces the usage of Python2, due to . The build worked fine, Ctrl-S / Ctrl-Q worked fine, and the resultant firmware binaries all worked fine. All good. (2) QEMU carries a simpe UEFI application that exposes SMBIOS anchor addresses, and ACPI RSD PTR addresses, from the UEFI guest to some QEMU unit tests that run on the host. In the same environment as (1), I rebuilt this application: $ nice make -j4 -C tests/uefi-test-tools This command uses the "-m" switch of "build", internally. It also forces Python2, due to . The build completed fine, and I tested the AARCH64 and X64 binaries (bootable ISO images). They worked fine. Good. (3) In my normal edk2 clone, I cleaned the tree, applied your patches (again on top of commit 96603b4f02b9), and started a build: $ . edksetup.sh $ nice make -C "$EDK_TOOLS_PATH" -j $(getconf _NPROCESSORS_ONLN) $ nice -n 19 build \ -a IA32 \ -p OvmfPkg/OvmfPkgIa32.dsc \ -t GCC48 \ -b NOOPT \ -n 4 \ -D SMM_REQUIRE \ -D SECURE_BOOT_ENABLE \ -D NETWORK_TLS_ENABLE \ -D NETWORK_IP6_ENABLE \ -D NETWORK_HTTP_BOOT_ENABLE \ --report-file=.../build.ovmf.32.report \ --log=.../build.ovmf.32.log \ --cmd-len=65536 \ --hash \ --genfds-multi-thread This command located Python3: > WORKSPACE = .../edk2 > EDK_TOOLS_PATH = .../edk2/BaseTools > CONF_PATH = .../edk2/Conf > PYTHON_COMMAND = /usr/bin/python3.6 > > > Processing meta-data . > Architecture(s) = IA32 > Build target = NOOPT > Toolchain = GCC48 The build launched fine. After 10-20 seconds into the build, I interrupted it with Ctrl-C: > build.py... > : error 7000: Failed to execute command > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/ShellPkg/Library/UefiShellDebug1CommandsLib/UefiShellDebug1CommandsLib] > > > build.py... > : error 7000: Failed to execute command > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/ShellPkg/Library/UefiShellDriver1CommandsLib/UefiShellDriver1CommandsLib] > > > build.py... > : error 7000: Failed to execute command > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/CryptoPkg/Library/OpensslLib/OpensslLib] > > > build.py... > : error 7000: Failed to execute command > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/MdePkg/Library/BaseLib/BaseLib] > > - Aborted - > Build end time: 14:05:56, Aug.08 2019 > Build total time: 00:00:15 As next step, I repeated the same "build" command as above, in order to continue the interrupted build. Unfortunately, this failed: > WORKSPACE = .../edk2 > EDK_TOOLS_PATH = .../edk2/BaseTools > CONF_PATH = .../edk2/Conf > PYTHON_COMMAND = /usr/bin/python3.6 > > > Processing meta-data > .Architecture(s) = IA32 > Build target = NOOPT > Toolchain = GCC48 > > Active Platform = .../edk2/OvmfPkg/OvmfPkgIa32.dsc > ..... done! > > Fd File Name:OVMF (.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/OVMF.fd) > > Generate Region at Offset 0x0 > Region Size = 0x40000 > Region Name = DATA > > Generate Region at Offset 0x40000 > Region Size = 0x1000 > Region Name = None > > Generate Region at Offset 0x41000 > Region Size = 0x1000 > Region Name = DATA > > Generate Region at Offset 0x42000 > Region Size = 0x42000 > Region Name = None > > Generate Region at Offset 0x84000 > Region Size = 0x348000 > Region Name = FV > > Generating FVMAIN_COMPACT FV > > Generating PEIFV FV > ###### ['GenFv', '-a', '.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/Ffs/PEIFV.inf', '-o', '.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/PEIFV.Fv', '-i', '.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/PEIFV.inf'] > Return Value = 2 > GenFv: ERROR 0001: Error opening file > .../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/Ffs/52C05B14-0B98-496c-BC3B-04B50211D680PeiCore/52C05B14-0B98-496c-BC3B-04B50211D680.ffs > > > > > build.py... > : error 7000: Failed to generate FV > > > > build.py... > : error 7000: Failed to execute command > > > - Failed - > Build end time: 14:06:25, Aug.08 2019 > Build total time: 00:00:06 To be honest, I'm not sure what to ask for, at this point. - On one hand, this is certainly not ideal. Continuing a manually interrupted build should preferably work -- that's a form of incremental build. And, it did work in my v3 testing; see bullet (5) in: http://mid.mail-archive.com/4ea3d3fa-2210-3642-2337-db525312d312@redhat.com https://edk2.groups.io/g/devel/message/44246 (Is this perhaps a regression from the V6 update, which was related to incremental builds?) - On the other hand, this is not necessarily show-stopper, and I'm quite out of capacity for testing further versions of this full patch set. Perhaps you can work on this issue incrementally -- bugfixes can be accepted during the freeze periods. I don't feel comfortable giving Tested-by or Regression-tested-by in this state, but I also won't block the patch set from being merged. Note that this problem appears repeatable, and it reproduces using Python2 as well. It should be possible for you to reproduce and to debug. (4) In this test, I repeated (3), but instead of interrupting the build with Ctrl-C, I introduced a syntax error to one of the C source files under OvmfPkg (I simply appended the constant "1" to the end of the file). As expected, the build failed (and correctly stopped, too): > .../edk2/OvmfPkg/VirtioNetDxe/SnpReceive.c:186:1: error: expected identifier or '(' before numeric constant > 1 > ^ > make: *** [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/OvmfPkg/VirtioNetDxe/VirtioNet/OUTPUT/SnpReceive.obj] Error 1 > > > build.py... > : error 7000: Failed to execute command > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/OvmfPkg/VirtioNetDxe/VirtioNet] > > > build.py... > : error F002: Failed to build module > .../edk2/OvmfPkg/VirtioNetDxe/VirtioNet.inf [IA32, GCC48, NOOPT] > > - Failed - > Build end time: 14:29:18, Aug.08 2019 > Build total time: 00:00:38 I undid the syntax error, and repeated the "build" command. The build resumed fine, and produced a functional OVMF binary. Good. (5) I also verified that changes to C files, made after the build completed successfully for the first time, would cause those files to be re-built, if the "build" command was repeated. So that's OK too. ... All in all, I think the series is mature enough to merge, in order to expose it to wider testing by the community, with the soft feature freeze just around the corner. The main functionality seems to work, there don't seem to be show-stoppers. IMO a BaseTools series doesn't have to be *perfect* -- as long as it doesn't get in the way of people doing their work, it should be possible to improve upon, incrementally. Therefore, from my side, I'm willing to give you a (somewhat reserved) Acked-by: Laszlo Ersek for the series. I suggest seeking feedback from the other stewards as well. To reiterate, the only issue I have found is that the build could not be resumed after I interrupted it with Ctrl-C, in section (3). If there is consensus to push the v8 series with that, I would suggest filing a TianoCore BZ about issue (3) first, and to reference the BZ as a "known issue" in the commit message of patch#4 or patch#5. Thanks Laszlo