From: "Bob Feng" <bob.c.feng@intel.com>
To: "devel@edk2.groups.io", "leif.lindholm@linaro.org", Laszlo Ersek
CC: Andrew Fish, "Kinney, Michael D", "Gao, Liming"
Subject: Re: [edk2-devel] [Patch 00/10 V8] Enable multiple process AutoGen
Date: Thu, 8 Aug 2019 15:38:29 +0000
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D160B559E9@SHSMSX105.ccr.corp.intel.com>
References: <20190807042537.11928-1-bob.c.feng@intel.com> <20190808134522.GY25813@bivouac.eciton.net>
In-Reply-To: <20190808134522.GY25813@bivouac.eciton.net>

Hi Laszlo and Leif,

Thanks for your detailed testing and comments.

I'd like to explain the failure in test (3). I can reproduce the failure
with your steps, and I found that it can also be reproduced without the
multiple process AutoGen patch set. I debugged it and found that the
failure is due to the --hash build option. I double-checked that test (3)
passes if the --hash build option is removed. Would you please verify
test (3) again without --hash?

I think we can file a new BZ for the --hash bug.

Thanks,
Bob

-----Original Message-----
From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Leif Lindholm
Sent: Thursday, August 8, 2019 9:45 PM
To: Laszlo Ersek
Cc: Feng, Bob C; devel@edk2.groups.io; Andrew Fish; Kinney, Michael D; Gao, Liming
Subject: Re: [edk2-devel] [Patch 00/10 V8] Enable multiple process AutoGen

Hi Laszlo,

Thanks for looping me in.

On Thu, Aug 08, 2019 at 03:08:22PM +0200, Laszlo Ersek wrote:
> (+ Andrew, Leif, Mike; Liming)
>
> On 08/07/19 06:25, Bob Feng wrote:
> (3) In my normal edk2 clone, I cleaned the tree, applied your patches
> (again on top of commit 96603b4f02b9), and started a build:
>
> $ . edksetup.sh
> $ nice make -C "$EDK_TOOLS_PATH" -j $(getconf _NPROCESSORS_ONLN)
> $ nice -n 19 build \
>     -a IA32 \
>     -p OvmfPkg/OvmfPkgIa32.dsc \
>     -t GCC48 \
>     -b NOOPT \
>     -n 4 \
>     -D SMM_REQUIRE \
>     -D SECURE_BOOT_ENABLE \
>     -D NETWORK_TLS_ENABLE \
>     -D NETWORK_IP6_ENABLE \
>     -D NETWORK_HTTP_BOOT_ENABLE \
>     --report-file=.../build.ovmf.32.report \
>     --log=.../build.ovmf.32.log \
>     --cmd-len=65536 \
>     --hash \
>     --genfds-multi-thread
>
> This command located Python3:
>
> > WORKSPACE = .../edk2
> > EDK_TOOLS_PATH = .../edk2/BaseTools
> > CONF_PATH = .../edk2/Conf
> > PYTHON_COMMAND = /usr/bin/python3.6
> >
> >
> > Processing meta-data .
> > Architecture(s) = IA32
> > Build target = NOOPT
> > Toolchain = GCC48
>
> The build launched fine.
>
> After 10-20 seconds into the build, I interrupted it with Ctrl-C:
>
> > build.py...
> > : error 7000: Failed to execute command
> > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/ShellPkg/Library/UefiShellDebug1CommandsLib/UefiShellDebug1CommandsLib]
> >
> >
> > build.py...
> > : error 7000: Failed to execute command
> > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/ShellPkg/Library/UefiShellDriver1CommandsLib/UefiShellDriver1CommandsLib]
> >
> >
> > build.py...
> > : error 7000: Failed to execute command
> > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/CryptoPkg/Library/OpensslLib/OpensslLib]
> >
> >
> > build.py...
> > : error 7000: Failed to execute command
> > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/MdePkg/Library/BaseLib/BaseLib]
> >
> > - Aborted -
> > Build end time: 14:05:56, Aug.08 2019
> > Build total time: 00:00:15
>
> As next step, I repeated the same "build" command as above, in order
> to continue the interrupted build. Unfortunately, this failed:
>
> > WORKSPACE = .../edk2
> > EDK_TOOLS_PATH = .../edk2/BaseTools
> > CONF_PATH = .../edk2/Conf
> > PYTHON_COMMAND = /usr/bin/python3.6
> >
> >
> > Processing meta-data
> > .Architecture(s) = IA32
> > Build target = NOOPT
> > Toolchain = GCC48
> >
> > Active Platform = .../edk2/OvmfPkg/OvmfPkgIa32.dsc
> > ..... done!
> >
> > Fd File Name:OVMF (.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/OVMF.fd)
> >
> > Generate Region at Offset 0x0
> > Region Size = 0x40000
> > Region Name = DATA
> >
> > Generate Region at Offset 0x40000
> > Region Size = 0x1000
> > Region Name = None
> >
> > Generate Region at Offset 0x41000
> > Region Size = 0x1000
> > Region Name = DATA
> >
> > Generate Region at Offset 0x42000
> > Region Size = 0x42000
> > Region Name = None
> >
> > Generate Region at Offset 0x84000
> > Region Size = 0x348000
> > Region Name = FV
> >
> > Generating FVMAIN_COMPACT FV
> >
> > Generating PEIFV FV
> > ###### ['GenFv', '-a',
> > '.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/Ffs/PEIFV.inf', '-o',
> > '.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/PEIFV.Fv', '-i',
> > '.../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/PEIFV.inf']
> > Return Value = 2
> > GenFv: ERROR 0001: Error opening file
> >
> > .../edk2/Build/OvmfIa32/NOOPT_GCC48/FV/Ffs/52C05B14-0B98-496c-BC3B-04B50211D680PeiCore/52C05B14-0B98-496c-BC3B-04B50211D680.ffs
> >
> >
> >
> >
> > build.py...
> > : error 7000: Failed to generate FV
> >
> >
> >
> > build.py...
> > : error 7000: Failed to execute command
> >
> >
> > - Failed -
> > Build end time: 14:06:25, Aug.08 2019
> > Build total time: 00:00:06
>
> To be honest, I'm not sure what to ask for, at this point.
>
> - On one hand, this is certainly not ideal. Continuing a manually
> interrupted build should preferably work -- that's a form of
> incremental build. And, it did work in my v3 testing; see bullet (5) in:
>
> http://mid.mail-archive.com/4ea3d3fa-2210-3642-2337-db525312d312@redhat.com
> https://edk2.groups.io/g/devel/message/44246
>
> (Is this perhaps a regression from the V6 update, which was related to
> incremental builds?)
>
> - On the other hand, this is not necessarily show-stopper, and I'm
> quite out of capacity for testing further versions of this full patch set.
> Perhaps you can work on this issue incrementally -- bugfixes can be
> accepted during the freeze periods.

I think there are two (independent) circumstances where I would be happy
for the support to be included even given this bug:
1) The parallel autogen is only invoked (at this point in time) when
   requested by an explicit command line parameter.
or
2) The failure is detected and its cause clearly printed for the user.

From my reading of the above, neither is true.

At which point, I think we would either make one of those true, or root
cause and fix the actual error, in order to be able to accept this into
the tree. Regardless of which side of the stable tag.

I *really* don't want for us to knowingly end up with a build system
that "sometimes breaks sporadically and you need to git clean the
repository and try again".

> I don't feel comfortable giving Tested-by or Regression-tested-by in
> this state, but I also won't block the patch set from being merged.
>
> Note that this problem appears repeatable, and it reproduces using
> Python2 as well. It should be possible for you to reproduce and to
> debug.

It being reproducible by Python 2 is actually really positive, since it
suggests Python 3 async i/o is not involved.

> (4) In this test, I repeated (3), but instead of interrupting the
> build with Ctrl-C, I introduced a syntax error to one of the C source
> files under OvmfPkg (I simply appended the constant "1" to the end of
> the file).
>
> As expected, the build failed (and correctly stopped, too):
>
> > .../edk2/OvmfPkg/VirtioNetDxe/SnpReceive.c:186:1: error: expected identifier or '(' before numeric constant
> > 1
> > ^
> > make: *** [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/OvmfPkg/VirtioNetDxe/VirtioNet/OUTPUT/SnpReceive.obj] Error 1
> >
> >
> > build.py...
> > : error 7000: Failed to execute command
> > make tbuild [.../edk2/Build/OvmfIa32/NOOPT_GCC48/IA32/OvmfPkg/VirtioNetDxe/VirtioNet]
> >
> >
> > build.py...
> > : error F002: Failed to build module
> > .../edk2/OvmfPkg/VirtioNetDxe/VirtioNet.inf [IA32, GCC48, NOOPT]
> >
> > - Failed -
> > Build end time: 14:29:18, Aug.08 2019
> > Build total time: 00:00:38
>
> I undid the syntax error, and repeated the "build" command.
>
> The build resumed fine, and produced a functional OVMF binary.

Good. Not unexpected, but good to have verified.

> (5) I also verified that changes to C files, made after the build
> completed successfully for the first time, would cause those files to
> be re-built, if the "build" command was repeated. So that's OK too.
>
> ... All in all, I think the series is mature enough to merge, in order
> to expose it to wider testing by the community, with the soft feature
> freeze just around the corner. The main functionality seems to work,
> there don't seem to be show-stoppers. IMO a BaseTools series doesn't
> have to be *perfect* -- as long as it doesn't get in the way of people
> doing their work, it should be possible to improve upon, incrementally.
> Therefore, from my side, I'm willing to give you a (somewhat reserved)
>
> Acked-by: Laszlo Ersek
>
> for the series.
>
> I suggest seeking feedback from the other stewards as well.
>
> To reiterate, the only issue I have found is that the build could not
> be resumed after I interrupted it with Ctrl-C, in section (3). If
> there is consensus to push the v8 series with that, I would suggest
> filing a TianoCore BZ about issue (3) first, and to reference the BZ
> as a "known issue" in the commit message of patch#4 or patch#5.

I will throw in a transitional
Nacked-by: Leif Lindholm
for now.

If it can happen from a Ctrl-C, it can happen from an OOM-event, a lost
network connection, and a bunch of other things. And we could live with
a corrupted state causing breakage on next build attempt - but not an
opaque breakage. At a minimum, it needs to be clear what has caused the
breakage.

Best Regards,

Leif
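
Editorial illustration of Leif's condition 2) above ("the failure is
detected and its cause clearly printed"): one way a build tool could
avoid an opaque failure is to distrust a saved hash whenever the output
files it vouches for are missing, and to say so. The Python below is a
minimal sketch of that idea only. It is not BaseTools code: the function
names, the one-hash-file-per-module layout, and the assumption that a
hash match alone is what lets a module be skipped are all hypothetical,
and the thread does not establish the actual root cause of the --hash
failure.

```python
import hashlib
import os
import tempfile


def source_digest(paths):
    """Return a hex digest over the contents of the given files (sorted by path)."""
    h = hashlib.sha256()
    for p in sorted(paths):
        with open(p, "rb") as f:
            h.update(f.read())
    return h.hexdigest()


def can_skip_module(sources, hash_file, expected_outputs):
    """Hypothetical check: return (skip, reason).

    A module is skipped only if the saved hash matches the sources AND
    every expected output exists. The reason string is meant to be
    printed, so a stale state is reported instead of failing later.
    """
    if not os.path.exists(hash_file):
        return False, "no saved hash"
    with open(hash_file) as f:
        saved = f.read().strip()
    if saved != source_digest(sources):
        return False, "sources changed"
    missing = [o for o in expected_outputs if not os.path.exists(o)]
    if missing:
        return False, "hash matches but outputs are missing: " + ", ".join(missing)
    return True, "up to date"


def save_hash_atomically(sources, hash_file):
    """Write the hash to a temporary file, then rename it into place.

    os.replace() swaps the file in a single step, so an interrupt during
    the write leaves either the old hash file or none, never a partial one.
    """
    digest = source_digest(sources)
    directory = os.path.dirname(hash_file) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(digest)
    os.replace(tmp, hash_file)
```

In this sketch, a rerun after an interrupted build would either rebuild
the module or print which output is missing, rather than surfacing later
as a file-open error in a downstream tool.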