From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by mx.groups.io with SMTP id smtpd.web12.3317.1574404765736700910 for ; Thu, 21 Nov 2019 22:39:25 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: intel.com, ip: 134.134.136.31, mailfrom: bob.c.feng@intel.com) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Nov 2019 22:39:24 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.69,228,1571727600"; d="scan'208";a="232577415" Received: from fmsmsx104.amr.corp.intel.com ([10.18.124.202]) by fmsmga004.fm.intel.com with ESMTP; 21 Nov 2019 22:39:24 -0800 Received: from fmsmsx113.amr.corp.intel.com (10.18.116.7) by fmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP Server (TLS) id 14.3.439.0; Thu, 21 Nov 2019 22:39:24 -0800 Received: from shsmsx107.ccr.corp.intel.com (10.239.4.96) by FMSMSX113.amr.corp.intel.com (10.18.116.7) with Microsoft SMTP Server (TLS) id 14.3.439.0; Thu, 21 Nov 2019 22:39:24 -0800 Received: from shsmsx104.ccr.corp.intel.com ([169.254.5.127]) by SHSMSX107.ccr.corp.intel.com ([169.254.9.63]) with mapi id 14.03.0439.000; Fri, 22 Nov 2019 14:39:22 +0800 From: "Bob Feng" To: "devel@edk2.groups.io" , "lersek@redhat.com" , "Fan, ZhijuX" CC: "Gao, Liming" , Leif Lindholm Subject: Re: [edk2-devel] [PATCH] BaseTools:fixed Build failed issue for Non-English OS Thread-Topic: [edk2-devel] [PATCH] BaseTools:fixed Build failed issue for Non-English OS Thread-Index: AdWflXp7a1q+YtFnRyKhCaD7MOH1+QAvAbYAAB+J4IA= Date: Fri, 22 Nov 2019 06:39:21 +0000 Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D16156A292@SHSMSX104.ccr.corp.intel.com> References: <95a79c71-2eb4-af97-a738-170a2668599d@redhat.com> In-Reply-To: <95a79c71-2eb4-af97-a738-170a2668599d@redhat.com> Accept-Language: zh-CN, en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] MIME-Version: 1.0 Return-Path: bob.c.feng@intel.com Content-Language: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi Laszlo, I add some comments inline. Thanks, Bob -----Original Message----- From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Lasz= lo Ersek Sent: Friday, November 22, 2019 1:53 AM To: devel@edk2.groups.io; Fan, ZhijuX Cc: Gao, Liming ; Feng, Bob C = ; Leif Lindholm Subject: Re: [edk2-devel] [PATCH] BaseTools:fixed Build failed issue for N= on-English OS (+Leif) On 11/20/19 12:27, Fan, ZhijuX wrote: > BZ:https://bugzilla.tianocore.org/show_bug.cgi?id=3D2365 > > Build failed on Non-English OS if structurePcd is used in platform dsc= =20 > file. > When the output of some functions is converted to code, Because=20 > different OS Character encoding form differently, there may be=20 > problems with some functions > > The patch is going to fixed this issue > > Cc: Liming Gao > Cc: Bob Feng > Signed-off-by: Zhiju.Fan > --- > BaseTools/Source/Python/Workspace/DscBuildData.py | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/BaseTools/Source/Python/Workspace/DscBuildData.py=20 > b/BaseTools/Source/Python/Workspace/DscBuildData.py > index 9192077f90..901d95a413 100644 > --- a/BaseTools/Source/Python/Workspace/DscBuildData.py > +++ b/BaseTools/Source/Python/Workspace/DscBuildData.py > @@ -1752,7 +1752,7 @@ class DscBuildData(PlatformBuildClassObject): > except: > EdkLogger.error('Build', COMMAND_FAILURE, 'Can not execute = command: %s' % Command) > Result =3D Process.communicate() > - return Process.returncode, Result[0].decode(), Result[1].decode= () > + return Process.returncode, Result[0].decode(errors=3D'ignore'),= = =20 > + Result[1].decode(errors=3D'ignore') > > @staticmethod > def IntToCString(Value, ValueSize): Let me quote the full method (pre-patch): @staticmethod def ExecuteCommand (Command): try: Process =3D subprocess.Popen(Command, stdout=3Dsubprocess.PIPE= , stderr=3Dsubprocess.PIPE, shell=3DTrue) except: EdkLogger.error('Build', COMMAND_FAILURE, 'Can not execute com= mand: %s' % Command) Result =3D Process.communicate() return Process.returncode, Result[0].decode(), Result[1].decode() Relevant documentation: - https://docs.python.org/3/library/subprocess.html#subprocess.Popen.commu= nicate - https://docs.python.org/3/library/stdtypes.html#bytes.decode - https://docs.python.org/3/library/codecs.html#error-handlers So we launch a subprocess, read its stdout and stderr as byte arrays, and = then decode those byte arrays (interpreted in UTF-8) to string (=3D sequenc= es of unicode code points). If the byte arrays turn out to be invalid (per UTF-8), we currently fail. With the patch, we don't fail, but return garbage (or I guess "partial dat= a" -- see the docs: "malformed data is ignored and encoding or decoding is = continued without further notice"). [Bob] That's correct. I agree. I think this patch is wrong. Here's why: - I don't see why the symptom is inherently tied to structured PCDs. [Bob] The build failure was found on a platform which uses Structure Pcds = in dsc and the build machine is a Win10 with Korean language. The present patch descripts the build failure case. - "Non-English OS" is not precise; it comes down to the encoding of the byte stream that the subprocess produces. Even if the current locale specifies English-language messages, the problem can surface if the subprocess prints *symbols*, with non-UTF-8 encoding. Conversely, if the locale specifies non-English messages, but the subprocess prints them in correct UTF-8 encoding, the problem will not arise. [Bob] I agree. "Non-English OS" is not precise. For this case, the build f= ailure is because the visual studio c compiler output message includes loca= lized string.=20 - I think the code should attempt to decode the stdout / stderr byte arrays using the LC_CTYPE locale category (not fixed UTF-8), and if *even that* fails, we should use errors=3D'replace' (or another replacement method). [Bob] I think for this case build tool does not need to handle the non-asc= ii characters so 'Ignore' will be enough. ... Finally, please compare commit 8ddec24dea74 ("BaseTools:Updata the out= put encoding of the Popen function", 2019-08-01). At this point, I'm comple= tely confused. Consider: (a) Commit 8ddec24dea74 removed the "encoding=3D'utf-8'" parameter -- which has no effect, as "encoding=3D'utf-8'" is the default. (b) It also removed "errors=3D'ignore'". In effect, that change set 'errors=3D"strict"'. (c) Okay... so the change in said commit actually means "be strict about UTF-8 decoding failures". But then the *message* on commit 8ddec24dea74, "Not all output works in utf-8, so change the encoding to the default", makes no sense at all! (d) Given the above, the present patch, in effect, *reverts* the last hunk of commit 8ddec24dea74. The "encoding=3D'utf-8'" parameter remains the default, and the patch basically says, "oh yea, you know what, I think I'd still like to ignore UTF-8 decoding errors, like I used to do before 8ddec24dea74". In short... 8ddec24dea74 was wrong, at least in part, and now we're revert= ing it (in part). The resultant code is still quite problematic in my opini= on (because we shouldn't *completely* ignore invalid UTF-8 sequences), but = even if we do what the present patch proposes, can we at least point out th= at we are *reverting* part of 8ddec24dea74? And drop the "structure PCD" an= d "non-English OS" language? [Bob] I agree to mention the present patch is to revert part of commit 8dd= ec24dea74, drop "non-English OS" language, but keep "structure PCD". Thanks Laszlo