public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Bob Feng" <bob.c.feng@intel.com>
To: "devel@edk2.groups.io" <devel@edk2.groups.io>,
	"lersek@redhat.com" <lersek@redhat.com>,
	"Fan, ZhijuX" <zhijux.fan@intel.com>
Cc: "Gao, Liming" <liming.gao@intel.com>,
	Leif Lindholm <leif.lindholm@linaro.org>
Subject: Re: [edk2-devel] [PATCH] BaseTools:fixed Build failed issue for Non-English OS
Date: Fri, 22 Nov 2019 06:39:21 +0000	[thread overview]
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D16156A292@SHSMSX104.ccr.corp.intel.com> (raw)
In-Reply-To: <95a79c71-2eb4-af97-a738-170a2668599d@redhat.com>

Hi Laszlo,

I add some comments inline.

Thanks,
Bob

-----Original Message-----
From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Laszlo Ersek
Sent: Friday, November 22, 2019 1:53 AM
To: devel@edk2.groups.io; Fan, ZhijuX <zhijux.fan@intel.com>
Cc: Gao, Liming <liming.gao@intel.com>; Feng, Bob C <bob.c.feng@intel.com>; Leif Lindholm <leif.lindholm@linaro.org>
Subject: Re: [edk2-devel] [PATCH] BaseTools:fixed Build failed issue for Non-English OS

(+Leif)

On 11/20/19 12:27, Fan, ZhijuX wrote:
> BZ:https://bugzilla.tianocore.org/show_bug.cgi?id=2365
>
> Build failed on Non-English OS if structurePcd is used in platform dsc 
> file.
> When the output of some functions is converted to code, Because 
> different OS Character encoding form differently, there may be 
> problems with some functions
>
> The patch is going to fixed this issue
>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Bob Feng <bob.c.feng@intel.com>
> Signed-off-by: Zhiju.Fan <zhijux.fan@intel.com>
> ---
>  BaseTools/Source/Python/Workspace/DscBuildData.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/BaseTools/Source/Python/Workspace/DscBuildData.py 
> b/BaseTools/Source/Python/Workspace/DscBuildData.py
> index 9192077f90..901d95a413 100644
> --- a/BaseTools/Source/Python/Workspace/DscBuildData.py
> +++ b/BaseTools/Source/Python/Workspace/DscBuildData.py
> @@ -1752,7 +1752,7 @@ class DscBuildData(PlatformBuildClassObject):
>          except:
>              EdkLogger.error('Build', COMMAND_FAILURE, 'Can not execute command: %s' % Command)
>          Result = Process.communicate()
> -        return Process.returncode, Result[0].decode(), Result[1].decode()
> +        return Process.returncode, Result[0].decode(errors='ignore'), 
> + Result[1].decode(errors='ignore')
>
>      @staticmethod
>      def IntToCString(Value, ValueSize):

Let me quote the full method (pre-patch):


    @staticmethod
    def ExecuteCommand (Command):
        try:
            Process = subprocess.Popen(Command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
        except:
            EdkLogger.error('Build', COMMAND_FAILURE, 'Can not execute command: %s' % Command)
        Result = Process.communicate()
        return Process.returncode, Result[0].decode(), Result[1].decode()

Relevant documentation:

- https://docs.python.org/3/library/subprocess.html#subprocess.Popen.communicate
- https://docs.python.org/3/library/stdtypes.html#bytes.decode
- https://docs.python.org/3/library/codecs.html#error-handlers

So we launch a subprocess, read its stdout and stderr as byte arrays, and then decode those byte arrays (interpreted in UTF-8) to string (= sequences of unicode code points).

If the byte arrays turn out to be invalid (per UTF-8), we currently fail.

With the patch, we don't fail, but return garbage (or I guess "partial data" -- see the docs: "malformed data is ignored and encoding or decoding is continued without further notice").

[Bob] That's correct. I agree.

I think this patch is wrong. Here's why:

- I don't see why the symptom is inherently tied to structured PCDs.

[Bob] The build failure was found on a platform which uses Structure Pcds in dsc and the build machine is a Win10 with Korean language.
The present patch descripts the build failure case.

- "Non-English OS" is not precise; it comes down to the encoding of the
  byte stream that the subprocess produces. Even if the current locale
  specifies English-language messages, the problem can surface if the
  subprocess prints *symbols*, with non-UTF-8 encoding. Conversely, if
  the locale specifies non-English messages, but the subprocess prints
  them in correct UTF-8 encoding, the problem will not arise.

[Bob] I agree. "Non-English OS" is not precise. For this case, the build failure is because the visual studio c compiler output message includes localized string. 

- I think the code should attempt to decode the stdout / stderr byte
  arrays using the LC_CTYPE locale category (not fixed UTF-8), and if
  *even that* fails, we should use errors='replace' (or another
  replacement method).

[Bob] I think for this case build tool does not need to handle the non-ascii characters so 'Ignore' will be enough.

... Finally, please compare commit 8ddec24dea74 ("BaseTools:Updata the output encoding of the Popen function", 2019-08-01). At this point, I'm completely confused. Consider:

(a) Commit 8ddec24dea74 removed the "encoding='utf-8'" parameter --
    which has no effect, as "encoding='utf-8'" is the default.

(b) It also removed "errors='ignore'". In effect, that change set
    'errors="strict"'.

(c) Okay... so the change in said commit actually means "be strict about
    UTF-8 decoding failures".  But then the *message* on commit
    8ddec24dea74, "Not all output works in utf-8, so change the encoding
    to the default", makes no sense at all!

(d) Given the above, the present patch, in effect, *reverts* the last
    hunk of commit 8ddec24dea74. The "encoding='utf-8'" parameter
    remains the default, and the patch basically says, "oh yea, you know
    what, I think I'd still like to ignore UTF-8 decoding errors, like I
    used to do before 8ddec24dea74".

In short... 8ddec24dea74 was wrong, at least in part, and now we're reverting it (in part). The resultant code is still quite problematic in my opinion (because we shouldn't *completely* ignore invalid UTF-8 sequences), but even if we do what the present patch proposes, can we at least point out that we are *reverting* part of 8ddec24dea74? And drop the "structure PCD" and "non-English OS" language?

[Bob] I agree to mention the present patch is to revert part of commit 8ddec24dea74, drop "non-English OS" language, but keep "structure PCD".

Thanks
Laszlo





  reply	other threads:[~2019-11-22  6:39 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-20 11:27 [PATCH] BaseTools:fixed Build failed issue for Non-English OS Fan, ZhijuX
2019-11-21  7:08 ` Bob Feng
2019-11-21 17:53 ` [edk2-devel] " Laszlo Ersek
2019-11-22  6:39   ` Bob Feng [this message]
2019-11-22 16:47     ` Laszlo Ersek
2019-11-25  1:37       ` Bob Feng
2019-11-25 12:12         ` Laszlo Ersek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=08650203BA1BD64D8AD9B6D5D74A85D16156A292@SHSMSX104.ccr.corp.intel.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox