Subject: Re: [edk2-devel] [edk] [PATCH] ShellPkg/edit: allow non-ASCII characters in edit
To: devel@edk2.groups.io, xypron.glpk@gmx.de, Liming Gao
Cc: michael.d.kinney@intel.com
From: "Laszlo Ersek"
Date: Mon, 2 Nov 2020 22:10:56 +0100
In-Reply-To: <20191124103749.21576-1-xypron.glpk@gmx.de>
References: <20191124103749.21576-1-xypron.glpk@gmx.de>

On 11/24/19 11:37, Heinrich Schuchardt via Groups.Io wrote:
> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2339
>
> Currently it is not possible to add letters outside the range
> U+0020 - U+007F to a file using the edit command.
>
> In Unicode the following are control characters:
>
> * U+0000 - U+001F (C0 controls)
> * U+007F (DEL)
> * U+0080 - U+009F (C1 controls).
>
> For reference see:
>
> * https://unicode.org/charts/PDF/U0000.pdf
> * https://unicode.org/charts/PDF/U0080.pdf
>
> So the characters we should exclude from the file buffer are:
> U+0000 - U+001F, U+007F - U+009F
>
> Allow all other characters as input to the file buffer in Unicode mode.
> Allow only ASCII characters as input in ASCII mode.
>
> When saving a file in ASCII mode replace non-ASCII characters by a question
> mark ('?').
>
> Signed-off-by: Heinrich Schuchardt
> ---
> Resent due to a typo in Liming's email address.
> ---

(1) Please run "BaseTools/Scripts/SetupGit.py" in your edk2 clone.

(2) This patch was lost because the ShellPkg maintainers were not CC'd.

I suggest resending the patch, with the Content-Transfer-Encoding fixed
(8bit or base64, per (1)), and with the following folks CC'd (per (2)):

  Ray Ni
  Zhichao Gao

Thanks
Laszlo

>  .../Edit/FileBuffer.c                         | 21 +++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c b/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
> index fd324cc4a8..12235e4e4b 100644
> --- a/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
> +++ b/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
> @@ -1360,6 +1360,8 @@ GetNewLine (
>  /**
>    Change a Unicode string to an ASCII string.
>
> +  Non-ASCII characters are replaced by '?'.
> +
>    @param[in]  UStr     The Unicode string.
>    @param[in]  Length   The maximum size of AStr.
>    @param[out] AStr     ASCII string to pass out.
> @@ -1378,8 +1380,12 @@ UnicodeToAscii (
>    //
>    // just buffer copy, not character copy
>    //
> -  for (Index = 0; Index < Length; Index++) {
> -    *AStr++ = (CHAR8) *UStr++;
> +  for (Index = 0; Index < Length; Index++, UStr++) {
> +    if (*UStr < 0x80) {
> +      *AStr++ = (CHAR8) *UStr;
> +    } else {
> +      *AStr++ = '?';
> +    }
>    }
>
>    return Index;
> @@ -2154,9 +2160,16 @@ FileBufferDoCharInput (
>
>    default:
>      //
> -    // DEAL WITH ASCII CHAR, filter out thing like ctrl+f
> +    // Do not add Unicode control characters to the file buffer:
> +    //
> +    // * U+0000-U+001f (C0 controls)
> +    // * U+007f (DEL)
> +    // * U+0080-U+009f (C1 controls)
> +    //
> +    // Do not add non-ASCII characters in ASCII mode.
>      //
> -    if (Char > 127 || Char < 32) {
> +    if (Char < 0x20 || (Char >= 0x7f &&
> +        (Char <= 0x9f || FileBuffer.FileType == FileTypeAscii))) {
>        Status = StatusBarSetStatusString (L"Unknown Command");
>      } else {
>        Status = FileBufferAddChar (Char);
>
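
For anyone who wants to check the two changed conditions outside of a
firmware build, here is a minimal standalone sketch of the same logic in
plain C. It is not the edk2 code: the typedefs stand in for the edk2 base
types, and the names IsAcceptedChar, UnicodeToAsciiSketch, FILE_TYPE and
FileTypeUnicode are hypothetical (only FileTypeAscii appears in the patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for the edk2 base types used by the patch. */
typedef uint16_t CHAR16;
typedef char     CHAR8;

/* Hypothetical enum; only FileTypeAscii is named in the patch. */
typedef enum { FileTypeAscii, FileTypeUnicode } FILE_TYPE;

/*
 * Mirrors the new condition in FileBufferDoCharInput(): returns true when
 * the character may be added to the file buffer.
 */
static bool IsAcceptedChar(CHAR16 Char, FILE_TYPE FileType)
{
  if (Char < 0x20) {
    return false;                 /* U+0000-U+001F: C0 controls */
  }
  if (Char >= 0x7f && (Char <= 0x9f || FileType == FileTypeAscii)) {
    return false;                 /* DEL, C1 controls, or non-ASCII in ASCII mode */
  }
  return true;
}

/*
 * Mirrors the new loop in UnicodeToAscii(): copies Length code units and
 * replaces every non-ASCII one with '?'. Returns the number of units written.
 */
static size_t UnicodeToAsciiSketch(const CHAR16 *UStr, size_t Length, CHAR8 *AStr)
{
  size_t Index;

  assert(UStr != NULL && AStr != NULL);
  for (Index = 0; Index < Length; Index++, UStr++) {
    *AStr++ = (*UStr < 0x80) ? (CHAR8) *UStr : '?';
  }
  return Index;
}
```

With this sketch, U+00E9 is accepted in Unicode mode and rejected in ASCII
mode, while U+007F and U+009F are rejected in both modes. Note that, like
the patch, the conversion passes control characters below 0x80 through
unchanged; only the input path filters them.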