From: "Heinrich Schuchardt"
To: Ray Ni, Zhichao Gao, Leif Lindholm, Ard Biesheuvel
Cc: devel@edk2.groups.io, Laszlo Ersek, Liming Gao, Heinrich Schuchardt
Subject: [edk] [PATCH] ShellPkg/edit: allow non-ASCII characters in edit
Date: Tue, 3 Nov 2020 00:11:14 +0100
Message-Id: <20201102231114.31099-1-xypron.glpk@gmx.de>

REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2339

Currently it is not possible to add characters outside the range
U+0020 - U+007F to a file using the edit command.

In Unicode the following are control characters:

* U+0000 - U+001F (C0 controls)
* U+007F (DEL)
* U+0080 - U+009F (C1 controls)

For reference see:

* https://unicode.org/charts/PDF/U0000.pdf
* https://unicode.org/charts/PDF/U0080.pdf

So the characters we should exclude from the file buffer are:

U+0000 - U+001F, U+007F - U+009F

Allow all other characters as input to the file buffer in Unicode mode.
Allow only ASCII characters as input in ASCII mode.

When saving a file in ASCII mode, replace non-ASCII characters with a
question mark ('?').

For editing texts with double width characters (e.g. Japanese) further
adjustments will be needed.

Signed-off-by: Heinrich Schuchardt
---
Resent.
Original message https://edk2.groups.io/g/devel/message/51205
---
 .../Edit/FileBuffer.c                         | 21 +++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c b/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
index 5659ec981054..3923b83670fa 100644
--- a/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
+++ b/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
@@ -1360,6 +1360,8 @@ GetNewLine (
 /**
   Change a Unicode string to an ASCII string.
 
+  Non-ASCII characters are replaced by '?'.
+
   @param[in] UStr     The Unicode string.
   @param[in] Length   The maximum size of AStr.
   @param[out] AStr    ASCII string to pass out.
@@ -1378,8 +1380,12 @@ UnicodeToAscii (
   //
   // just buffer copy, not character copy
   //
-  for (Index = 0; Index < Length; Index++) {
-    *AStr++ = (CHAR8) *UStr++;
+  for (Index = 0; Index < Length; Index++, UStr++) {
+    if (*UStr < 0x80) {
+      *AStr++ = (CHAR8) *UStr;
+    } else {
+      *AStr++ = '?';
+    }
   }
 
   return Index;
@@ -2154,9 +2160,16 @@ FileBufferDoCharInput (
 
   default:
     //
-    // DEAL WITH ASCII CHAR, filter out thing like ctrl+f
+    // Do not add Unicode control characters to the file buffer:
     //
-    if (Char > 127 || Char < 32) {
+    // * U+0000-U+001f (C0 controls)
+    // * U+007f (DEL)
+    // * U+0080-U+009f (C1 controls)
+    //
+    // Do not add non-ASCII characters in ASCII mode.
+    //
+    if (Char < 0x20 || (Char >= 0x7f &&
+        (Char <= 0x9f || FileBuffer.FileType == FileTypeAscii))) {
       Status = StatusBarSetStatusString (L"Unknown Command");
     } else {
       Status = FileBufferAddChar (Char);
-- 
2.28.0
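
For readers who want to check the two conditions in this patch outside of
EDK II, here is a minimal standalone sketch in portable C. It is not the code
from FileBuffer.c. The names char16, file_type, accept_char and
unicode_to_ascii are hypothetical stand-ins chosen for this illustration
(for CHAR16, the editor's file type, the input filter in
FileBufferDoCharInput and UnicodeToAscii respectively).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EDK II types; not the real definitions. */
typedef uint16_t char16; /* plays the role of CHAR16 */
typedef enum { FILE_TYPE_ASCII, FILE_TYPE_UNICODE } file_type;

/*
 * Returns true when `c` may be added to the file buffer.
 * Same condition as the patched input filter, written as separate
 * early returns for readability.
 */
bool accept_char(char16 c, file_type type)
{
    if (c < 0x20) {
        return false; /* U+0000 - U+001F: C0 controls */
    }
    if (c >= 0x7f && c <= 0x9f) {
        return false; /* U+007F (DEL) and U+0080 - U+009F (C1 controls) */
    }
    if (c >= 0x7f && type == FILE_TYPE_ASCII) {
        return false; /* non-ASCII input is refused in ASCII mode */
    }
    return true;
}

/*
 * Copies `length` UCS-2 code units from `src` to `dst` as single bytes,
 * replacing every code unit >= 0x80 with '?'.
 * Returns the number of bytes written; `dst` is not NUL-terminated.
 */
size_t unicode_to_ascii(const char16 *src, size_t length, char *dst)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < length; i++) {
        dst[i] = (src[i] < 0x80) ? (char)src[i] : '?';
    }
    return i;
}
```

Under these assumptions, U+00E4 is accepted in Unicode mode and rejected in
ASCII mode, U+00A0 is the first accepted code point above the C1 block, and a
buffer holding 'A', U+00E4, 'B' is saved in ASCII mode as the three bytes
"A?B".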