Date: Tue, 3 Nov 2020 23:42:26 +0000
From: "Leif Lindholm"
To: Heinrich Schuchardt
Cc: Ray Ni, Zhichao Gao, Ard Biesheuvel, devel@edk2.groups.io, Laszlo Ersek, Liming Gao
Subject: Re: [edk] [PATCH] ShellPkg/edit: allow non-ASCII characters in edit
Message-ID: <20201103234226.GR1664@vanye>
References: <20201102231114.31099-1-xypron.glpk@gmx.de>
In-Reply-To: <20201102231114.31099-1-xypron.glpk@gmx.de>

Ray, Zhichao - please comment.

I'm not sure what the correct answer is here, but current behaviour
seems suboptimal.

Best Regards,

Leif

On Tue, Nov 03, 2020 at 00:11:14 +0100, Heinrich Schuchardt wrote:
> REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2339
>
> Currently it is not possible to add letters outside the range
> U+0020 - U+007F to a file using the edit command.
>
> In Unicode the following are control characters:
>
> * U+0000—U+001F (C0 controls)
> * U+007F (DEL)
> * U+0080—U+009F (C1 controls).
>
> For reference see:
>
> * https://unicode.org/charts/PDF/U0000.pdf
> * https://unicode.org/charts/PDF/U0080.pdf
>
> So the characters we should exclude from the file buffer are:
> U+0000 - U+001f, U+007f - U+009F
>
> Allow all other characters as input to the file buffer in Unicode mode.
> Allow only ASCII characters as input in ASCII mode.
>
> When saving a file in ASCII mode replace non-ASCII characters by a question
> mark ('?').
>
> For editing texts with double width characters (e.g. Japanese) further
> adjustments will be needed.
>
> Signed-off-by: Heinrich Schuchardt
> ---
> Resent.
> Original message https://edk2.groups.io/g/devel/message/51205
> ---
>  .../Edit/FileBuffer.c | 21 +++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c b/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
> index 5659ec981054..3923b83670fa 100644
> --- a/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
> +++ b/ShellPkg/Library/UefiShellDebug1CommandsLib/Edit/FileBuffer.c
> @@ -1360,6 +1360,8 @@ GetNewLine (
>  /**
>    Change a Unicode string to an ASCII string.
>
> +  Non-ASCII characters are replaced by '?'.
> +
>    @param[in]  UStr     The Unicode string.
>    @param[in]  Length   The maximum size of AStr.
>    @param[out] AStr     ASCII string to pass out.
> @@ -1378,8 +1380,12 @@ UnicodeToAscii (
>    //
>    // just buffer copy, not character copy
>    //
> -  for (Index = 0; Index < Length; Index++) {
> -    *AStr++ = (CHAR8) *UStr++;
> +  for (Index = 0; Index < Length; Index++, UStr++) {
> +    if (*UStr < 0x80) {
> +      *AStr++ = (CHAR8) *UStr;
> +    } else {
> +      *AStr++ = '?';
> +    }
>    }
>
>    return Index;
> @@ -2154,9 +2160,16 @@ FileBufferDoCharInput (
>
>    default:
>      //
> -    // DEAL WITH ASCII CHAR, filter out thing like ctrl+f
> +    // Do not add Unicode control characters to the file buffer:
>      //
> -    if (Char > 127 || Char < 32) {
> +    // * U+0000-U+001f (C0 controls)
> +    // * U+007f (DEL)
> +    // * U+0080-U+009f (C1 controls)
> +    //
> +    // Do not add non-ASCII characters in ASCII mode.
> +    //
> +    if (Char < 0x20 || (Char >= 0x7f &&
> +        (Char <= 0x9f || FileBuffer.FileType == FileTypeAscii))) {
>        Status = StatusBarSetStatusString (L"Unknown Command");
>      } else {
>        Status = FileBufferAddChar (Char);
> --
> 2.28.0
>
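
For anyone who wants to poke at the proposed behaviour outside the shell, the two rules in the quoted patch (which characters may be inserted, and how non-ASCII characters are written out in ASCII mode) can be sketched as standalone C. This is only an illustration under stated assumptions: `char16`, `is_insertable`, `unicode_to_ascii` and `self_check` are hypothetical names made up for this sketch, not EDK2 identifiers, and a plain `bool` stands in for the `FileBuffer.FileType == FileTypeAscii` check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for EDK2's CHAR16 (a 16-bit code unit). */
typedef uint16_t char16;

/*
 * Returns true if the character may be added to the file buffer.
 * This is the logical inverse of the rejection test in the patch:
 *   Char < 0x20 || (Char >= 0x7f && (Char <= 0x9f || ascii_mode))
 */
bool is_insertable(char16 ch, bool ascii_mode)
{
    if (ch < 0x20) {
        return false;               /* U+0000-U+001F: C0 controls */
    }
    if (ch >= 0x7f && ch <= 0x9f) {
        return false;               /* U+007F (DEL) and C1 controls */
    }
    if (ascii_mode && ch > 0x9f) {
        return false;               /* non-ASCII is refused in ASCII mode */
    }
    return true;
}

/*
 * Copies length code units from src to dst, replacing every value
 * of 0x80 or above with '?'. Returns the number of bytes written.
 * dst must have room for at least length bytes; no terminator is added.
 */
size_t unicode_to_ascii(const char16 *src, size_t length, char *dst)
{
    size_t index;

    for (index = 0; index < length; index++) {
        dst[index] = (src[index] < 0x80) ? (char)src[index] : '?';
    }
    return index;
}

/* Small usage example covering the boundary cases named in the patch. */
void self_check(void)
{
    const char16 in[3] = { 'a', 0x00e9, 'b' };   /* 'a', U+00E9, 'b' */
    char out[3];

    assert(is_insertable(0x0041, true));     /* 'A' in ASCII mode */
    assert(!is_insertable(0x0006, false));   /* Ctrl+F, a C0 control */
    assert(!is_insertable(0x007f, false));   /* DEL */
    assert(!is_insertable(0x0085, false));   /* a C1 control */
    assert(is_insertable(0x00e9, false));    /* U+00E9 in Unicode mode */
    assert(!is_insertable(0x00e9, true));    /* U+00E9 in ASCII mode */

    assert(unicode_to_ascii(in, 3, out) == 3);
    assert(out[0] == 'a' && out[1] == '?' && out[2] == 'b');
}
```

Note that U+00A0 and above are accepted in Unicode mode, so the sketch, like the patch, does nothing special for double-width characters.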