From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mout02.posteo.de (mout02.posteo.de [185.67.36.66]) by mx.groups.io with SMTP id smtpd.web11.15564.1676739858665501756 for ; Sat, 18 Feb 2023 09:04:19 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@posteo.de header.s=2017 header.b=ViacrOyE; spf=pass (domain: posteo.de, ip: 185.67.36.66, mailfrom: mhaeuser@posteo.de) Received: from submission (posteo.de [185.67.36.169]) by mout02.posteo.de (Postfix) with ESMTPS id 3C964240407 for ; Sat, 18 Feb 2023 18:04:15 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1676739856; bh=T/bvI+hLbSzA2iySo2NoHxZBuKIrI5dhRVvNjLS4Fr0=; h=From:Subject:Date:Cc:To:From; b=ViacrOyEpY9t805m8yc84weQ+aWtpCR/CPnvtAjmfuD6f0+TY8HYyrkEtSJ6S/5Of z33CKxYbTGPlpVCZEYtbbYUrBi8ZJi/8n02J3jd3R7fzL8vcIVynZBYVBasrGkU0SJ z/sBtBDystUxOgzT3xVNlEjHttaNsH7bUoppf8ssYRAScXnQA37O5UaVCCDFnRbYuI hjfxEcObodO1QF81wzjm7tFpf2Oz35f6MKls599GQ3uYl8BNBBaTH91SYac1tNRBd7 J6t/8q7VhtxFpWZX3gacXcH37gWvFjNptxaZdIWn/tnAZ5G/BelmRI792/SWnktN0w jkijQWYy0wmqA== Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4PJw5C28Xhz9rxD; Sat, 18 Feb 2023 18:04:15 +0100 (CET) From: =?UTF-8?B?TWFydmluIEjDpHVzZXI=?= Message-Id: <4EB4450D-5FBE-4303-92A1-AA8E825D659C@posteo.de> Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.400.51.1.1\)) Subject: Re: [PATCH 1/1] Ext4Pkg: Move unicode collation initialization to Start() Date: Sat, 18 Feb 2023 17:04:04 +0000 In-Reply-To: Cc: edk2-devel-groups-io , Ard Biesheuvel To: Pedro Falcato References: <20230217201427.139581-1-pedro.falcato@gmail.com> Content-Type: multipart/alternative; boundary="Apple-Mail=_E341B60F-36F6-4D41-8169-58D14D9910AC" --Apple-Mail=_E341B60F-36F6-4D41-8169-58D14D9910AC Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii > On 18. Feb 2023, at 16:14, Pedro Falcato = wrote: >=20 > Proper collation (and proper unicode handling in general) is AIUI > pretty hard and involves a lot of tables, etc. > So it makes total sense to me why one wants to "dynamically link" this > in using protocols instead of statically linking it everywhere. >=20 > Should be noted that EnglishDxe makes a mockery out of unicode > collation and does plain ASCII operations only :)) Yes, and so does FatPei [1]. Also, EnglishDxe has a clear (and = unvalidated!) assumption that FAT file names are strictly 7-bit ASCII = [2]. Without doing any proper research, Wikipedia seems to suggest that = even EASCII could be problematic for FAT file names. Under the = assumption that FAT will only ever give us 7-bit ASCII file names, it's = both unrealistic to assume someone will implement proper Unicode = collation, but also to fear any observable deviations from FatDxe even = if this was the case and Ext4Dxe used ASCII-only collation (because the = extended Unicode collation support would be no-op w.r.t. FAT). The = ASCII-only support is pretty tiny, it could live in BaseLib without a = problem (and not be duplicated into FatPkg...). In my opinion, merge this to fix the bug and at some later point change = this to static linking (maybe alongside FatDxe, but definitely = deduplicate FatPei). [1] = https://github.com/tianocore/edk2/tree/02fcfdce1e5ce86f1951191883e7e30de5a= a08be/FatPei/FatLiteLib.c#L341-L365 = [2] = https://github.com/tianocore/edk2/tree/02fcfdce1e5ce86f1951191883e7e30de5a= a08be/MdeModulePkg/Universal/Disk/UnicodeCollation/EnglishDxe/UnicodeColla= tionEng.c#L386-L406 = > I did think about that, but then decided against it since I felt that > initializing multiple times would make little logical sense. I get it, but moving it inside would be more consistent with similar = code and remove the risk for forgetting the check if it was ever used = elsewhere (I know, it probably won't be, but this is a design question). Best regards, Marvin= --Apple-Mail=_E341B60F-36F6-4D41-8169-58D14D9910AC Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii
On 18. Feb = 2023, at 16:14, Pedro Falcato <pedro.falcato@gmail.com> = wrote:

Proper collation (and = proper unicode handling in general) is AIUI
pretty hard and involves a lot of tables, = etc.
So it makes total sense = to me why one wants to "dynamically link" this
in using protocols instead of statically = linking it everywhere.

Should = be noted that EnglishDxe makes a mockery out of unicode
collation and does plain ASCII operations = only :))

Yes, and so does FatPei = [1]. Also, EnglishDxe has a clear (and unvalidated!) assumption that FAT = file names are strictly 7-bit ASCII [2]. Without doing any proper = research, Wikipedia seems to suggest that even EASCII could be = problematic for FAT file names. Under the assumption that FAT will only = ever give us 7-bit ASCII file names, it's both unrealistic to assume = someone will implement proper Unicode collation, but also to fear any = observable deviations from FatDxe even if this was the case and Ext4Dxe = used ASCII-only collation (because the extended Unicode collation = support would be no-op w.r.t. FAT). The ASCII-only support is pretty = tiny, it could live in BaseLib without a problem (and not be duplicated = into FatPkg...).

In my opinion, merge this to = fix the bug and at some later point change this to static linking (maybe = alongside FatDxe, but definitely deduplicate = FatPei).


<= div>[2] https://github.com/tianocore/edk2/tree/02fcfdc= e1e5ce86f1951191883e7e30de5aa08be/MdeModulePkg/Univ= ersal/Disk/UnicodeCollation/EnglishDxe/UnicodeCollationEng.c#L386-L406=

I did = think about that, but then decided against it since I felt = that
initializing multiple = times would make little logical sense.

I get it, = but moving it inside would be more consistent with similar code and = remove the risk for forgetting the check if it was ever used elsewhere = (I know, it probably won't be, but this is a design = question).

Best = regards,
Marvin
= --Apple-Mail=_E341B60F-36F6-4D41-8169-58D14D9910AC--