From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tim Lewis
To: Michael Kinney, "edk2-devel@lists.01.org"
CC: Jaben Carsey, Kevin W Shaw
Date: Wed, 26 Apr 2017 16:15:22 +0000
Message-ID: <7236196A5DF6C040855A6D96F556A53F576347@msmail.insydesw.com.tw>
References: <1493168839-11708-1-git-send-email-michael.d.kinney@intel.com>
 <1493168839-11708-2-git-send-email-michael.d.kinney@intel.com>
In-Reply-To: <1493168839-11708-2-git-send-email-michael.d.kinney@intel.com>
Subject: Re: [edk2-UniSpecification PATCH] Allow .uni files on disk to be UTF-8 without a BOM

Mike --

This breaks our existing build tools, which assume that a file without a BOM is UTF-16.

Tim

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Michael Kinney
Sent: Tuesday, April 25, 2017 6:07 PM
To: edk2-devel@lists.01.org
Cc: Jaben Carsey; Kevin W Shaw
Subject: [edk2]
 [edk2-UniSpecification PATCH] Allow .uni files on disk to be UTF-8 without a BOM

https://bugzilla.tianocore.org/show_bug.cgi?id=507

Cc: Jaben Carsey
Cc: Yonghong Zhu
Cc: Kevin W Shaw
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Michael Kinney
---
 2_unicode_strings_file_format.md |  9 ++++++---
 README.md                        | 27 ++++++++++++++-------------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/2_unicode_strings_file_format.md b/2_unicode_strings_file_format.md
index 0150c85..7a4a019 100644
--- a/2_unicode_strings_file_format.md
+++ b/2_unicode_strings_file_format.md
@@ -33,7 +33,8 @@
 
 EDK II Unicode files are used for mapping token names to localized strings
 that are identified by an RFC4646 language code. The format for storing EDK II
-Unicode files is UTF-16LE. The character content must be UCS-2.
+Unicode files on disk is UTF-8 (without a BOM character) or UTF-16LE
+(with a BOM character). The character content must be UCS-2.
 
 Strings ends are determined by the first of the following items found:
 
@@ -44,11 +45,13 @@ Strings ends are determined by the first of the following items found:
 
 Comments may appear anywhere within the string file.
 
-All the files must begin with a Unicode BOM character.
+All UTF-16LE files must begin with a Unicode BOM character.
+All UTF-8 files must not begin with a Unicode BOM character.
 
 **********
 **NOTE:** Please make sure you select an editor that supports UCS-2 characters
-that can be stored in a UTF-16LE file.
+that can be stored in either a UTF-8 (without a BOM character) or a
+UTF-16LE file (with a BOM character).
 **********
 
 ## 2.1 Common EBNF
diff --git a/README.md b/README.md
index 63842a1..015aef1 100644
--- a/README.md
+++ b/README.md
@@ -77,16 +77,17 @@ Copyright (c) 2016-2017, Intel Corporation. All rights reserved.
 
 ### Revision History
 
-| Revision          | Description                                                                              | Date            |
-| ----------------- | ---------------------------------------------------------------------------------------- | --------------- |
-| 1.0               | Initial Release.                                                                         | February 2014   |
-| 1.1               | Updated EBNF to follow syntax specified in EBNF by the ANTLR project.                    | August 2014     |
-|                   | Added content related to EDK II Meta-Data Unicode files.                                 |                 |
-|                   | Restructured document.                                                                   |                 |
-|                   | Removed security and C format GUID definitions, not required for HII or other UNI files. |                 |
-|                   | Removed invalid escape code sequences.                                                   |                 |
-| 1.2               | Added optional font formatting                                                           | September 2014  |
-| 1.2 Errata A      | Correct misspelling of: `STR_PROPERTIES_MODULE_NAME`                                     | April 2015      |
-| 1.3               | Added: Syntax for non-ascii characters inside quoted strings.                            | March 2016      |
-|                   | Removed: Info on specific consumers (.INF & .DEC) removed.                               |                 |
-| 1.4               | Convert to GitBook format                                                                | March 2017      |
+| Revision          | Description                                                                                                            | Date            |
+| ----------------- | ---------------------------------------------------------------------------------------------------------------------- | --------------- |
+| 1.0               | Initial Release.                                                                                                       | February 2014   |
+| 1.1               | Updated EBNF to follow syntax specified in EBNF by the ANTLR project.                                                  | August 2014     |
+|                   | Added content related to EDK II Meta-Data Unicode files.                                                               |                 |
+|                   | Restructured document.                                                                                                 |                 |
+|                   | Removed security and C format GUID definitions, not required for HII or other UNI files.                               |                 |
+|                   | Removed invalid escape code sequences.                                                                                 |                 |
+| 1.2               | Added optional font formatting                                                                                         | September 2014  |
+| 1.2 Errata A      | Correct misspelling of: `STR_PROPERTIES_MODULE_NAME`                                                                   | April 2015      |
+| 1.3               | Added: Syntax for non-ascii characters inside quoted strings.                                                          | March 2016      |
+|                   | Removed: Info on specific consumers (.INF & .DEC) removed.                                                             |                 |
+| 1.4               | Convert to GitBook format                                                                                              | April 2017      |
+|                   | [#507](https://bugzilla.tianocore.org/show_bug.cgi?id=507) UNI Spec: Clarify that .uni files maybe UTF-8 without a BOM |                 |
-- 
2.6.3.windows.1

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
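
The on-disk rule proposed in the patch (UTF-16LE files begin with a BOM, UTF-8 files do not, and the content must be UCS-2) can be sketched in Python as below. This is a minimal illustration only, not code from EDK II BaseTools or from any existing build tool. The function names `detect_uni_encoding` and `decode_uni` are hypothetical, and rejecting a UTF-8 BOM outright is just one reading of "must not begin with a Unicode BOM character".

```python
import codecs


def detect_uni_encoding(data: bytes) -> str:
    """Classify raw .uni bytes as 'utf-16le' or 'utf-8' (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF16_LE):
        # UTF-16LE files are identified by their leading BOM (FF FE).
        return "utf-16le"
    if data.startswith(codecs.BOM_UTF8):
        # Assumption: a UTF-8 BOM is treated as an error rather than skipped.
        raise ValueError("UTF-8 .uni files must not begin with a BOM")
    # No BOM: the proposed wording treats the file as UTF-8.
    data.decode("utf-8")  # raises UnicodeDecodeError if the bytes are not UTF-8
    return "utf-8"


def decode_uni(data: bytes) -> str:
    """Decode .uni bytes and enforce UCS-2 content (hypothetical helper)."""
    if detect_uni_encoding(data) == "utf-16le":
        text = data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    else:
        text = data.decode("utf-8")
    # UCS-2 cannot represent code points above U+FFFF.
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError(f"U+{ord(ch):04X} is outside UCS-2")
    return text
```

One consequence worth noting: a BOM-less UTF-16LE file that contains only ASCII text is also valid UTF-8, because every character is simply followed by a NUL byte. A reader following this sketch would accept such a file and return text with embedded NULs instead of reporting an error, which is in line with the compatibility concern raised in the reply for tools that treat a BOM-less file as UTF-16.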