From: Tim Lewis <tim.lewis@insyde.com>
To: Michael Kinney <michael.d.kinney@intel.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Cc: Jaben Carsey <jaben.carsey@intel.com>,
	Kevin W Shaw <kevin.w.shaw@intel.com>
Subject: Re: [edk2-UniSpecification PATCH] Allow .uni files on disk to be UTF-8 without a BOM
Date: Wed, 26 Apr 2017 16:15:22 +0000
Message-ID: <7236196A5DF6C040855A6D96F556A53F576347@msmail.insydesw.com.tw>
In-Reply-To: <1493168839-11708-2-git-send-email-michael.d.kinney@intel.com>

Mike --

This breaks our existing build tools, which assume that a file without a BOM is UTF-16. 
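
A minimal sketch of the conflict, for illustration only. Both function names are hypothetical and neither is taken from any actual build tool; the first follows the rule proposed in the patch, the second follows the assumption described above (little-endian is assumed for the BOM-less case):

```python
import codecs


def decode_uni_per_patch(data: bytes) -> str:
    """Rule proposed in the patch: UTF-16LE with a BOM, or UTF-8 without one."""
    if data.startswith(codecs.BOM_UTF16_LE):
        # Strip the 2-byte BOM, then decode the rest as UTF-16LE.
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF8):
        # The patch text says UTF-8 files must not begin with a BOM.
        raise ValueError("UTF-8 .uni files must not begin with a BOM")
    return data.decode("utf-8")


def decode_uni_legacy(data: bytes) -> str:
    """Assumption described in this reply: a file without a BOM is UTF-16."""
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    # Hypothetical detail: little-endian is assumed when no BOM is present.
    return data.decode("utf-16-le")
```

Under these assumptions, a BOM-less UTF-8 file decodes correctly with the first function but is misread by the second (or fails with a decode error when the byte count is odd), while a UTF-16LE file with a BOM decodes the same way under both.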

Tim

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Michael Kinney
Sent: Tuesday, April 25, 2017 6:07 PM
To: edk2-devel@lists.01.org
Cc: Jaben Carsey <jaben.carsey@intel.com>; Kevin W Shaw <kevin.w.shaw@intel.com>
Subject: [edk2] [edk2-UniSpecification PATCH] Allow .uni files on disk to be UTF-8 without a BOM

https://bugzilla.tianocore.org/show_bug.cgi?id=507

Cc: Jaben Carsey <jaben.carsey@intel.com>
Cc: Yonghong Zhu <yonghong.zhu@intel.com>
Cc: Kevin W Shaw <kevin.w.shaw@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Michael Kinney <michael.d.kinney@intel.com>
---
 2_unicode_strings_file_format.md |  9 ++++++---
 README.md                        | 27 ++++++++++++++-------------
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/2_unicode_strings_file_format.md b/2_unicode_strings_file_format.md
index 0150c85..7a4a019 100644
--- a/2_unicode_strings_file_format.md
+++ b/2_unicode_strings_file_format.md
@@ -33,7 +33,8 @@
 
 EDK II Unicode files are used for mapping token names to localized strings that
 are identified by an RFC4646 language code. The format for storing EDK II
-Unicode files is UTF-16LE. The character content must be UCS-2.
+Unicode files on disk is UTF-8 (without a BOM character) or UTF-16LE 
+(with a BOM character). The character content must be UCS-2.
 
 Strings ends are determined by the first of the following items found:
 
@@ -44,11 +45,13 @@ Strings ends are determined by the first of the following items found:
 
 Comments may appear anywhere within the string file.
 
-All the files must begin with a Unicode BOM character.
+All UTF-16LE files must begin with a Unicode BOM character.
+All UTF-8 files must not begin with a Unicode BOM character.
 
 **********
 **NOTE:** Please make sure you select an editor that supports UCS-2 characters
-that can be stored in a UTF-16LE file.
+that can be stored in either a UTF-8 (without a BOM character) or a 
+UTF-16LE file (with a BOM character).
 **********
 
 ## 2.1 Common EBNF
diff --git a/README.md b/README.md
index 63842a1..015aef1 100644
--- a/README.md
+++ b/README.md
@@ -77,16 +77,17 @@ Copyright (c) 2016-2017, Intel Corporation. All rights reserved.
 
 ### Revision History
 
-| Revision          | Description                                                                              | Date            |
-| ----------------- | ---------------------------------------------------------------------------------------- | --------------- |
-| 1.0               | Initial Release.                                                                         | February 2014   |
-| 1.1               | Updated EBNF to follow syntax specified in EBNF by the ANTLR project.                    | August 2014     |
-|                   | Added content related to EDK II Meta-Data Unicode files.                                 |                 |
-|                   | Restructured document.                                                                   |                 |
-|                   | Removed security and C format GUID definitions, not required for HII or other UNI files. |                 |
-|                   | Removed invalid escape code sequences.                                                   |                 |
-| 1.2               | Added optional font formatting                                                           | September 2014  |
-| 1.2 Errata A      | Correct misspelling of: `STR_PROPERTIES_MODULE_NAME`                                     | April 2015      |
-| 1.3               | Added: Syntax for non-ascii characters inside quoted strings.                            | March 2016      |
-|                   | Removed: Info on specific consumers (.INF & .DEC) removed.                               |                 |
-| 1.4               | Convert to GitBook format                                                                | March 2017      |
+| Revision          | Description                                                                                                            | Date            |
+| ----------------- | ---------------------------------------------------------------------------------------------------------------------- | --------------- |
+| 1.0               | Initial Release.                                                                                                       | February 2014   |
+| 1.1               | Updated EBNF to follow syntax specified in EBNF by the ANTLR project.                                                  | August 2014     |
+|                   | Added content related to EDK II Meta-Data Unicode files.                                                               |                 |
+|                   | Restructured document.                                                                                                 |                 |
+|                   | Removed security and C format GUID definitions, not required for HII or other UNI files.                               |                 |
+|                   | Removed invalid escape code sequences.                                                                                 |                 |
+| 1.2               | Added optional font formatting                                                                                         | September 2014  |
+| 1.2 Errata A      | Correct misspelling of: `STR_PROPERTIES_MODULE_NAME`                                                                   | April 2015      |
+| 1.3               | Added: Syntax for non-ascii characters inside quoted strings.                                                          | March 2016      |
+|                   | Removed: Info on specific consumers (.INF & .DEC) removed.                                                             |                 |
+| 1.4               | Convert to GitBook format                                                                                              | April 2017      |
+|                   | [#507](https://bugzilla.tianocore.org/show_bug.cgi?id=507) UNI Spec: Clarify that .uni files maybe UTF-8 without a BOM |                 |
--
2.6.3.windows.1



