public inbox for devel@edk2.groups.io
From: "Bob Feng" <bob.c.feng@intel.com>
To: Joe Richey <joerichey@google.com>,
	"devel@edk2.groups.io" <devel@edk2.groups.io>
Cc: "Gao, Liming" <liming.gao@intel.com>,
	"Zhu, Yonghong" <yonghong.zhu@intel.com>
Subject: Re: [PATCH] BaseTools: VfrCompile/Pccts: Fix invalid bytes
Date: Mon, 13 May 2019 08:16:53 +0000
Message-ID: <08650203BA1BD64D8AD9B6D5D74A85D16010CBD2@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <20190511042401.115133-1-joerichey@google.com>

Hi Joe,

Please file a Bugzilla ticket (https://bugzilla.tianocore.org/) and add the BZ link to the commit message.

This patch looks good to me.

Thanks,
Bob

-----Original Message-----
From: Joe Richey [mailto:joerichey@google.com] 
Sent: Saturday, May 11, 2019 12:24 PM
To: devel@edk2.groups.io
Cc: Feng, Bob C <bob.c.feng@intel.com>; Gao, Liming <liming.gao@intel.com>; Zhu, Yonghong <yonghong.zhu@intel.com>
Subject: [PATCH] BaseTools: VfrCompile/Pccts: Fix invalid bytes

Three text files contain invalid ASCII bytes. This can break tooling that tries to operate on the repository, because such tooling may accidentally classify the files as binary data.

https://github.com/josephlr/edk2/tree/format

Cc: Bob Feng <bob.c.feng@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Yonghong Zhu <yonghong.zhu@intel.com>
Signed-off-by: Joe Richey <joerichey@google.com>
---
 BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt |  2 +-
 BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt   | 78 ++++++++++----------
 BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt       |  6 +-
 3 files changed, 43 insertions(+), 43 deletions(-)
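
For anyone who wants to reproduce this kind of cleanup, here is a minimal sketch in Python. It is not necessarily how this patch was produced. It assumes the offending bytes are nroff-style overstrike pairs (a character followed by a backspace, shown as `_\b` in the hunks below) plus stray control bytes such as `\x13`. The function name strip_invalid_bytes is hypothetical.

```python
import re

# Assumption: man-page dumps encode underline as "_\bX" and bold as "X\bX".
# Dropping every "<char><backspace>" pair leaves only the visible character.
_OVERSTRIKE = re.compile(rb".\x08")

# Assumption: any remaining ASCII control byte other than tab, newline,
# and carriage return is unwanted in these text files.
_CONTROL = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_invalid_bytes(data: bytes) -> bytes:
    """Return data with overstrike pairs and stray control bytes removed."""
    data = _OVERSTRIKE.sub(b"", data)
    return _CONTROL.sub(b"", data)


if __name__ == "__main__":
    sample = b"antlr [_\x08o_\x08p_\x08t_\x08i_\x08o_\x08n_\x08s]\n"
    print(strip_invalid_bytes(sample).decode("ascii"), end="")
```

Working on bytes rather than decoded text avoids having to guess an encoding for files that tools already treat as binary.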

diff --git a/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt b/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt
index 539cf775257b..f073e620ab68 100644
--- a/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt
+++ b/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt
@@ -40,7 +40,7 @@
     An bug (or at least an oddity) is that a reference to LT(1), LA(1),
     or LATEXT(1) in an action which immediately follows a token match
     in a rule refers to the token matched, not the token which is in
-    the lookahead buffer.  Consider:\x13
+    the lookahead buffer.  Consider:
 
         r : abc <<action alpha>> D <<action beta>> E;
 
diff --git a/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt b/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt
index 4a7d22e7f239..140b064217b7 100644
--- a/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt
+++ b/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt
@@ -9,48 +9,48 @@ NAME
      antlr - ANother Tool for Language Recognition
 
 SYNTAX
-     antlr [_\bo_\bp_\bt_\bi_\bo_\bn_\bs] _\bg_\br_\ba_\bm_\bm_\ba_\br__\bf_\bi_\bl_\be_\bs
+     antlr [options] grammar_files
 
 DESCRIPTION
-     _\bA_\bn_\bt_\bl_\br converts an extended form of context-free grammar into
+     Antlr converts an extended form of context-free grammar into
      a set of C functions which directly implement an efficient
      form of deterministic recursive-descent LL(k) parser.
      Context-free grammars may be augmented with predicates to
      allow semantics to influence parsing; this allows a form of
      context-sensitive parsing.  Selective backtracking is also
      available to handle non-LL(k) and even non-LALR(k) con-
-     structs.  _\bA_\bn_\bt_\bl_\br also produces a definition of a lexer which
+     structs.  Antlr also produces a definition of a lexer which
      can be automatically converted into C code for a DFA-based
-     lexer by _\bd_\bl_\bg.  Hence, _\ba_\bn_\bt_\bl_\br serves a function much like that
-     of _\by_\ba_\bc_\bc, however, it is notably more flexible and is more
-     integrated with a lexer generator (_\ba_\bn_\bt_\bl_\br directly generates
-     _\bd_\bl_\bg code, whereas _\by_\ba_\bc_\bc and _\bl_\be_\bx are given independent
-     descriptions).  Unlike _\by_\ba_\bc_\bc which accepts LALR(1) grammars,
-     _\ba_\bn_\bt_\bl_\br accepts LL(k) grammars in an extended BNF notation -
+     lexer by dlg.  Hence, antlr serves a function much like that
+     of yacc, however, it is notably more flexible and is more
+     integrated with a lexer generator (antlr directly generates
+     dlg code, whereas yacc and lex are given independent
+     descriptions).  Unlike yacc which accepts LALR(1) grammars,
+     antlr accepts LL(k) grammars in an extended BNF notation -
      which eliminates the need for precedence rules.
 
-     Like _\by_\ba_\bc_\bc grammars, _\ba_\bn_\bt_\bl_\br grammars can use automatically-
+     Like yacc grammars, antlr grammars can use automatically-
      maintained symbol attribute values referenced as dollar
-     variables.  Further, because _\ba_\bn_\bt_\bl_\br generates top-down
+     variables.  Further, because antlr generates top-down
      parsers, arbitrary values may be inherited from parent rules
-     (passed like function parameters).  _\bA_\bn_\bt_\bl_\br also has a mechan-
+     (passed like function parameters).  Antlr also has a mechan-
      ism for creating and manipulating abstract-syntax-trees.
 
-     There are various other niceties in _\ba_\bn_\bt_\bl_\br, including the
+     There are various other niceties in antlr, including the
      ability to spread one grammar over multiple files or even
      multiple grammars in a single file, the ability to generate
      a version of the grammar with actions stripped out (for
      documentation purposes), and lots more.
 
 OPTIONS
-     -ck _\bn
-          Use up to _\bn symbols of lookahead when using compressed
+     -ck n
+          Use up to n symbols of lookahead when using compressed
           (linear approximation) lookahead.  This type of looka-
           head is very cheap to compute and is attempted before
           full LL(k) lookahead, which is of exponential complex-
           ity in the worst case.  In general, the compressed loo-
-          kahead can be much deeper (e.g, -ck 10) _\bt_\bh_\ba_\bn _\bt_\bh_\be _\bf_\bu_\bl_\bl
-          _\bl_\bo_\bo_\bk_\ba_\bh_\be_\ba_\bd (_\bw_\bh_\bi_\bc_\bh _\bu_\bs_\bu_\ba_\bl_\bl_\by _\bm_\bu_\bs_\bt _\bb_\be _\bl_\be_\bs_\bs _\bt_\bh_\ba_\bn _\b4).
+          kahead can be much deeper (e.g, -ck 10) than the full
+          lookahead (which usually must be less than 4).
 
      -CC  Generate C++ output from both ANTLR and DLG.
 
@@ -86,20 +86,20 @@ OPTIONS
 
      -ga  Generate ANSI-compatible code (default case).  This has
           not been rigorously tested to be ANSI XJ11 C compliant,
-          but it is close.  The normal output of _\ba_\bn_\bt_\bl_\br is
+          but it is close.  The normal output of antlr is
           currently compilable under both K&R, ANSI C, and C++-
-          this option does nothing because _\ba_\bn_\bt_\bl_\br generates a
+          this option does nothing because antlr generates a
           bunch of #ifdef's to do the right thing depending on
           the language.
 
-     -gc  Indicates that _\ba_\bn_\bt_\bl_\br should generate no C code, i.e.,
+     -gc  Indicates that antlr should generate no C code, i.e.,
           only perform analysis on the grammar.
 
-     -gd  C code is inserted in each of the _\ba_\bn_\bt_\bl_\br generated pars-
+     -gd  C code is inserted in each of the antlr generated pars-
           ing functions to provide for user-defined handling of a
           detailed parse trace.  The inserted code consists of
           calls to the user-supplied macros or functions called
-          zzTRACEIN and zzTRACEOUT.  The only argument is a _\bc_\bh_\ba_\br
+          zzTRACEIN and zzTRACEOUT.  The only argument is a char
           * pointing to a C-style string which is the grammar
           rule recognized by the current parsing function.  If no
           definition is given for the trace functions, upon rule
@@ -110,17 +110,17 @@ OPTIONS
 
      -gh  Generate stdpccts.h for non-ANTLR-generated files to
           include.  This file contains all defines needed to
-          describe the type of parser generated by _\ba_\bn_\bt_\bl_\br (e.g.
+          describe the type of parser generated by antlr (e.g.
           how much lookahead is used and whether or not trees are
           constructed) and contains the header action specified
           by the user.
 
      -gk  Generate parsers that delay lookahead fetches until
-          needed.  Without this option, _\ba_\bn_\bt_\bl_\br generates parsers
-          which always have _\bk tokens of lookahead available.
+          needed.  Without this option, antlr generates parsers
+          which always have k tokens of lookahead available.
 
      -gl  Generate line info about grammar actions in C parser of
-          the form # _\bl_\bi_\bn_\be "_\bf_\bi_\bl_\be" which makes error messages from
+          the form # line "file" which makes error messages from
           the C/C++ compiler make more sense as they will point
           into the grammar file not the resulting C file.
           Debugging is easier as well, because you will step
@@ -128,18 +128,18 @@ OPTIONS
 
      -gs  Do not generate sets for token expression lists;
           instead generate a ||-separated sequence of
-          LA(1)==_\bt_\bo_\bk_\be_\bn__\bn_\bu_\bm_\bb_\be_\br.  The default is to generate sets.
+          LA(1)==token_number.  The default is to generate sets.
 
      -gt  Generate code for Abstract-Syntax Trees.
 
      -gx  Do not create the lexical analyzer files (dlg-related).
           This option should be given when the user wishes to
           provide a customized lexical analyzer.  It may also be
-          used in _\bm_\ba_\bk_\be scripts to cause only the parser to be
+          used in make scripts to cause only the parser to be
           rebuilt when a change not affecting the lexical struc-
           ture is made to the input grammars.
 
-     -k _\bn Set k of LL(k) to _\bn; i.e. set tokens of look-ahead
+     -k n Set k of LL(k) to n; i.e. set tokens of look-ahead
           (default==1).
 
      -o dir
@@ -171,9 +171,9 @@ OPTIONS
           release with option -pr on.  Context computation is off
           by default.
 
-     -rl _\bn
+     -rl n
           Limit the maximum number of tree nodes used by grammar
-          analysis to _\bn.  Occasionally, _\ba_\bn_\bt_\bl_\br is unable to
+          analysis to n.  Occasionally, antlr is unable to
           analyze a grammar submitted by the user.  This rare
           situation can only occur when the grammar is large and
           the amount of lookahead is greater than one.  A non-
@@ -184,14 +184,14 @@ OPTIONS
           the number of calls to the full LL(k) algorithm.  An
           error message will be displayed, if this limit is
           reached, which indicates the grammar construct being
-          analyzed when _\ba_\bn_\bt_\bl_\br hit a non-linearity.  Use this
-          option if _\ba_\bn_\bt_\bl_\br seems to go out to lunch and your disk
-          start thrashing; try _\bn=10000 to start.  Once the
+          analyzed when antlr hit a non-linearity.  Use this
+          option if antlr seems to go out to lunch and your disk
+          start thrashing; try n=10000 to start.  Once the
           offending construct has been identified, try to remove
-          the ambiguity that _\ba_\bn_\bt_\bl_\br was trying to overcome with
+          the ambiguity that antlr was trying to overcome with
           large lookahead analysis.  The introduction of (...)?
           backtracking blocks eliminates some of these problems -
-          _\ba_\bn_\bt_\bl_\br does not analyze alternatives that begin with
+          antlr does not analyze alternatives that begin with
           (...)? (it simply backtracks, if necessary, at run
           time).
 
@@ -208,7 +208,7 @@ OPTIONS
           as the parser file.
 
 SPECIAL CONSIDERATIONS
-     _\bA_\bn_\bt_\bl_\br works...  we think.  There is no implicit guarantee of
+     Antlr works...  we think.  There is no implicit guarantee of
      anything.  We reserve no legal rights to the software known
      as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
      is in the public domain.  An individual or company may do
@@ -234,7 +234,7 @@ FILES
           output C++ parser when C++ mode is used.
 
      parser.dlg
-          output _\bd_\bl_\bg lexical analyzer.
+          output dlg lexical analyzer.
 
      err.c
           token string array, error sets and error support rou-
@@ -251,7 +251,7 @@ FILES
           erated by default.  Not used in C++ mode.
 
      tokens.h
-          output #_\bd_\be_\bf_\bi_\bn_\be_\bs for tokens used and function prototypes
+          output #defines for tokens used and function prototypes
           for functions generated for rules.
 
 
diff --git a/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt b/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt
index 06b320de2abb..5ea5e933c808 100644
--- a/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt
+++ b/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt
@@ -9,14 +9,14 @@ NAME
      dlg - DFA Lexical Analyzer Generator
 
 SYNTAX
-     dlg [_\bo_\bp_\bt_\bi_\bo_\bn_\bs] _\bl_\be_\bx_\bi_\bc_\ba_\bl__\bs_\bp_\be_\bc [_\bo_\bu_\bt_\bp_\bu_\bt__\bf_\bi_\bl_\be]
+     dlg [options] lexical_spec [output_file]
 
 DESCRIPTION
      dlg is a tool that produces fast deterministic finite auto-
      mata for recognizing regular expressions in input.
 
 OPTIONS
-     -CC  Generate C++ output.  The _\bo_\bu_\bt_\bp_\bu_\bt__\bf_\bi_\bl_\be is not specified
+     -CC  Generate C++ output.  The output_file is not specified
           in this case.
 
      -C[ level]
@@ -69,7 +69,7 @@ OPTIONS
           in or send output to standard out.
 
 SPECIAL CONSIDERATIONS
-     _\bD_\bl_\bg works...  we think.  There is no implicit guarantee of
+     Dlg works...  we think.  There is no implicit guarantee of
      anything.  We reserve no legal rights to the software known
      as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
      is in the public domain.  An individual or company may do
--
2.21.0.1020.gf2820cf01a-goog

