From: "Joe Richey" <joerichey@google.com>
To: devel@edk2.groups.io
Cc: Bob Feng <bob.c.feng@intel.com>,
Liming Gao <liming.gao@intel.com>,
Yonghong Zhu <yonghong.zhu@intel.com>
Subject: [PATCH] BaseTools: VfrCompile/Pccts: Fix invalid bytes
Date: Fri, 10 May 2019 21:24:01 -0700
Message-ID: <20190511042401.115133-1-joerichey@google.com>
Three text files contain invalid ASCII bytes. This can confuse tooling
that tries to operate on the repository, because such tools may
accidentally classify the files as binary data.
https://github.com/josephlr/edk2/tree/format
Cc: Bob Feng <bob.c.feng@intel.com>
Cc: Liming Gao <liming.gao@intel.com>
Cc: Yonghong Zhu <yonghong.zhu@intel.com>
Signed-off-by: Joe Richey <joerichey@google.com>
---
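For reviewers who want to check the result, here is a minimal sketch of
the cleanup, assuming the offending bytes are only the underscore-plus-
backspace overstrike pairs and the stray 0x13 visible in the diff below.
The helper names strip_overstrike and control_bytes are hypothetical and
are not part of this patch or of BaseTools.

```python
import re

# One overstrike pair: an underscore, a backspace (0x08), then the
# character that was meant to appear underlined.
OVERSTRIKE = re.compile(rb"_\x08(.)", re.DOTALL)


def strip_overstrike(data: bytes) -> bytes:
    """Replace every '_<backspace><char>' pair with just <char>."""
    return OVERSTRIKE.sub(rb"\1", data)


def control_bytes(data: bytes) -> list:
    """Return offsets of bytes that are not printable ASCII, tab, LF or CR."""
    allowed = {0x09, 0x0A, 0x0D}
    return [i for i, b in enumerate(data)
            if not (0x20 <= b <= 0x7E or b in allowed)]


# The SYNTAX line from antlr1.txt as it appears before this patch.
sample = (b"antlr [_\x08o_\x08p_\x08t_\x08i_\x08o_\x08n_\x08s] "
          b"_\x08g_\x08r_\x08a_\x08m_\x08m_\x08a_\x08r__\x08f_\x08i_\x08l_\x08e_\x08s")
cleaned = strip_overstrike(sample)
print(cleaned)  # b'antlr [options] grammar_files'
```

Running control_bytes over each cleaned file should return an empty list
if nothing else was missed. The stray 0x13 in KNOWN_PROBLEMS.txt is not
an overstrike pair, so this sketch only reports its offset and leaves the
removal to a manual edit.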
BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt | 2 +-
BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt | 78 ++++++++++----------
BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt | 6 +-
3 files changed, 43 insertions(+), 43 deletions(-)
diff --git a/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt b/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt
index 539cf775257b..f073e620ab68 100644
--- a/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt
+++ b/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt
@@ -40,7 +40,7 @@
An bug (or at least an oddity) is that a reference to LT(1), LA(1),
or LATEXT(1) in an action which immediately follows a token match
in a rule refers to the token matched, not the token which is in
- the lookahead buffer. Consider:\x13
+ the lookahead buffer. Consider:
r : abc <<action alpha>> D <<action beta>> E;
diff --git a/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt b/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt
index 4a7d22e7f239..140b064217b7 100644
--- a/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt
+++ b/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt
@@ -9,48 +9,48 @@ NAME
antlr - ANother Tool for Language Recognition
SYNTAX
- antlr [_\bo_\bp_\bt_\bi_\bo_\bn_\bs] _\bg_\br_\ba_\bm_\bm_\ba_\br__\bf_\bi_\bl_\be_\bs
+ antlr [options] grammar_files
DESCRIPTION
- _\bA_\bn_\bt_\bl_\br converts an extended form of context-free grammar into
+ Antlr converts an extended form of context-free grammar into
a set of C functions which directly implement an efficient
form of deterministic recursive-descent LL(k) parser.
Context-free grammars may be augmented with predicates to
allow semantics to influence parsing; this allows a form of
context-sensitive parsing. Selective backtracking is also
available to handle non-LL(k) and even non-LALR(k) con-
- structs. _\bA_\bn_\bt_\bl_\br also produces a definition of a lexer which
+ structs. Antlr also produces a definition of a lexer which
can be automatically converted into C code for a DFA-based
- lexer by _\bd_\bl_\bg. Hence, _\ba_\bn_\bt_\bl_\br serves a function much like that
- of _\by_\ba_\bc_\bc, however, it is notably more flexible and is more
- integrated with a lexer generator (_\ba_\bn_\bt_\bl_\br directly generates
- _\bd_\bl_\bg code, whereas _\by_\ba_\bc_\bc and _\bl_\be_\bx are given independent
- descriptions). Unlike _\by_\ba_\bc_\bc which accepts LALR(1) grammars,
- _\ba_\bn_\bt_\bl_\br accepts LL(k) grammars in an extended BNF notation -
+ lexer by dlg. Hence, antlr serves a function much like that
+ of yacc, however, it is notably more flexible and is more
+ integrated with a lexer generator (antlr directly generates
+ dlg code, whereas yacc and lex are given independent
+ descriptions). Unlike yacc which accepts LALR(1) grammars,
+ antlr accepts LL(k) grammars in an extended BNF notation -
which eliminates the need for precedence rules.
- Like _\by_\ba_\bc_\bc grammars, _\ba_\bn_\bt_\bl_\br grammars can use automatically-
+ Like yacc grammars, antlr grammars can use automatically-
maintained symbol attribute values referenced as dollar
- variables. Further, because _\ba_\bn_\bt_\bl_\br generates top-down
+ variables. Further, because antlr generates top-down
parsers, arbitrary values may be inherited from parent rules
- (passed like function parameters). _\bA_\bn_\bt_\bl_\br also has a mechan-
+ (passed like function parameters). Antlr also has a mechan-
ism for creating and manipulating abstract-syntax-trees.
- There are various other niceties in _\ba_\bn_\bt_\bl_\br, including the
+ There are various other niceties in antlr, including the
ability to spread one grammar over multiple files or even
multiple grammars in a single file, the ability to generate
a version of the grammar with actions stripped out (for
documentation purposes), and lots more.
OPTIONS
- -ck _\bn
- Use up to _\bn symbols of lookahead when using compressed
+ -ck n
+ Use up to n symbols of lookahead when using compressed
(linear approximation) lookahead. This type of looka-
head is very cheap to compute and is attempted before
full LL(k) lookahead, which is of exponential complex-
ity in the worst case. In general, the compressed loo-
- kahead can be much deeper (e.g, -ck 10) _\bt_\bh_\ba_\bn _\bt_\bh_\be _\bf_\bu_\bl_\bl
- _\bl_\bo_\bo_\bk_\ba_\bh_\be_\ba_\bd (_\bw_\bh_\bi_\bc_\bh _\bu_\bs_\bu_\ba_\bl_\bl_\by _\bm_\bu_\bs_\bt _\bb_\be _\bl_\be_\bs_\bs _\bt_\bh_\ba_\bn _\b4).
+ kahead can be much deeper (e.g, -ck 10) than the full
+ lookahead (which usually must be less than 4).
-CC Generate C++ output from both ANTLR and DLG.
@@ -86,20 +86,20 @@ OPTIONS
-ga Generate ANSI-compatible code (default case). This has
not been rigorously tested to be ANSI XJ11 C compliant,
- but it is close. The normal output of _\ba_\bn_\bt_\bl_\br is
+ but it is close. The normal output of antlr is
currently compilable under both K&R, ANSI C, and C++-
- this option does nothing because _\ba_\bn_\bt_\bl_\br generates a
+ this option does nothing because antlr generates a
bunch of #ifdef's to do the right thing depending on
the language.
- -gc Indicates that _\ba_\bn_\bt_\bl_\br should generate no C code, i.e.,
+ -gc Indicates that antlr should generate no C code, i.e.,
only perform analysis on the grammar.
- -gd C code is inserted in each of the _\ba_\bn_\bt_\bl_\br generated pars-
+ -gd C code is inserted in each of the antlr generated pars-
ing functions to provide for user-defined handling of a
detailed parse trace. The inserted code consists of
calls to the user-supplied macros or functions called
- zzTRACEIN and zzTRACEOUT. The only argument is a _\bc_\bh_\ba_\br
+ zzTRACEIN and zzTRACEOUT. The only argument is a char
* pointing to a C-style string which is the grammar
rule recognized by the current parsing function. If no
definition is given for the trace functions, upon rule
@@ -110,17 +110,17 @@ OPTIONS
-gh Generate stdpccts.h for non-ANTLR-generated files to
include. This file contains all defines needed to
- describe the type of parser generated by _\ba_\bn_\bt_\bl_\br (e.g.
+ describe the type of parser generated by antlr (e.g.
how much lookahead is used and whether or not trees are
constructed) and contains the header action specified
by the user.
-gk Generate parsers that delay lookahead fetches until
- needed. Without this option, _\ba_\bn_\bt_\bl_\br generates parsers
- which always have _\bk tokens of lookahead available.
+ needed. Without this option, antlr generates parsers
+ which always have k tokens of lookahead available.
-gl Generate line info about grammar actions in C parser of
- the form # _\bl_\bi_\bn_\be "_\bf_\bi_\bl_\be" which makes error messages from
+ the form # line "file" which makes error messages from
the C/C++ compiler make more sense as they will point
into the grammar file not the resulting C file.
Debugging is easier as well, because you will step
@@ -128,18 +128,18 @@ OPTIONS
-gs Do not generate sets for token expression lists;
instead generate a ||-separated sequence of
- LA(1)==_\bt_\bo_\bk_\be_\bn__\bn_\bu_\bm_\bb_\be_\br. The default is to generate sets.
+ LA(1)==token_number. The default is to generate sets.
-gt Generate code for Abstract-Syntax Trees.
-gx Do not create the lexical analyzer files (dlg-related).
This option should be given when the user wishes to
provide a customized lexical analyzer. It may also be
- used in _\bm_\ba_\bk_\be scripts to cause only the parser to be
+ used in make scripts to cause only the parser to be
rebuilt when a change not affecting the lexical struc-
ture is made to the input grammars.
- -k _\bn Set k of LL(k) to _\bn; i.e. set tokens of look-ahead
+ -k n Set k of LL(k) to n; i.e. set tokens of look-ahead
(default==1).
-o dir
@@ -171,9 +171,9 @@ OPTIONS
release with option -pr on. Context computation is off
by default.
- -rl _\bn
+ -rl n
Limit the maximum number of tree nodes used by grammar
- analysis to _\bn. Occasionally, _\ba_\bn_\bt_\bl_\br is unable to
+ analysis to n. Occasionally, antlr is unable to
analyze a grammar submitted by the user. This rare
situation can only occur when the grammar is large and
the amount of lookahead is greater than one. A non-
@@ -184,14 +184,14 @@ OPTIONS
the number of calls to the full LL(k) algorithm. An
error message will be displayed, if this limit is
reached, which indicates the grammar construct being
- analyzed when _\ba_\bn_\bt_\bl_\br hit a non-linearity. Use this
- option if _\ba_\bn_\bt_\bl_\br seems to go out to lunch and your disk
- start thrashing; try _\bn=10000 to start. Once the
+ analyzed when antlr hit a non-linearity. Use this
+ option if antlr seems to go out to lunch and your disk
+ start thrashing; try n=10000 to start. Once the
offending construct has been identified, try to remove
- the ambiguity that _\ba_\bn_\bt_\bl_\br was trying to overcome with
+ the ambiguity that antlr was trying to overcome with
large lookahead analysis. The introduction of (...)?
backtracking blocks eliminates some of these problems -
- _\ba_\bn_\bt_\bl_\br does not analyze alternatives that begin with
+ antlr does not analyze alternatives that begin with
(...)? (it simply backtracks, if necessary, at run
time).
@@ -208,7 +208,7 @@ OPTIONS
as the parser file.
SPECIAL CONSIDERATIONS
- _\bA_\bn_\bt_\bl_\br works... we think. There is no implicit guarantee of
+ Antlr works... we think. There is no implicit guarantee of
anything. We reserve no legal rights to the software known
as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
is in the public domain. An individual or company may do
@@ -234,7 +234,7 @@ FILES
output C++ parser when C++ mode is used.
parser.dlg
- output _\bd_\bl_\bg lexical analyzer.
+ output dlg lexical analyzer.
err.c
token string array, error sets and error support rou-
@@ -251,7 +251,7 @@ FILES
erated by default. Not used in C++ mode.
tokens.h
- output #_\bd_\be_\bf_\bi_\bn_\be_\bs for tokens used and function prototypes
+ output #defines for tokens used and function prototypes
for functions generated for rules.
diff --git a/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt b/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt
index 06b320de2abb..5ea5e933c808 100644
--- a/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt
+++ b/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt
@@ -9,14 +9,14 @@ NAME
dlg - DFA Lexical Analyzer Generator
SYNTAX
- dlg [_\bo_\bp_\bt_\bi_\bo_\bn_\bs] _\bl_\be_\bx_\bi_\bc_\ba_\bl__\bs_\bp_\be_\bc [_\bo_\bu_\bt_\bp_\bu_\bt__\bf_\bi_\bl_\be]
+ dlg [options] lexical_spec [output_file]
DESCRIPTION
dlg is a tool that produces fast deterministic finite auto-
mata for recognizing regular expressions in input.
OPTIONS
- -CC Generate C++ output. The _\bo_\bu_\bt_\bp_\bu_\bt__\bf_\bi_\bl_\be is not specified
+ -CC Generate C++ output. The output_file is not specified
in this case.
-C[ level]
@@ -69,7 +69,7 @@ OPTIONS
in or send output to standard out.
SPECIAL CONSIDERATIONS
- _\bD_\bl_\bg works... we think. There is no implicit guarantee of
+ Dlg works... we think. There is no implicit guarantee of
anything. We reserve no legal rights to the software known
as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS
is in the public domain. An individual or company may do
--
2.21.0.1020.gf2820cf01a-goog