public inbox for devel@edk2.groups.io
From: "Kinney, Michael D" <michael.d.kinney@intel.com>
To: "Carsey, Jaben" <jaben.carsey@intel.com>,
	"Gao, Liming" <liming.gao@intel.com>,
	"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
	"Kinney, Michael D" <michael.d.kinney@intel.com>
Subject: Re: [RFC] Formalize source files to follow DOS format
Date: Mon, 21 May 2018 22:41:36 +0000
Message-ID: <E92EE9817A31E24EB0585FDF735412F5B8A36DF4@ORSMSX113.amr.corp.intel.com>
In-Reply-To: <CB6E33457884FA40993F35157061515CA3D002A4@FMSMSX103.amr.corp.intel.com>

Liming,

We have a set of standard flags for tools that 
should always be present.

--help
-v
-q
--debug

We should also always have the program name,
description, version, and copyright.

Please see BaseTools/Scripts/BinToPcd.py as 
an example.
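
A rough sketch of such a parser, for illustration only. It is not
copied from BinToPcd.py; the build_parser name, the version number,
the description, and the help strings are all placeholders. argparse
adds --help on its own, so only the other flags are declared.

```python
import argparse

# Placeholder metadata for illustration; the version number is made up.
__prog__ = 'FormatDosFiles'
__version__ = '%s Version %s' % (__prog__, '0.10')
__copyright__ = 'Copyright (c) 2018, Intel Corporation. All rights reserved.'
__description__ = 'Convert source files to DOS format.\n'


def build_parser():
    """Build a parser with the standard flags: --help, -v, -q, --debug."""
    parser = argparse.ArgumentParser(
        prog=__prog__,
        description=__description__ + __copyright__,
    )
    # --help is added automatically by argparse.
    parser.add_argument('--version', action='version', version=__version__)
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Increase output messages')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Reduce output messages')
    parser.add_argument('--debug', type=int, metavar='[0-9]',
                        choices=range(0, 10), default=0,
                        help='Set debug level')
    return parser
```

For example, build_parser().parse_args(['-q', '--debug', '2']) yields
quiet=True, verbose=False, and debug=2.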

It might be useful to have a way to run this tool
on a single file when BaseTools/Scripts/PatchCheck.py
reports an issue.

Do you think it would be good to have an option that
scans a path for the file extensions documented as
using DOS line endings, so the extensions do not have
to be entered?
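
One possible shape for both ideas, as a sketch under assumptions:
the default list is the set of extensions named in the RFC commit
message, and collect_files and DEFAULT_EXTENSIONS are hypothetical
names, not part of the posted patch. A path that is a single file is
returned as-is, on the assumption that the caller named it on purpose
(for example after PatchCheck.py flagged it).

```python
import os

# Extensions listed as DOS format in the RFC commit message.
DEFAULT_EXTENSIONS = (
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
)


def collect_files(path, extensions=DEFAULT_EXTENSIONS):
    """Return the set of files to format for a file or directory path."""
    if os.path.isfile(path):
        # Assumption: an explicitly named file is accepted regardless
        # of its extension.
        return {path}
    suffixes = tuple(extensions)
    found = set()
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            # str.endswith accepts a tuple of candidate suffixes.
            if name.endswith(suffixes):
                found.add(os.path.join(dirpath, name))
    return found
```

With this, collect_files('MdePkg') would use the documented list,
while collect_files('MdePkg', ['.c', '.h']) keeps the explicit
filter available.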

Mike


> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
> Sent: Monday, May 21, 2018 7:50 AM
> To: Gao, Liming <liming.gao@intel.com>; edk2-devel@lists.01.org
> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> 
> Liming,
> 
> One PEP 8 thing.
> Can you change to use the with statement for the file
> read/write?
> 
> Other small thoughts.
> I think that FileList should be changed to a set as
> order is not important.
> Maybe wrap the re.sub function with your own so all
> the .encode() calls are in one location?  As we move to
> Python 3 we will have fewer changes to make.
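
The three suggestions above could look roughly like the sketch below.
It reuses the regular expressions from the posted patch unchanged;
sub_bytes, format_content, and format_files are hypothetical names
chosen for illustration, and the caller is assumed to pass a set of
file paths.

```python
import re


def sub_bytes(pattern, repl, content, flags=0):
    """re.sub on bytes content; the only place where str is encoded."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def format_content(content):
    """Apply the patch's rules to the bytes content of one file."""
    # Convert the line endings to CRLF.
    content = sub_bytes(r'([^\r])\n', r'\1\r\n', content)
    content = sub_bytes(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # Add a final CRLF if the file does not end with one.
    content = sub_bytes(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing white space.
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # Replace each tab with two spaces.
    content = sub_bytes('\t', '  ', content)
    return content


def format_files(file_set):
    """Rewrite every file in file_set (a set, since order is irrelevant)."""
    for name in file_set:
        # The with statement closes the file even if read/write raises.
        with open(name, 'rb') as fd:
            content = fd.read()
        with open(name, 'wb') as fd:
            fd.write(format_content(content))
```

Keeping the encode() calls inside sub_bytes means a later change in
how patterns are typed touches one function instead of every call.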
> 
> 
> > -----Original Message-----
> > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Liming Gao
> > Sent: Sunday, May 20, 2018 9:52 PM
> > To: edk2-devel@lists.01.org
> > Subject: [edk2] [RFC] Formalize source files to follow DOS format
> >
> > FormatDosFiles.py is added to clean up DOS source files. It is based on
> > the rules defined in the EDKII C Coding Standards Specification.
> > 5.1.2 Do not use tab characters
> > 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> > 5.1.7 All files must end with CRLF
> > No trailing white space in one line. (To be added in spec)
> >
> > The source files in the edk2 project with the below postfixes are DOS format.
> > .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> > .txt .bat .py
> >
> > The package maintainer can use this script to clean up all files in his
> > package. The preferred way is to create one patch per package.
> >
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Liming Gao <liming.gao@intel.com>
> > ---
> >  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 93 insertions(+)
> >  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> >
> > diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
> > new file mode 100644
> > index 0000000..c3a5476
> > --- /dev/null
> > +++ b/BaseTools/Scripts/FormatDosFiles.py
> > @@ -0,0 +1,93 @@
> > +# @file FormatDosFiles.py
> > +# This script formats the source files to follow dos style.
> > +# It supports Python2.x and Python3.x both.
> > +#
> > +#  Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
> > +#
> > +#  This program and the accompanying materials
> > +#  are licensed and made available under the terms and conditions of the BSD License
> > +#  which accompanies this distribution.  The full text of the license may be found at
> > +#  http://opensource.org/licenses/bsd-license.php
> > +#
> > +#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> > +#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> > +#
> > +
> > +#
> > +# Import Modules
> > +#
> > +import argparse
> > +import os
> > +import os.path
> > +import re
> > +import sys
> > +
> > +"""
> > +difference of string between python2 and python3:
> > +
> > +there is a large difference of string in python2 and python3.
> > +
> > +in python2,there are two type string,unicode string (unicode type) and 8-bit string (str type).
> > +	us = u"abcd",
> > +	unicode string,which is internally stored as unicode code point.
> > +	s = "abcd",s = b"abcd",s = r"abcd",
> > +	all of them are 8-bit string,which is internally stored as bytes.
> > +
> > +in python3,a new type called bytes replace 8-bit string,and str type is regarded as unicode string.
> > +	s = "abcd", s = u"abcd", s = r"abcd",
> > +	all of them are str type,which is internally stored unicode code point.
> > +	bs = b"abcd",
> > +	bytes type,which is internally stored as bytes
> > +
> > +in python2 ,the both type string can be mixed use,but in python3 it could not,
> > +which means the pattern and content in re match should be the same type in python3.
> > +in function FormatFile,it read file in binary mode so that the content is bytes type,so the pattern should also be bytes type.
> > +As a result,I add encode() to make it compatible among python2 and python3.
> > +
> > +difference of encode,decode in python2 and python3:
> > +the builtin function str.encode(encoding) and str.decode(encoding) are used for convert between 8-bit string and unicode string.
> > +
> > +in python2
> > +	encode convert unicode type to str type.decode vice versa.default encoding is ascii.
> > +	for example: s = us.encode()
> > +	but if the us is str type,the code will also work.it will be firstly convert to unicode type,
> > +	in this situation,the call equals s = us.decode().encode().
> > +
> > +in python3
> > +	encode convert str type to bytes type,decode vice versa.default encoding is utf8.
> > +	for example:
> > +	bs = s.encode(),only str type has encode method,so that won't be used wrongly.decode is the same.
> > +
> > +in conclusion:
> > +	this code could work the same in python27 and python36 environment as far as the re pattern satisfy ascii character set.
> > +
> > +"""
> > +def FormatFiles():
> > +    parser = argparse.ArgumentParser()
> > +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> > +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> > +    args = parser.parse_args()
> > +    filelist = []
> > +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> > +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> > +            filelist.append(os.path.join(dirpath, filename))
> > +    for file in filelist:
> > +        fd = open(file, 'rb')
> > +        content = fd.read()
> > +        fd.close()
> > +        # Convert the line endings to CRLF
> > +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> > +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> > +        # Add a new empty line if the file does not end with one
> > +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> > +        # Remove trailing white spaces
> > +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> > +        # Replace '\t' with two spaces
> > +        content = re.sub('\t'.encode(), '  '.encode(), content)
> > +        fd = open(file, 'wb')
> > +        fd.write(content)
> > +        fd.close()
> > +        print(file)
> > +
> > +if __name__ == "__main__":
> > +    sys.exit(FormatFiles())
> > \ No newline at end of file
> > --
> > 2.8.0.windows.1
> >
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel

