public inbox for devel@edk2.groups.io
From: "Gao, Liming" <liming.gao@intel.com>
To: "Carsey, Jaben" <jaben.carsey@intel.com>,
	"Kinney, Michael D" <michael.d.kinney@intel.com>
Cc: "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Subject: Re: [RFC] Formalize source files to follow DOS format
Date: Fri, 25 May 2018 02:24:47 +0000
Message-ID: <4A89E2EF3DFEDB4C8BFDE51014F606A14E2311F8@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <CB6E33457884FA40993F35157061515CA3D046E7@FMSMSX103.amr.corp.intel.com>

I get your point. We will update this script and send version 2. 
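
For illustration only, the append-or-overwrite behavior proposed in the quoted replies below might look like the following sketch. The '+' prefix syntax, the function name resolve_extensions, and the constant DEFAULT_EXTENSIONS are assumptions made for this example and are not taken from the script; the default list is the set of extensions named in the original RFC.

```python
# Hypothetical sketch: default extension set with append-or-overwrite semantics.
# The '+' prefix convention is an assumption made for this example only.
DEFAULT_EXTENSIONS = frozenset([
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
])


def resolve_extensions(user_exts, defaults=DEFAULT_EXTENSIONS):
    """Return the set of file extensions to process.

    - no user extensions:       use the defaults
    - '+.ext' entries:          append to the base set
    - plain '.ext' entries:     replace the defaults as the base set
    """
    appended = {ext[1:] for ext in user_exts if ext.startswith('+')}
    replaced = {ext for ext in user_exts if not ext.startswith('+')}
    base = replaced if replaced else set(defaults)
    return base | appended
```

With this convention, passing `+.md` would extend the default set, while passing `.c .h` would restrict processing to those two extensions.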

> -----Original Message-----
> From: Carsey, Jaben
> Sent: Thursday, May 24, 2018 10:14 PM
> To: Gao, Liming <liming.gao@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
> Cc: edk2-devel@lists.01.org
> Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> 
> We could do something like we do for compiler flags... append or overwrite depending on syntax.
> 
> > -----Original Message-----
> > From: Gao, Liming
> > Sent: Thursday, May 24, 2018 1:35 AM
> > To: Kinney, Michael D <michael.d.kinney@intel.com>; Carsey, Jaben
> > <jaben.carsey@intel.com>
> > Cc: edk2-devel@lists.01.org
> > Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> > Importance: High
> >
> > Mike:
> >   I agree with your comments. The script can provide a default file set,
> > and the user can specify more extensions that are appended to the defaults
> > instead of overriding them.
> >
> > Thanks
> > Liming
> > >-----Original Message-----
> > >From: Kinney, Michael D
> > >Sent: Tuesday, May 22, 2018 6:59 AM
> > >To: Carsey, Jaben <jaben.carsey@intel.com>; Kinney, Michael D
> > ><michael.d.kinney@intel.com>
> > >Cc: Gao, Liming <liming.gao@intel.com>; edk2-devel@lists.01.org
> > >Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> > >
> > >Jaben,
> > >
> > >Yes.  The default behavior uses the default set, and
> > >specifying one or more extensions overrides the
> > >default set.
> > >
> > >Mike
> > >
> > >> -----Original Message-----
> > >> From: Carsey, Jaben
> > >> Sent: Monday, May 21, 2018 3:43 PM
> > >> To: Kinney, Michael D <michael.d.kinney@intel.com>
> > >> Cc: Gao, Liming <liming.gao@intel.com>; edk2-devel@lists.01.org
> > >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> > >>
> > >> Mike,
> > >>
> > >> Perhaps a default set of file extensions that can be
> > >> overridden?
> > >>
> > >> -Jaben
> > >>
> > >>
> > >> > On May 21, 2018, at 3:41 PM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:
> > >> >
> > >> > Liming,
> > >> >
> > >> > We have a set of standard flags for tools that
> > >> > should always be present.
> > >> >
> > >> > --help
> > >> > -v
> > >> > -q
> > >> > --debug
> > >> >
> > >> > We should also always have the program name,
> > >> > description, version, and copyright.
> > >> >
> > >> > Please see BaseTools/Scripts/BinToPcd.py as
> > >> > an example.
> > >> >
> > >> > It might be useful to have a way to run this tool
> > >> > on a single file when BaseTools/Scripts/PatchCheck.py
> > >> > reports an issue.
> > >> >
> > >> > Do you think it would be good to have one option to
> > >> > scan path for file extensions that are documented as
> > >> > DOS line endings so the extensions do not have to be
> > >> > entered?
> > >> >
> > >> > Mike
> > >> >
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
> > >> >> Sent: Monday, May 21, 2018 7:50 AM
> > >> >> To: Gao, Liming <liming.gao@intel.com>; edk2-devel@lists.01.org
> > >> >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> > >> >>
> > >> >> Liming,
> > >> >>
> > >> >> One Pep8 thing.
> > >> >> Can you change to use the with statement for the file
> > >> >> read/write?
> > >> >>
> > >> >> Other small thoughts.
> > >> >> I think that FileList should be changed to a set as
> > >> >> order is not important.
> > >> >> Maybe wrap the re.sub function with your own so all
> > >> >> the .encode() calls are in one location?  As we move to
> > >> >> Python 3 we will have fewer changes to make.
> > >> >>
> > >> >>
> > >> >>> -----Original Message-----
> > >> >>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Liming Gao
> > >> >>> Sent: Sunday, May 20, 2018 9:52 PM
> > >> >>> To: edk2-devel@lists.01.org
> > >> >>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
> > >> >>>
> > >> >>> FormatDosFiles.py is added to clean up DOS source files. It is based on
> > >> >>> the rules defined in the EDKII C Coding Standards Specification.
> > >> >>> 5.1.2 Do not use tab characters
> > >> >>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> > >> >>> 5.1.7 All files must end with CRLF
> > >> >>> No trailing white space in one line. (To be added in spec)
> > >> >>>
> > >> >>> The source files in the edk2 project with the postfixes below are DOS format.
> > >> >>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> > >> >>> .txt .bat .py
> > >> >>>
> > >> >>> The package maintainer can use this script to clean up all files in his
> > >> >>> package. The preferred way is to create one patch per package.
> > >> >>>
> > >> >>> Contributed-under: TianoCore Contribution Agreement 1.1
> > >> >>> Signed-off-by: Liming Gao <liming.gao@intel.com>
> > >> >>> ---
> > >> >>> BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
> > >> >>> 1 file changed, 93 insertions(+)
> > >> >>> create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> > >> >>>
> > >> >>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
> > >> >>> new file mode 100644
> > >> >>> index 0000000..c3a5476
> > >> >>> --- /dev/null
> > >> >>> +++ b/BaseTools/Scripts/FormatDosFiles.py
> > >> >>> @@ -0,0 +1,93 @@
> > >> >>> +# @file FormatDosFiles.py
> > >> >>> +# This script format the source files to follow dos style.
> > >> >>> +# It supports Python2.x and Python3.x both.
> > >> >>> +#
> > >> >>> +#  Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
> > >> >>> +#
> > >> >>> +#  This program and the accompanying materials
> > >> >>> +#  are licensed and made available under the terms and conditions of the BSD License
> > >> >>> +#  which accompanies this distribution.  The full text of the license may be found at
> > >> >>> +#  http://opensource.org/licenses/bsd-license.php
> > >> >>> +#
> > >> >>> +#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> > >> >>> +#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> > >> >>> +#
> > >> >>> +
> > >> >>> +#
> > >> >>> +# Import Modules
> > >> >>> +#
> > >> >>> +import argparse
> > >> >>> +import os
> > >> >>> +import os.path
> > >> >>> +import re
> > >> >>> +import sys
> > >> >>> +
> > >> >>> +"""
> > >> >>> +difference of string between python2 and python3:
> > >> >>> +
> > >> >>> +there is a large difference of string in python2 and python3.
> > >> >>> +
> > >> >>> +in python2,there are two type string,unicode string (unicode type) and 8-bit string (str type).
> > >> >>> +    us = u"abcd",
> > >> >>> +    unicode string,which is internally stored as unicode code point.
> > >> >>> +    s = "abcd",s = b"abcd",s = r"abcd",
> > >> >>> +    all of them are 8-bit string,which is internally stored as bytes.
> > >> >>> +
> > >> >>> +in python3,a new type called bytes replace 8-bit string,and str type is regarded as unicode string.
> > >> >>> +    s = "abcd", s = u"abcd", s = r"abcd",
> > >> >>> +    all of them are str type,which is internally stored unicode code point.
> > >> >>> +    bs = b"abcd",
> > >> >>> +    bytes type,which is internally stored as bytes
> > >> >>> +
> > >> >>> +in python2 ,the both type string can be mixed use,but in python3 it could not,
> > >> >>> +which means the pattern and content in re match should be the same type in python3.
> > >> >>> +in function FormatFile,it read file in binary mode so that the content is bytes type,so the pattern should also be bytes type.
> > >> >>> +As a result,I add encode() to make it compatible among python2 and python3.
> > >> >>> +
> > >> >>> +difference of encode,decode in python2 and python3:
> > >> >>> +the builtin function str.encode(encoding) and str.decode(encoding) are used for convert between 8-bit string and unicode string.
> > >> >>> +
> > >> >>> +in python2
> > >> >>> +    encode convert unicode type to str type.decode vice versa.default encoding is ascii.
> > >> >>> +    for example: s = us.encode()
> > >> >>> +    but if the us is str type,the code will also work.it will be firstly convert to unicode type,
> > >> >>> +    in this situation,the call equals s = us.decode().encode().
> > >> >>> +
> > >> >>> +in python3
> > >> >>> +    encode convert str type to bytes type,decode vice versa.default encoding is utf8.
> > >> >>> +    for example:
> > >> >>> +    bs = s.encode(),only str type has encode method,so that won't be used wrongly.decode is the same.
> > >> >>> +
> > >> >>> +in conclusion:
> > >> >>> +    this code could work the same in python27 and python36 environment as far as the re pattern satisfy ascii character set.
> > >> >>> +
> > >> >>> +"""
> > >> >>> +def FormatFiles():
> > >> >>> +    parser = argparse.ArgumentParser()
> > >> >>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> > >> >>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> > >> >>> +    args = parser.parse_args()
> > >> >>> +    filelist = []
> > >> >>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> > >> >>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> > >> >>> +            filelist.append(os.path.join(dirpath, filename))
> > >> >>> +    for file in filelist:
> > >> >>> +        fd = open(file, 'rb')
> > >> >>> +        content = fd.read()
> > >> >>> +        fd.close()
> > >> >>> +        # Convert the line endings to CRLF
> > >> >>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> > >> >>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> > >> >>> +        # Add a new empty line if the file is not end with one
> > >> >>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> > >> >>> +        # Remove trailing white spaces
> > >> >>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> > >> >>> +        # Replace '\t' with two spaces
> > >> >>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
> > >> >>> +        fd = open(file, 'wb')
> > >> >>> +        fd.write(content)
> > >> >>> +        fd.close()
> > >> >>> +        print(file)
> > >> >>> +
> > >> >>> +if __name__ == "__main__":
> > >> >>> +    sys.exit(FormatFiles())
> > >> >>> \ No newline at end of file
> > >> >>> --
> > >> >>> 2.8.0.windows.1
> > >> >>>
> > >> >>> _______________________________________________
> > >> >>> edk2-devel mailing list
> > >> >>> edk2-devel@lists.01.org
> > >> >>> https://lists.01.org/mailman/listinfo/edk2-devel
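
The review suggestions quoted above (a with statement for file I/O, a set for the file list, and a single wrapper that holds the .encode() calls) could be sketched roughly as follows. This is only an illustration and not the version 2 script: the names sub_bytes, to_dos_format, collect_files and format_file are hypothetical, and the sketch swaps the patch's two LF-to-CRLF substitutions for a single negative lookbehind.

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    """re.sub wrapper so that every .encode() call lives in one place."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def to_dos_format(content):
    """Apply the four rules from the RFC to a bytes object."""
    # Convert bare LF to CRLF (lookbehind skips LFs already preceded by CR).
    content = sub_bytes(r'(?<!\r)\n', r'\r\n', content)
    # Make sure a non-empty file ends with CRLF.
    if content and not content.endswith(b'\r\n'):
        content += b'\r\n'
    # Remove trailing spaces and tabs before each line ending.
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content)
    # Replace each remaining tab with two spaces.
    content = sub_bytes('\t', '  ', content)
    return content


def collect_files(root, extensions):
    """Return a set of matching paths; order is not important."""
    exts = tuple(extensions)
    return {os.path.join(dirpath, name)
            for dirpath, _, names in os.walk(root)
            for name in names if name.endswith(exts)}


def format_file(path):
    """Rewrite one file in place, using with statements for the file I/O."""
    with open(path, 'rb') as fd:
        content = fd.read()
    with open(path, 'wb') as fd:
        fd.write(to_dos_format(content))
```

Keeping format_file separate from the directory walk would also make it straightforward to run the tool on a single file, as suggested for the PatchCheck.py case.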

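The standard tool flags requested in the review (--help, -v, -q, --debug, plus program name, description, version and copyright) could be wired up with argparse along these lines. This is a sketch under assumptions: the version number, the help strings and the function name build_parser are placeholders invented for the example, and it is only loosely patterned on the kind of layout the thread points to in BaseTools/Scripts/BinToPcd.py.

```python
import argparse

# Placeholder metadata for the example; the version number is hypothetical.
__prog__ = 'FormatDosFiles'
__version__ = '%s Version %s' % (__prog__, '0.10')
__copyright__ = 'Copyright (c) 2018, Intel Corporation. All rights reserved.'
__description__ = 'Convert source files to DOS format.\n'


def build_parser():
    """Build an argument parser carrying the standard tool flags."""
    # argparse adds -h/--help automatically.
    parser = argparse.ArgumentParser(prog=__prog__,
                                     description=__description__ + __copyright__)
    parser.add_argument('--version', action='version', version=__version__)
    parser.add_argument('path', help='file or directory to convert')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='increase output messages')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='reduce output messages')
    parser.add_argument('--debug', type=int, metavar='LEVEL',
                        choices=range(0, 10), default=0,
                        help='set debug level (0-9)')
    return parser
```

A call such as `build_parser().parse_args(['-v', '--debug', '3', 'MdePkg'])` would then yield a namespace with verbose set, debug equal to 3 and path equal to 'MdePkg'.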