From: "Gao, Liming" <liming.gao@intel.com>
To: "Kinney, Michael D" <michael.d.kinney@intel.com>,
"Carsey, Jaben" <jaben.carsey@intel.com>
Cc: "edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Subject: Re: [RFC] Formalize source files to follow DOS format
Date: Thu, 24 May 2018 08:35:01 +0000
Message-ID: <4A89E2EF3DFEDB4C8BFDE51014F606A14E230CAA@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <E92EE9817A31E24EB0585FDF735412F5B8A36E76@ORSMSX113.amr.corp.intel.com>
Mike:
I agree with your comments. For the default file set, the script can carry a built-in default list. Users can specify additional extensions, which are appended to the defaults instead of overriding them.
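A minimal sketch of the append behavior, not the final script. The option name `--append-extensions` and the helper names `DEFAULT_EXTENSIONS`, `build_parser` and `effective_extensions` are hypothetical and are not in the posted patch; the default list is copied from the RFC commit message.

```python
import argparse

# Default DOS-format extensions, copied from the RFC commit message.
DEFAULT_EXTENSIONS = (
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
)


def build_parser():
    """Build the command-line parser (option names are assumptions)."""
    parser = argparse.ArgumentParser(
        description='Convert source files to DOS format.')
    parser.add_argument('path', help='File or directory to convert.')
    # Hypothetical option: extra extensions extend the default set.
    parser.add_argument('--append-extensions', nargs='+', default=[],
                        metavar='EXT',
                        help='Extra extensions added to the default set.')
    return parser


def effective_extensions(extra):
    """Return the defaults plus any user-supplied extensions."""
    # A set is enough here because ordering does not matter.
    return set(DEFAULT_EXTENSIONS) | set(extra)
```

With this shape, an invocation such as `FormatDosFiles.py MdePkg --append-extensions .md` would process the default extensions plus `.md`, while omitting the option would process the defaults only.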
Thanks
Liming
>-----Original Message-----
>From: Kinney, Michael D
>Sent: Tuesday, May 22, 2018 6:59 AM
>To: Carsey, Jaben <jaben.carsey@intel.com>; Kinney, Michael D
><michael.d.kinney@intel.com>
>Cc: Gao, Liming <liming.gao@intel.com>; edk2-devel@lists.01.org
>Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
>
>Jaben,
>
>Yes. The default behavior uses the default set, and
>specifying one or more extensions overrides the
>default set.
>
>Mike
>
>> -----Original Message-----
>> From: Carsey, Jaben
>> Sent: Monday, May 21, 2018 3:43 PM
>> To: Kinney, Michael D <michael.d.kinney@intel.com>
>> Cc: Gao, Liming <liming.gao@intel.com>; edk2-
>> devel@lists.01.org
>> Subject: Re: [edk2] [RFC] Formalize source files to
>> follow DOS format
>>
>> Mike,
>>
>> Perhaps a default set of file extensions that can be
>> overridden?
>>
>> -Jaben
>>
>>
>> > On May 21, 2018, at 3:41 PM, Kinney, Michael D
>> <michael.d.kinney@intel.com> wrote:
>> >
>> > Liming,
>> >
>> > We have a set of standard flags for tools that
>> > should always be present.
>> >
>> > --help
>> > -v
>> > -q
>> > --debug
>> >
>> > We should also always have the program name,
>> > description, version, and copyright.
>> >
>> > Please see BaseTools/Scripts/BinToPcd.py as
>> > an example.
>> >
>> > It might be useful to have a way to run this tool
>> > on a single file when BaseTools/Scripts/PatchCheck.py
>> > reports an issue.
>> >
>> > Do you think it would be good to have one option to
>> > scan path for file extensions that are documented as
>> > DOS line endings so the extensions do not have to be
>> > entered?
>> >
>> > Mike
>> >
>> >
>> >> -----Original Message-----
>> >> From: edk2-devel [mailto:edk2-devel-
>> >> bounces@lists.01.org] On Behalf Of Carsey, Jaben
>> >> Sent: Monday, May 21, 2018 7:50 AM
>> >> To: Gao, Liming <liming.gao@intel.com>; edk2-
>> >> devel@lists.01.org
>> >> Subject: Re: [edk2] [RFC] Formalize source files to
>> >> follow DOS format
>> >>
>> >> Liming,
>> >>
>> >> One Pep8 thing.
>> >> Can you change to use the with statement for the file
>> >> read/write?
>> >>
>> >> Other small thoughts.
>> >> I think that FileList should be changed to a set as
>> >> order is not important.
>> >> Maybe wrap the re.sub function with your own so all
>> >> the .encode() calls are in one location? As we move to
>> >> Python 3 we will have fewer changes to make.
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: edk2-devel [mailto:edk2-devel-
>> >> bounces@lists.01.org] On Behalf Of
>> >>> Liming Gao
>> >>> Sent: Sunday, May 20, 2018 9:52 PM
>> >>> To: edk2-devel@lists.01.org
>> >>> Subject: [edk2] [RFC] Formalize source files to
>> follow
>> >> DOS format
>> >>>
>> >>> FormatDosFiles.py is added to clean up dos source
>> >> files. It bases on
>> >>> the rules defined in EDKII C Coding Standards
>> >> Specification.
>> >>> 5.1.2 Do not use tab characters
>> >>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line
>> >> endings.
>> >>> 5.1.7 All files must end with CRLF
>> >>> No trailing white space in one line. (To be added in
>> >> spec)
>> >>>
>> >>> The source files in edk2 project with the below
>> >> postfix are dos format.
>> >>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni
>> >> .asl .aslc .vfr .idf
>> >>> .txt .bat .py
>> >>>
>> >>> The package maintainer can use this script to clean
>> up
>> >> all files in his
>> >>> package. The prefer way is to create one patch per
>> one
>> >> package.
>> >>>
>> >>> Contributed-under: TianoCore Contribution Agreement
>> >> 1.1
>> >>> Signed-off-by: Liming Gao <liming.gao@intel.com>
>> >>> ---
>> >>> BaseTools/Scripts/FormatDosFiles.py | 93
>> >>> +++++++++++++++++++++++++++++++++++++
>> >>> 1 file changed, 93 insertions(+)
>> >>> create mode 100644
>> >> BaseTools/Scripts/FormatDosFiles.py
>> >>>
>> >>> diff --git a/BaseTools/Scripts/FormatDosFiles.py
>> >>> b/BaseTools/Scripts/FormatDosFiles.py
>> >>> new file mode 100644
>> >>> index 0000000..c3a5476
>> >>> --- /dev/null
>> >>> +++ b/BaseTools/Scripts/FormatDosFiles.py
>> >>> @@ -0,0 +1,93 @@
>> >>> +# @file FormatDosFiles.py
>> >>> +# This script format the source files to follow dos
>> >> style.
>> >>> +# It supports Python2.x and Python3.x both.
>> >>> +#
>> >>> +# Copyright (c) 2018, Intel Corporation. All
>> rights
>> >> reserved.<BR>
>> >>> +#
>> >>> +# This program and the accompanying materials
>> >>> +# are licensed and made available under the terms
>> >> and conditions of the
>> >>> BSD License
>> >>> +# which accompanies this distribution. The full
>> >> text of the license may be
>> >>> found at
>> >>> +# http://opensource.org/licenses/bsd-license.php
>> >>> +#
>> >>> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE
>> >> ON AN "AS IS"
>> >>> BASIS,
>> >>> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY
>> KIND,
>> >> EITHER
>> >>> EXPRESS OR IMPLIED.
>> >>> +#
>> >>> +
>> >>> +#
>> >>> +# Import Modules
>> >>> +#
>> >>> +import argparse
>> >>> +import os
>> >>> +import os.path
>> >>> +import re
>> >>> +import sys
>> >>> +
>> >>> +"""
>> >>> +difference of string between python2 and python3:
>> >>> +
>> >>> +there is a large difference of string in python2
>> and
>> >> python3.
>> >>> +
>> >>> +in python2,there are two type string,unicode string
>> >> (unicode type) and 8-bit
>> >>> string (str type).
>> >>> + us = u"abcd",
>> >>> + unicode string,which is internally stored as
>> unicode
>> >> code point.
>> >>> + s = "abcd",s = b"abcd",s = r"abcd",
>> >>> + all of them are 8-bit string,which is
>> internally
>> >> stored as bytes.
>> >>> +
>> >>> +in python3,a new type called bytes replace 8-bit
>> >> string,and str type is
>> >>> regarded as unicode string.
>> >>> + s = "abcd", s = u"abcd", s = r"abcd",
>> >>> + all of them are str type,which is internally
>> stored
>> >> unicode code point.
>> >>> + bs = b"abcd",
>> >>> + bytes type,which is internally stored as bytes
>> >>> +
>> >>> +in python2 ,the both type string can be mixed
>> use,but
>> >> in python3 it could
>> >>> not,
>> >>> +which means the pattern and content in re match
>> >> should be the same type
>> >>> in python3.
>> >>> +in function FormatFile,it read file in binary mode
>> so
>> >> that the content is bytes
>> >>> type,so the pattern should also be bytes type.
>> >>> +As a result,I add encode() to make it compatible
>> >> among python2 and
>> >>> python3.
>> >>> +
>> >>> +difference of encode,decode in python2 and python3:
>> >>> +the builtin function str.encode(encoding) and
>> >> str.decode(encoding) are
>> >>> used for convert between 8-bit string and unicode
>> >> string.
>> >>> +
>> >>> +in python2
>> >>> + encode convert unicode type to str type.decode
>> vice
>> >> versa.default
>> >>> encoding is ascii.
>> >>> + for example: s = us.encode()
>> >>> + but if the us is str type,the code will also
>> work.it
>> >> will be firstly convert
>> >>> to unicode type,
>> >>> + in this situation,the call equals s =
>> >> us.decode().encode().
>> >>> +
>> >>> +in python3
>> >>> + encode convert str type to bytes type,decode
>> vice
>> >> versa.default
>> >>> encoding is utf8.
>> >>> + for example:
>> >>> + bs = s.encode(),only str type has encode
>> method,so
>> >> that won't be
>> >>> used wrongly.decode is the same.
>> >>> +
>> >>> +in conclusion:
>> >>> + this code could work the same in python27 and
>> >> python36
>> >>> environment as far as the re pattern satisfy ascii
>> >> character set.
>> >>> +
>> >>> +"""
>> >>> +def FormatFiles():
>> >>> + parser = argparse.ArgumentParser()
>> >>> + parser.add_argument('path', nargs=1, help='The
>> >> path for files to be
>> >>> converted.')
>> >>> + parser.add_argument('extensions', nargs='+',
>> >> help='File extensions filter.
>> >>> (Example: .txt .c .h)')
>> >>> + args = parser.parse_args()
>> >>> + filelist = []
>> >>> + for dirpath, dirnames, filenames in
>> >> os.walk(args.path[0]):
>> >>> + for filename in [f for f in filenames if
>> >> any(f.endswith(ext) for ext in
>> >>> args.extensions)]:
>> >>> + filelist.append(os.path.join(dirpath,
>> >> filename))
>> >>> + for file in filelist:
>> >>> + fd = open(file, 'rb')
>> >>> + content = fd.read()
>> >>> + fd.close()
>> >>> + # Convert the line endings to CRLF
>> >>> + content = re.sub(r'([^\r])\n'.encode(),
>> >> r'\1\r\n'.encode(), content)
>> >>> + content = re.sub(r'^\n'.encode(),
>> >> r'\r\n'.encode(), content, flags =
>> >>> re.MULTILINE)
>> >>> + # Add a new empty line if the file is not
>> end
>> >> with one
>> >>> + content = re.sub(r'([^\r\n])$'.encode(),
>> >> r'\1\r\n'.encode(), content)
>> >>> + # Remove trailing white spaces
>> >>> + content = re.sub(r'[ \t]+(\r\n)'.encode(),
>> >> r'\1'.encode(), content, flags =
>> >>> re.MULTILINE)
>> >>> + # Replace '\t' with two spaces
>> >>> + content = re.sub('\t'.encode(), '
>> >> '.encode(), content)
>> >>> + fd = open(file, 'wb')
>> >>> + fd.write(content)
>> >>> + fd.close()
>> >>> + print(file)
>> >>> +
>> >>> +if __name__ == "__main__":
>> >>> + sys.exit(FormatFiles())
>> >>> \ No newline at end of file
>> >>> --
>> >>> 2.8.0.windows.1
>> >>>
>> >>> _______________________________________________
>> >>> edk2-devel mailing list
>> >>> edk2-devel@lists.01.org
>> >>> https://lists.01.org/mailman/listinfo/edk2-devel