From: "Carsey, Jaben" <jaben.carsey@intel.com>
To: "Kinney, Michael D" <michael.d.kinney@intel.com>
Cc: "Gao, Liming" <liming.gao@intel.com>,
"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>
Subject: Re: [RFC] Formalize source files to follow DOS format
Date: Mon, 21 May 2018 22:43:26 +0000
Message-ID: <05BFD767-DC75-401E-B651-6333815FDDFF@intel.com>
In-Reply-To: <E92EE9817A31E24EB0585FDF735412F5B8A36DF4@ORSMSX113.amr.corp.intel.com>
Mike,
Perhaps a default set of file extensions that can be overridden?
-Jaben
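
As a rough sketch of the idea (not taken from the patch): the extensions argument could become optional and fall back to a built-in default list. The names DEFAULT_EXTENSIONS and build_parser are hypothetical, the default list is the one from Liming's commit message, and the -v/-q/--debug options are assumed here to be simple on/off switches.

```python
import argparse

# Extensions listed in the patch description as DOS-format files.
DEFAULT_EXTENSIONS = (
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
)


def build_parser():
    # Hypothetical helper; program name and option semantics are assumptions.
    parser = argparse.ArgumentParser(
        prog='FormatDosFiles',
        description='Convert source files to DOS (CRLF) format.')
    parser.add_argument('path', help='File or directory to convert.')
    # nargs='*' lets the caller omit extensions; argparse then uses the default.
    parser.add_argument('extensions', nargs='*',
                        default=list(DEFAULT_EXTENSIONS),
                        help='File extensions filter. (Example: .txt .c .h)')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Print each file as it is processed.')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Suppress normal output.')
    parser.add_argument('--debug', action='store_true',
                        help='Print debug information.')
    return parser
```

With this, "FormatDosFiles.py MdePkg" would use the default list, while "FormatDosFiles.py MdePkg .c .h" would override it.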
> On May 21, 2018, at 3:41 PM, Kinney, Michael D <michael.d.kinney@intel.com> wrote:
>
> Liming,
>
> We have a set of standard flags for tools that
> should always be present.
>
> --help
> -v
> -q
> --debug
>
> We should also always have the program name,
> description, version, and copyright.
>
> Please see BaseTools/Scripts/BinToPcd.py as
> an example.
>
> It might be useful to have a way to run this tool
> on a single file when BaseTools/Scripts/PatchCheck.py
> reports an issue.
>
> Do you think it would be good to have one option to
> scan path for file extensions that are documented as
> DOS line endings so the extensions do not have to be
> entered?
>
> Mike
>
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
>> Sent: Monday, May 21, 2018 7:50 AM
>> To: Gao, Liming <liming.gao@intel.com>; edk2-devel@lists.01.org
>> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
>>
>> Liming,
>>
>> One PEP 8 thing.
>> Can you change to use the with statement for the file
>> read/write?
>>
>> Other small thoughts.
>> I think that FileList should be changed to a set as
>> order is not important.
>> Maybe wrap the re.sub function with your own so all
>> the .encode() calls are in one location? As we move to Python
>> 3 we will have fewer changes to make.
>>
>>
>>> -----Original Message-----
>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Liming Gao
>>> Sent: Sunday, May 20, 2018 9:52 PM
>>> To: edk2-devel@lists.01.org
>>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
>>>
>>> FormatDosFiles.py is added to clean up DOS source files. It is based on
>>> the rules defined in the EDKII C Coding Standards Specification.
>>> 5.1.2 Do not use tab characters
>>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
>>> 5.1.7 All files must end with CRLF
>>> No trailing white space in one line. (To be added in spec)
>>>
>>> The source files in the edk2 project with the postfixes below are in DOS format.
>>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
>>> .txt .bat .py
>>>
>>> The package maintainer can use this script to clean up all files in their
>>> package. The preferred way is to create one patch per package.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Liming Gao <liming.gao@intel.com>
>>> ---
>>> BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 93 insertions(+)
>>> create mode 100644 BaseTools/Scripts/FormatDosFiles.py
>>>
>>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
>>> new file mode 100644
>>> index 0000000..c3a5476
>>> --- /dev/null
>>> +++ b/BaseTools/Scripts/FormatDosFiles.py
>>> @@ -0,0 +1,93 @@
>>> +# @file FormatDosFiles.py
>>> +# This script formats the source files to follow DOS style.
>>> +# It supports both Python 2.x and Python 3.x.
>>> +#
>>> +# Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
>>> +#
>>> +# This program and the accompanying materials
>>> +# are licensed and made available under the terms and conditions of the BSD License
>>> +# which accompanies this distribution. The full text of the license may be found at
>>> +# http://opensource.org/licenses/bsd-license.php
>>> +#
>>> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
>>> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
>>> +#
>>> +
>>> +#
>>> +# Import Modules
>>> +#
>>> +import argparse
>>> +import os
>>> +import os.path
>>> +import re
>>> +import sys
>>> +
>>> +"""
>>> +Differences between strings in Python 2 and Python 3:
>>> +
>>> +Strings differ considerably between Python 2 and Python 3.
>>> +
>>> +In Python 2, there are two string types: unicode strings (unicode type) and 8-bit strings (str type).
>>> +    us = u"abcd",
>>> +    a unicode string, which is internally stored as Unicode code points.
>>> +    s = "abcd", s = b"abcd", s = r"abcd",
>>> +    all of them are 8-bit strings, which are internally stored as bytes.
>>> +
>>> +In Python 3, a new type called bytes replaces the 8-bit string, and the str type is a unicode string.
>>> +    s = "abcd", s = u"abcd", s = r"abcd",
>>> +    all of them are str type, which is internally stored as Unicode code points.
>>> +    bs = b"abcd",
>>> +    bytes type, which is internally stored as bytes.
>>> +
>>> +In Python 2, the two string types can be mixed, but in Python 3 they cannot,
>>> +which means the pattern and the content in an re match must be the same type in Python 3.
>>> +The function FormatFiles reads each file in binary mode, so the content is bytes type and the pattern must also be bytes type.
>>> +As a result, encode() is added to make the code compatible with both Python 2 and Python 3.
>>> +
>>> +Differences between encode and decode in Python 2 and Python 3:
>>> +The built-in methods str.encode(encoding) and str.decode(encoding) are used to convert between 8-bit strings and unicode strings.
>>> +
>>> +In Python 2
>>> +    encode converts unicode type to str type, and decode does the reverse. The default encoding is ASCII.
>>> +    For example: s = us.encode()
>>> +    If us is str type, the code also works: it is first converted to unicode type,
>>> +    so in this situation the call is equivalent to s = us.decode().encode().
>>> +
>>> +In Python 3
>>> +    encode converts str type to bytes type, and decode does the reverse. The default encoding is UTF-8.
>>> +    For example:
>>> +    bs = s.encode(). Only str type has an encode method, so it cannot be used wrongly. The same applies to decode.
>>> +
>>> +In conclusion:
>>> +    This code works the same in Python 2.7 and Python 3.6 environments as long as the re patterns contain only ASCII characters.
>>> +
>>> +"""
>>> +def FormatFiles():
>>> +    parser = argparse.ArgumentParser()
>>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
>>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
>>> +    args = parser.parse_args()
>>> +    filelist = []
>>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
>>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
>>> +            filelist.append(os.path.join(dirpath, filename))
>>> +    for file in filelist:
>>> +        fd = open(file, 'rb')
>>> +        content = fd.read()
>>> +        fd.close()
>>> +        # Convert the line endings to CRLF
>>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
>>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
>>> +        # Add a new empty line if the file is not end with one
>>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
>>> +        # Remove trailing white spaces
>>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
>>> +        # Replace '\t' with two spaces
>>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
>>> +        fd = open(file, 'wb')
>>> +        fd.write(content)
>>> +        fd.close()
>>> +        print(file)
>>> +
>>> +if __name__ == "__main__":
>>> +    sys.exit(FormatFiles())
>>> \ No newline at end of file
>>> --
>>> 2.8.0.windows.1
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
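
A minimal sketch of the review suggestions quoted above: the with statement for file I/O, a set for the file list, and one wrapper around re.sub that owns the .encode() calls. The helper names sub_bytes, to_dos_format, collect_files, and format_file are hypothetical and not part of the patch; the substitutions are the same ones the patch applies, in the same order.

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    # Hypothetical wrapper: the only place where str patterns become bytes.
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def to_dos_format(content):
    # content is bytes, as read from a file opened in binary mode.
    # Convert the line endings to CRLF.
    content = sub_bytes(r'([^\r])\n', r'\1\r\n', content)
    content = sub_bytes(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # Append CRLF if the file does not end with a line ending.
    content = sub_bytes(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing white space.
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # Replace each tab with two spaces.
    content = sub_bytes('\t', '  ', content)
    return content


def collect_files(root, extensions):
    # A set is enough because processing order does not matter.
    files = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                files.add(os.path.join(dirpath, name))
    return files


def format_file(path):
    # The with statement closes the file even if read or write raises.
    with open(path, 'rb') as fd:
        content = fd.read()
    with open(path, 'wb') as fd:
        fd.write(to_dos_format(content))
```

Keeping the byte conversion in sub_bytes means a later move to Python 3 only bytes literals would touch one function.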