From: "Carsey, Jaben"
To: "Kinney, Michael D"
CC: "Gao, Liming", "edk2-devel@lists.01.org"
Date: Mon, 21 May 2018 22:43:26 +0000
Subject: Re: [RFC] Formalize source files to follow DOS format
Message-ID: <05BFD767-DC75-401E-B651-6333815FDDFF@intel.com>
References: <1526878301-13892-1-git-send-email-liming.gao@intel.com>
List-Id: EDK II Development

Mike,

Perhaps a default set of file extensions that can be overridden?

-Jaben

> On May 21, 2018, at 3:41 PM, Kinney, Michael D wrote:
>
> Liming,
>
> We have a set of standard flags for tools that
> should always be present.
>
>   --help
>   -v
>   -q
>   --debug
>
> We should also always have the program name,
> description, version, and copyright.
>
> Please see BaseTools/Scripts/BinToPcd.py as
> an example.
>
> It might be useful to have a way to run this tool
> on a single file when BaseTools/Scripts/PatchCheck.py
> reports an issue.
>
> Do you think it would be good to have one option to
> scan a path for the file extensions that are documented as
> using DOS line endings, so the extensions do not have to be
> entered?
>
> Mike
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
>> Sent: Monday, May 21, 2018 7:50 AM
>> To: Gao, Liming; edk2-devel@lists.01.org
>> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
>>
>> Liming,
>>
>> One PEP 8 thing.
>> Can you change to use the with statement for the file read/write?
>>
>> Other small thoughts.
>> I think that FileList should be changed to a set, as order is not important.
>> Maybe wrap the re.sub function with your own so all the .encode() calls
>> are in one location? As we move to Python 3 we will have fewer changes to make.
>>
>>> -----Original Message-----
>>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Liming Gao
>>> Sent: Sunday, May 20, 2018 9:52 PM
>>> To: edk2-devel@lists.01.org
>>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
>>>
>>> FormatDosFiles.py is added to clean up DOS source files.
>>> It is based on the rules defined in the EDKII C Coding Standards Specification:
>>> 5.1.2 Do not use tab characters
>>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
>>> 5.1.7 All files must end with CRLF
>>> No trailing white space in one line. (To be added in spec)
>>>
>>> The source files in the edk2 project with the following extensions are in DOS format:
>>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
>>> .txt .bat .py
>>>
>>> The package maintainer can use this script to clean up all files in his
>>> package. The preferred way is to create one patch per package.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Liming Gao
>>> ---
>>>  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 93 insertions(+)
>>>  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
>>>
>>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
>>> new file mode 100644
>>> index 0000000..c3a5476
>>> --- /dev/null
>>> +++ b/BaseTools/Scripts/FormatDosFiles.py
>>> @@ -0,0 +1,93 @@
>>> +# @file FormatDosFiles.py
>>> +# This script formats the source files to follow DOS style.
>>> +# It supports both Python 2.x and Python 3.x.
>>> +#
>>> +# Copyright (c) 2018, Intel Corporation. All rights reserved.
>>> +#
>>> +# This program and the accompanying materials
>>> +# are licensed and made available under the terms and conditions of the BSD License
>>> +# which accompanies this distribution. The full text of the license may be found at
>>> +# http://opensource.org/licenses/bsd-license.php
>>> +#
>>> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
>>> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
>>> +#
>>> +
>>> +#
>>> +# Import Modules
>>> +#
>>> +import argparse
>>> +import os
>>> +import os.path
>>> +import re
>>> +import sys
>>> +
>>> +"""
>>> +Differences in strings between Python 2 and Python 3:
>>> +
>>> +Strings differ substantially between Python 2 and Python 3.
>>> +
>>> +In Python 2 there are two string types: unicode strings (unicode type) and 8-bit strings (str type).
>>> +    us = u"abcd"
>>> +    is a unicode string, which is stored internally as Unicode code points.
>>> +    s = "abcd", s = b"abcd", s = r"abcd"
>>> +    are all 8-bit strings, which are stored internally as bytes.
>>> +
>>> +In Python 3 a new type called bytes replaces the 8-bit string, and the str type is a unicode string.
>>> +    s = "abcd", s = u"abcd", s = r"abcd"
>>> +    are all str type, which is stored internally as Unicode code points.
>>> +    bs = b"abcd"
>>> +    is bytes type, which is stored internally as bytes.
>>> +
>>> +In Python 2 the two string types can be mixed, but in Python 3 they cannot,
>>> +which means the pattern and the content in an re match must be the same type in Python 3.
>>> +The function FormatFiles reads each file in binary mode, so the content is bytes type and the pattern must also be bytes type.
>>> +As a result, I add encode() to make the code compatible with both Python 2 and Python 3.
>>> +
>>> +Differences in encode and decode between Python 2 and Python 3:
>>> +The built-in methods str.encode(encoding) and str.decode(encoding) are used to convert between 8-bit strings and unicode strings.
>>> +
>>> +In Python 2
>>> +    encode converts unicode type to str type, and decode does the reverse. The default encoding is ascii.
>>> +    For example: s = us.encode()
>>> +    If us is str type the code also works: it is first converted to unicode type,
>>> +    so in this situation the call is equivalent to s = us.decode().encode().
>>> +
>>> +In Python 3
>>> +    encode converts str type to bytes type, and decode does the reverse. The default encoding is utf8.
>>> +    For example:
>>> +    bs = s.encode(). Only the str type has an encode method, so it cannot be misused. The same applies to decode, which exists only on the bytes type.
>>> +
>>> +In conclusion:
>>> +    this code works the same in Python 2.7 and Python 3.6 environments as long as the re patterns contain only ASCII characters.
>>> +
>>> +"""
>>> +def FormatFiles():
>>> +    parser = argparse.ArgumentParser()
>>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
>>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
>>> +    args = parser.parse_args()
>>> +    filelist = []
>>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
>>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
>>> +            filelist.append(os.path.join(dirpath, filename))
>>> +    for file in filelist:
>>> +        fd = open(file, 'rb')
>>> +        content = fd.read()
>>> +        fd.close()
>>> +        # Convert the line endings to CRLF
>>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
>>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
>>> +        # Add a new empty line if the file does not end with one
>>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
>>> +        # Remove trailing white spaces
>>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
>>> +        # Replace '\t' with two spaces
>>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
>>> +        fd = open(file, 'wb')
>>> +        fd.write(content)
>>> +        fd.close()
>>> +        print(file)
>>> +
>>> +if __name__ == "__main__":
>>> +    sys.exit(FormatFiles())
>>> \ No newline at end of file
>>> --
>>> 2.8.0.windows.1
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
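
For illustration only, here is a rough sketch of how the suggestions in this thread might fit together: a default extension list that can be overridden, a single-file mode, the with statement for file I/O, a set for the file list, and one wrapper that holds all the .encode() calls. The names DEFAULT_EXTENSIONS, sub_bytes, format_content, collect_files and the -e/--ext option are hypothetical and not part of the patch. The flag handling is my own guess and is not copied from BinToPcd.py. The substitutions themselves are unchanged from the patch.

```python
import argparse
import os
import re

# Hypothetical default list: the extensions named in the commit message.
DEFAULT_EXTENSIONS = (
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
)


def sub_bytes(pattern, repl, content, flags=0):
    # Hypothetical wrapper: the only place where .encode() is called.
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def format_content(content):
    # Same substitutions, in the same order, as the patch.
    # Convert the line endings to CRLF
    content = sub_bytes(r'([^\r])\n', r'\1\r\n', content)
    content = sub_bytes(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # Add CRLF at the end of the file if it is missing
    content = sub_bytes(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing white space
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # Replace each tab with two spaces
    content = sub_bytes('\t', '  ', content)
    return content


def collect_files(path, extensions):
    # A set, since order does not matter. A single file is accepted too.
    extensions = tuple(extensions)
    if os.path.isfile(path):
        return {path} if path.endswith(extensions) else set()
    return {
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(path)
        for name in filenames
        if name.endswith(extensions)
    }


def main(argv=None):
    parser = argparse.ArgumentParser(
        prog='FormatDosFiles',
        description='Convert source files to DOS format.')
    parser.add_argument('path', help='File or directory to convert.')
    parser.add_argument('-e', '--ext', action='append',
                        help='Extension to convert; repeat for several. '
                             'Overrides the default list.')
    # -v and --debug are accepted here but not wired to any logging.
    parser.add_argument('-v', '--verbose', action='store_true')
    parser.add_argument('-q', '--quiet', action='store_true')
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args(argv)

    extensions = args.ext or DEFAULT_EXTENSIONS
    for name in sorted(collect_files(args.path, extensions)):
        with open(name, 'rb') as fd:
            content = fd.read()
        with open(name, 'wb') as fd:
            fd.write(format_content(content))
        if not args.quiet:
            print(name)
    return 0
```

With something like this, main(['MdePkg']) would use the default list, and main(['MdePkg', '-e', '.c', '-e', '.h']) would override it. Keeping format_content separate from the file handling also makes it possible to check the substitutions on byte strings without touching the disk.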