From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kinney, Michael D"
To: "Carsey, Jaben", "Gao, Liming", "edk2-devel@lists.01.org", "Kinney, Michael D"
Date: Mon, 21 May 2018 22:41:36 +0000
References: <1526878301-13892-1-git-send-email-liming.gao@intel.com>
MIME-Version: 1.0
Subject: Re: [RFC] Formalize source files to follow DOS format
List-Id: EDK II Development
Content-Type: text/plain; charset="us-ascii"

Liming,

We have a set of standard flags for tools that
should always be present.

  --help
  -v
  -q
  --debug

We should also always have the program name,
description, version, and copyright.

Please see BaseTools/Scripts/BinToPcd.py as
an example.

It might be useful to have a way to run this tool
on a single file when BaseTools/Scripts/PatchCheck.py
reports an issue.

Do you think it would be good to have one option
to scan the path for the file extensions that are
documented as using DOS line endings, so the
extensions do not have to be entered?

Mike

> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
> Sent: Monday, May 21, 2018 7:50 AM
> To: Gao, Liming; edk2-devel@lists.01.org
> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
>
> Liming,
>
> One PEP 8 thing.
> Can you change to use the with statement for the file read/write?
>
> Other small thoughts.
> I think that FileList should be changed to a set as order is not important.
> Maybe wrap the re.sub function with your own so all the .encode() calls are
> in one location? As we move to Python 3 we will have fewer changes to make.
>
>
> > -----Original Message-----
> > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> > Liming Gao
> > Sent: Sunday, May 20, 2018 9:52 PM
> > To: edk2-devel@lists.01.org
> > Subject: [edk2] [RFC] Formalize source files to follow DOS format
> >
> > FormatDosFiles.py is added to clean up dos source files. It is based on
> > the rules defined in EDKII C Coding Standards Specification.
> > 5.1.2 Do not use tab characters
> > 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> > 5.1.7 All files must end with CRLF
> > No trailing white space in one line. (To be added in spec)
> >
> > The source files in edk2 project with the below postfix are dos format.
> > .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> > .txt .bat .py
> >
> > The package maintainer can use this script to clean up all files in his
> > package. The preferred way is to create one patch per package.
> >
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Liming Gao
> > ---
> >  BaseTools/Scripts/FormatDosFiles.py | 93
> > +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 93 insertions(+)
> >  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> >
> > diff --git a/BaseTools/Scripts/FormatDosFiles.py
> > b/BaseTools/Scripts/FormatDosFiles.py
> > new file mode 100644
> > index 0000000..c3a5476
> > --- /dev/null
> > +++ b/BaseTools/Scripts/FormatDosFiles.py
> > @@ -0,0 +1,93 @@
> > +# @file FormatDosFiles.py
> > +# This script formats the source files to follow dos style.
> > +# It supports both Python2.x and Python3.x.
> > +#
> > +# Copyright (c) 2018, Intel Corporation. All rights reserved.
> > +#
> > +# This program and the accompanying materials
> > +# are licensed and made available under the terms and conditions of the BSD License
> > +# which accompanies this distribution. The full text of the license may be found at
> > +# http://opensource.org/licenses/bsd-license.php
> > +#
> > +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> > +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> > +#
> > +
> > +#
> > +# Import Modules
> > +#
> > +import argparse
> > +import os
> > +import os.path
> > +import re
> > +import sys
> > +
> > +"""
> > +difference of string between python2 and python3:
> > +
> > +there is a large difference of string in python2 and python3.
> > +
> > +in python2, there are two string types: unicode string (unicode type) and 8-bit string (str type).
> > +    us = u"abcd",
> > +    unicode string, which is internally stored as unicode code points.
> > +    s = "abcd", s = b"abcd", s = r"abcd",
> > +    all of them are 8-bit strings, which are internally stored as bytes.
> > +
> > +in python3, a new type called bytes replaces the 8-bit string, and str type is regarded as unicode string.
> > +    s = "abcd", s = u"abcd", s = r"abcd",
> > +    all of them are str type, which is internally stored as unicode code points.
> > +    bs = b"abcd",
> > +    bytes type, which is internally stored as bytes
> > +
> > +in python2, both string types can be mixed, but in python3 they cannot,
> > +which means the pattern and content in re match should be the same type in python3.
> > +in function FormatFile, it reads the file in binary mode so that the content is bytes type, so the pattern should also be bytes type.
> > +As a result, I add encode() to make it compatible with both python2 and python3.
> > +
> > +difference of encode, decode in python2 and python3:
> > +the builtin functions str.encode(encoding) and str.decode(encoding) are used to convert between 8-bit string and unicode string.
> > +
> > +in python2
> > +    encode converts unicode type to str type. decode vice versa. default encoding is ascii.
> > +    for example: s = us.encode()
> > +    but if us is str type, the code will also work. it will first be converted to unicode type,
> > +    in this situation, the call equals s = us.decode().encode().
> > +
> > +in python3
> > +    encode converts str type to bytes type, decode vice versa. default encoding is utf8.
> > +    for example:
> > +    bs = s.encode(), only str type has the encode method, so that it won't be used wrongly. decode is the same.
> > +
> > +in conclusion:
> > +    this code works the same in python27 and python36 environments as long as the re pattern stays within the ascii character set.
> > +
> > +"""
> > +def FormatFiles():
> > +    parser = argparse.ArgumentParser()
> > +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> > +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> > +    args = parser.parse_args()
> > +    filelist = []
> > +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> > +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> > +            filelist.append(os.path.join(dirpath, filename))
> > +    for file in filelist:
> > +        fd = open(file, 'rb')
> > +        content = fd.read()
> > +        fd.close()
> > +        # Convert the line endings to CRLF
> > +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> > +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> > +        # Add a new empty line if the file is not end with one
> > +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> > +        # Remove trailing white spaces
> > +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> > +        # Replace '\t' with two spaces
> > +        content = re.sub('\t'.encode(), '  '.encode(), content)
> > +        fd = open(file, 'wb')
> > +        fd.write(content)
> > +        fd.close()
> > +        print(file)
> > +
> > +if __name__ == "__main__":
> > +    sys.exit(FormatFiles())
> > \ No newline at end of file
> > --
> > 2.8.0.windows.1
> >
> > _______________________________________________
> > edk2-devel mailing list
> > edk2-devel@lists.01.org
> > https://lists.01.org/mailman/listinfo/edk2-devel
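
For illustration only, here is a minimal sketch of how the review comments in this thread might be combined: the standard -v, -q and --debug flags (argparse supplies --help itself), program name, description, version and copyright, a path that may be a single file, a default extension list taken from the commit message, the with statement, a set of files, and one wrapper around re.sub. It is written for Python 3. The metadata values, the --version option, and the helper names bytes_sub, format_content and collect_files are assumptions made for this sketch; they are not copied from BinToPcd.py or from the patch.

```python
import argparse
import os
import re

# Hypothetical metadata; the names and values are placeholders for this sketch.
__prog__ = 'FormatDosFiles'
__version__ = '%s Version %s' % (__prog__, '0.1')
__copyright__ = 'Copyright (c) 2018, Intel Corporation. All rights reserved.'
__description__ = 'Convert source files to DOS format.\n'

# Extensions the commit message lists as DOS format in edk2.
DEFAULT_EXTENSIONS = ('.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf',
                      '.dec', '.dsc', '.fdf', '.uni', '.asl', '.aslc', '.vfr',
                      '.idf', '.txt', '.bat', '.py')


def bytes_sub(pattern, repl, content, flags=0):
    """Run re.sub on bytes; every .encode() call lives here."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def format_content(content):
    """Return bytes with CRLF endings, no tabs and no trailing spaces."""
    content = bytes_sub(r'\r?\n', '\r\n', content)   # LF or CRLF -> CRLF
    if content and not content.endswith(b'\r\n'):
        content += b'\r\n'                           # file must end with CRLF
    content = bytes_sub('\t', '  ', content)         # tab -> two spaces
    content = bytes_sub(r' +\r\n', '\r\n', content)  # strip trailing spaces
    return content


def collect_files(path, extensions):
    """Return a set of files; path may be one file or a directory tree."""
    if os.path.isfile(path):
        return {path}
    found = set()
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                found.add(os.path.join(dirpath, name))
    return found


def main(argv=None):
    parser = argparse.ArgumentParser(prog=__prog__,
                                     description=__description__ + __copyright__)
    parser.add_argument('path', help='File or directory to convert.')
    parser.add_argument('extensions', nargs='*', default=DEFAULT_EXTENSIONS,
                        help='Extensions to convert (default: edk2 DOS list).')
    parser.add_argument('--version', action='version', version=__version__)
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Increase output messages')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Reduce output messages')
    parser.add_argument('--debug', type=int, metavar='[0-9]',
                        choices=range(0, 10), default=0, help='Set debug level')
    args = parser.parse_args(argv)

    for name in sorted(collect_files(args.path, args.extensions)):
        with open(name, 'rb') as fd:
            content = fd.read()
        converted = format_content(content)
        if converted != content:
            with open(name, 'wb') as fd:
                fd.write(converted)
            if not args.quiet:
                print(name)
        elif args.verbose:
            print('%s (unchanged)' % name)
        if args.debug > 0:
            print('%s: %d -> %d bytes' % (name, len(content), len(converted)))
    return 0
```

A real script would finish by calling sys.exit(main()). With this layout, main(['MdePkg']) would walk a package using the default extension list, and main(['MdePkg/MdePkg.dec']) would convert one file, which is the case where PatchCheck.py has flagged a single file. Note that this sketch only rewrites files whose content changes, which differs from the patch.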