From: "Carsey, Jaben"
To: "Gao, Liming", "edk2-devel@lists.01.org"
Date: Thu, 24 May 2018 14:13:09 +0000
Subject: Re: [RFC] Formalize source files to follow DOS format
References: <1526878301-13892-1-git-send-email-liming.gao@intel.com> <4A89E2EF3DFEDB4C8BFDE51014F606A14E230C9B@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <4A89E2EF3DFEDB4C8BFDE51014F606A14E230C9B@SHSMSX104.ccr.corp.intel.com>
List-Id: EDK II Development

Follow PEP 8 for coding style.

The technical benefit is that if an exception occurs, we still close the file.

> -----Original Message-----
> From: Gao, Liming
> Sent: Thursday, May 24, 2018 1:31 AM
> To: Carsey, Jaben; edk2-devel@lists.01.org
> Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> Importance: High
>
> Jaben:
>   What difference does the with statement make for file read/write?
>
>   Besides, we use .encode() here to support python 3. After we move to
> python 3, this script will not need to change.
>
> Thanks
> Liming
> >-----Original Message-----
> >From: Carsey, Jaben
> >Sent: Monday, May 21, 2018 10:50 PM
> >To: Gao, Liming; edk2-devel@lists.01.org
> >Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> >
> >Liming,
> >
> >One Pep8 thing.
> >Can you change to use the with statement for the file read/write?
> >
> >Other small thoughts.
> >I think that FileList should be changed to a set as order is not important.
> >Maybe wrap the re.sub function with your own so all the .encode() are in
> >one location?
> >As we move to python 3 we will have fewer changes to make.
> >
> >
> >> -----Original Message-----
> >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> >> Liming Gao
> >> Sent: Sunday, May 20, 2018 9:52 PM
> >> To: edk2-devel@lists.01.org
> >> Subject: [edk2] [RFC] Formalize source files to follow DOS format
> >>
> >> FormatDosFiles.py is added to clean up dos source files. It is based on
> >> the rules defined in EDKII C Coding Standards Specification.
> >> 5.1.2 Do not use tab characters
> >> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> >> 5.1.7 All files must end with CRLF
> >> No trailing white space in one line. (To be added in spec)
> >>
> >> The source files in edk2 project with the below postfix are dos format.
> >> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> >> .txt .bat .py
> >>
> >> The package maintainer can use this script to clean up all files in his
> >> package. The preferred way is to create one patch per package.
> >>
> >> Contributed-under: TianoCore Contribution Agreement 1.1
> >> Signed-off-by: Liming Gao
> >> ---
> >>  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 93 insertions(+)
> >>  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> >>
> >> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
> >> new file mode 100644
> >> index 0000000..c3a5476
> >> --- /dev/null
> >> +++ b/BaseTools/Scripts/FormatDosFiles.py
> >> @@ -0,0 +1,93 @@
> >> +# @file FormatDosFiles.py
> >> +# This script formats the source files to follow dos style.
> >> +# It supports both Python2.x and Python3.x.
> >> +#
> >> +# Copyright (c) 2018, Intel Corporation. All rights reserved.
> >> +#
> >> +# This program and the accompanying materials
> >> +# are licensed and made available under the terms and conditions of the BSD License
> >> +# which accompanies this distribution. The full text of the license may be found at
> >> +# http://opensource.org/licenses/bsd-license.php
> >> +#
> >> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> >> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> >> +#
> >> +
> >> +#
> >> +# Import Modules
> >> +#
> >> +import argparse
> >> +import os
> >> +import os.path
> >> +import re
> >> +import sys
> >> +
> >> +"""
> >> +difference of string between python2 and python3:
> >> +
> >> +there is a large difference of string in python2 and python3.
> >> +
> >> +in python2, there are two string types: unicode string (unicode type) and 8-bit string (str type).
> >> +    us = u"abcd",
> >> +    unicode string, which is internally stored as unicode code points.
> >> +    s = "abcd", s = b"abcd", s = r"abcd",
> >> +    all of them are 8-bit strings, which are internally stored as bytes.
> >> +
> >> +in python3, a new type called bytes replaces the 8-bit string, and the str type is regarded as unicode string.
> >> +    s = "abcd", s = u"abcd", s = r"abcd",
> >> +    all of them are str type, which is internally stored as unicode code points.
> >> +    bs = b"abcd",
> >> +    bytes type, which is internally stored as bytes
> >> +
> >> +in python2, both string types can be mixed, but in python3 they cannot,
> >> +which means the pattern and content in re match should be the same type in python3.
> >> +in function FormatFiles, the file is read in binary mode so that the content is bytes type, so the pattern should also be bytes type.
> >> +As a result, I add encode() to make it compatible with both python2 and python3.
> >> +
> >> +difference of encode, decode in python2 and python3:
> >> +the builtin functions str.encode(encoding) and str.decode(encoding) are used to convert between 8-bit string and unicode string.
> >> +
> >> +in python2
> >> +    encode converts unicode type to str type, decode vice versa. default encoding is ascii.
> >> +    for example: s = us.encode()
> >> +    but if us is str type, the code will also work. it will first be converted to unicode type,
> >> +    in this situation, the call equals s = us.decode().encode().
> >> +
> >> +in python3
> >> +    encode converts str type to bytes type, decode vice versa. default encoding is utf8.
> >> +    for example:
> >> +    bs = s.encode(), only str type has the encode method, so it won't be used wrongly. decode is the same.
> >> +
> >> +in conclusion:
> >> +    this code could work the same in python27 and python36 environments as long as the re pattern stays within the ascii character set.
> >> +
> >> +"""
> >> +def FormatFiles():
> >> +    parser = argparse.ArgumentParser()
> >> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> >> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> >> +    args = parser.parse_args()
> >> +    filelist = []
> >> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> >> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> >> +            filelist.append(os.path.join(dirpath, filename))
> >> +    for file in filelist:
> >> +        fd = open(file, 'rb')
> >> +        content = fd.read()
> >> +        fd.close()
> >> +        # Convert the line endings to CRLF
> >> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> >> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> >> +        # Add a new empty line if the file does not end with one
> >> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> >> +        # Remove trailing white spaces
> >> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> >> +        # Replace '\t' with two spaces
> >> +        content = re.sub('\t'.encode(), '  '.encode(), content)
> >> +        fd = open(file, 'wb')
> >> +        fd.write(content)
> >> +        fd.close()
> >> +        print(file)
> >> +
> >> +if __name__ == "__main__":
> >> +    sys.exit(FormatFiles())
> >> \ No newline at end of file
> >> --
> >> 2.8.0.windows.1
> >>
> >> _______________________________________________
> >> edk2-devel mailing list
> >> edk2-devel@lists.01.org
> >> https://lists.01.org/mailman/listinfo/edk2-devel
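
For reference, the three review suggestions in this thread (a with statement for file read/write, a set for the file list, and one wrapper around re.sub so every .encode() lives in a single place) could be sketched roughly as below. This is only an illustration under assumptions, not the patch as submitted or merged: the helper names sub_bytes, format_content, collect_files and format_file are hypothetical, and the regular expressions are copied from the quoted patch.

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    """Hypothetical wrapper: the only place where .encode() is called."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def format_content(content):
    """Apply the substitutions from the quoted patch to a bytes buffer."""
    # Convert the line endings to CRLF
    content = sub_bytes(r'([^\r])\n', r'\1\r\n', content)
    content = sub_bytes(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # Add a final CRLF if the content does not end with one
    content = sub_bytes(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing white space
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # Replace each tab with two spaces
    content = sub_bytes('\t', '  ', content)
    return content


def collect_files(path, extensions):
    """Return a set of matching paths, since processing order does not matter."""
    return {
        os.path.join(dirpath, filename)
        for dirpath, _dirnames, filenames in os.walk(path)
        for filename in filenames
        if filename.endswith(tuple(extensions))
    }


def format_file(path):
    """Rewrite one file in place."""
    # 'with' closes the file even if read() or write() raises an exception.
    with open(path, 'rb') as fd:
        content = fd.read()
    with open(path, 'wb') as fd:
        fd.write(format_content(content))
```

With these helpers, format_content(b"a\tb  \nc") returns b"a  b\r\nc\r\n". Keeping the bytes conversion inside sub_bytes means a later move to Python 3 only would touch one function.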