From: "Carsey, Jaben"
To: "Gao, Liming", "Kinney, Michael D"
CC: "edk2-devel@lists.01.org"
Date: Thu, 24 May 2018 14:13:47 +0000
Subject: Re: [RFC] Formalize source files to follow DOS format
List-Id: EDK II Development

We could do something like we do for compiler flags... append or overwrite
depending on syntax.

> -----Original Message-----
> From: Gao, Liming
> Sent: Thursday, May 24, 2018 1:35 AM
> To: Kinney, Michael D; Carsey, Jaben
> Cc: edk2-devel@lists.01.org
> Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> Importance: High
>
> Mike:
>   I agree with your comments. For the default file set, this script can
> provide the default extensions. Users can specify more extensions to append
> to the defaults instead of overriding them.
>
> Thanks
> Liming
> >-----Original Message-----
> >From: Kinney, Michael D
> >Sent: Tuesday, May 22, 2018 6:59 AM
> >To: Carsey, Jaben; Kinney, Michael D
> >Cc: Gao, Liming; edk2-devel@lists.01.org
> >Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> >
> >Jaben,
> >
> >Yes. The default behavior is to use the default set, and
> >specifying one or more extensions overrides the
> >default set.
> >
> >Mike
> >
> >> -----Original Message-----
> >> From: Carsey, Jaben
> >> Sent: Monday, May 21, 2018 3:43 PM
> >> To: Kinney, Michael D
> >> Cc: Gao, Liming; edk2-devel@lists.01.org
> >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> >>
> >> Mike,
> >>
> >> Perhaps a default set of file extensions that can be overridden?
> >>
> >> -Jaben
> >>
> >>
> >> > On May 21, 2018, at 3:41 PM, Kinney, Michael D wrote:
> >> >
> >> > Liming,
> >> >
> >> > We have a set of standard flags for tools that
> >> > should always be present.
> >> >
> >> >   --help
> >> >   -v
> >> >   -q
> >> >   --debug
> >> >
> >> > We should also always have the program name,
> >> > description, version, and copyright.
> >> >
> >> > Please see BaseTools/Scripts/BinToPcd.py as
> >> > an example.
> >> >
> >> > It might be useful to have a way to run this tool
> >> > on a single file when BaseTools/Scripts/PatchCheck.py
> >> > reports an issue.
> >> >
> >> > Do you think it would be good to have one option to
> >> > scan a path for file extensions that are documented as
> >> > DOS line endings so the extensions do not have to be
> >> > entered?
> >> >
> >> > Mike
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
> >> >> Sent: Monday, May 21, 2018 7:50 AM
> >> >> To: Gao, Liming; edk2-devel@lists.01.org
> >> >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> >> >>
> >> >> Liming,
> >> >>
> >> >> One PEP 8 thing.
> >> >> Can you change to use the with statement for the file
> >> >> read/write?
> >> >>
> >> >> Other small thoughts.
> >> >> I think that filelist should be changed to a set as
> >> >> order is not important.
> >> >> Maybe wrap the re.sub function with your own so all
> >> >> the .encode() calls are in one location? As we move to
> >> >> Python 3 we will have fewer changes to make.
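The three review suggestions above (a with statement for file I/O, a set
instead of a list, and one wrapper around re.sub) could look roughly like the
sketch below. It is an illustration only, not part of the patch: the names
sub_bytes, format_content, collect_files and format_file are hypothetical, and
the substitutions are copied from the patch in the same order.

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    """Hypothetical wrapper so every .encode() call lives in one place."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def format_content(content):
    """Apply the patch's substitutions, in the patch's order, to a bytes buffer."""
    # Convert the line endings to CRLF
    content = sub_bytes(r'([^\r])\n', r'\1\r\n', content)
    content = sub_bytes(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # Add a line ending if the file does not end with one
    content = sub_bytes(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing white space
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # Replace each tab with two spaces
    content = sub_bytes('\t', '  ', content)
    return content


def collect_files(root, extensions):
    """Gather matching paths into a set, since processing order does not matter."""
    return {
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
        if name.endswith(tuple(extensions))
    }


def format_file(path):
    """Rewrite one file in place; the with blocks close it even on errors."""
    with open(path, 'rb') as fd:
        content = fd.read()
    with open(path, 'wb') as fd:
        fd.write(format_content(content))
```

Keeping the byte handling inside sub_bytes means a later Python 3 cleanup would
touch one function rather than every call site.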
> >> >>
> >> >>
> >> >>> -----Original Message-----
> >> >>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Liming Gao
> >> >>> Sent: Sunday, May 20, 2018 9:52 PM
> >> >>> To: edk2-devel@lists.01.org
> >> >>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
> >> >>>
> >> >>> FormatDosFiles.py is added to clean up dos source files. It is based on
> >> >>> the rules defined in EDKII C Coding Standards Specification.
> >> >>> 5.1.2 Do not use tab characters
> >> >>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> >> >>> 5.1.7 All files must end with CRLF
> >> >>> No trailing white space in one line. (To be added in spec)
> >> >>>
> >> >>> The source files in edk2 project with the below postfix are dos format.
> >> >>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> >> >>> .txt .bat .py
> >> >>>
> >> >>> The package maintainer can use this script to clean up all files in his
> >> >>> package. The preferred way is to create one patch per package.
> >> >>>
> >> >>> Contributed-under: TianoCore Contribution Agreement 1.1
> >> >>> Signed-off-by: Liming Gao
> >> >>> ---
> >> >>>  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
> >> >>>  1 file changed, 93 insertions(+)
> >> >>>  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> >> >>>
> >> >>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
> >> >>> new file mode 100644
> >> >>> index 0000000..c3a5476
> >> >>> --- /dev/null
> >> >>> +++ b/BaseTools/Scripts/FormatDosFiles.py
> >> >>> @@ -0,0 +1,93 @@
> >> >>> +# @file FormatDosFiles.py
> >> >>> +# This script format the source files to follow dos style.
> >> >>> +# It supports Python2.x and Python3.x both.
> >> >>> +#
> >> >>> +# Copyright (c) 2018, Intel Corporation. All rights reserved.
> >> >>> +#
> >> >>> +# This program and the accompanying materials
> >> >>> +# are licensed and made available under the terms and conditions of the BSD License
> >> >>> +# which accompanies this distribution. The full text of the license may be found at
> >> >>> +# http://opensource.org/licenses/bsd-license.php
> >> >>> +#
> >> >>> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> >> >>> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> >> >>> +#
> >> >>> +
> >> >>> +#
> >> >>> +# Import Modules
> >> >>> +#
> >> >>> +import argparse
> >> >>> +import os
> >> >>> +import os.path
> >> >>> +import re
> >> >>> +import sys
> >> >>> +
> >> >>> +"""
> >> >>> +difference of string between python2 and python3:
> >> >>> +
> >> >>> +there is a large difference of string in python2 and python3.
> >> >>> +
> >> >>> +in python2,there are two type string,unicode string (unicode type) and 8-bit string (str type).
> >> >>> +    us = u"abcd",
> >> >>> +    unicode string,which is internally stored as unicode code point.
> >> >>> +    s = "abcd",s = b"abcd",s = r"abcd",
> >> >>> +    all of them are 8-bit string,which is internally stored as bytes.
> >> >>> +
> >> >>> +in python3,a new type called bytes replace 8-bit string,and str type is regarded as unicode string.
> >> >>> +    s = "abcd", s = u"abcd", s = r"abcd",
> >> >>> +    all of them are str type,which is internally stored unicode code point.
> >> >>> +    bs = b"abcd",
> >> >>> +    bytes type,which is internally stored as bytes
> >> >>> +
> >> >>> +in python2 ,the both type string can be mixed use,but in python3 it could not,
> >> >>> +which means the pattern and content in re match should be the same type in python3.
> >> >>> +in function FormatFiles,it read file in binary mode so that the content is bytes type,so the pattern should also be bytes type.
> >> >>> +As a result,I add encode() to make it compatible among python2 and python3.
> >> >>> +
> >> >>> +difference of encode,decode in python2 and python3:
> >> >>> +the builtin function str.encode(encoding) and str.decode(encoding) are used for convert between 8-bit string and unicode string.
> >> >>> +
> >> >>> +in python2
> >> >>> +    encode convert unicode type to str type.decode vice versa.default encoding is ascii.
> >> >>> +    for example: s = us.encode()
> >> >>> +    but if the us is str type,the code will also work.it will be firstly convert to unicode type,
> >> >>> +    in this situation,the call equals s = us.decode().encode().
> >> >>> +
> >> >>> +in python3
> >> >>> +    encode convert str type to bytes type,decode vice versa.default encoding is utf8.
> >> >>> +    for example:
> >> >>> +    bs = s.encode(),only str type has encode method,so that won't be used wrongly.decode is the same.
> >> >>> +
> >> >>> +in conclusion:
> >> >>> +    this code could work the same in python27 and python36 environment as far as the re pattern satisfy ascii character set.
> >> >>> +
> >> >>> +"""
> >> >>> +def FormatFiles():
> >> >>> +    parser = argparse.ArgumentParser()
> >> >>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> >> >>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> >> >>> +    args = parser.parse_args()
> >> >>> +    filelist = []
> >> >>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> >> >>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> >> >>> +            filelist.append(os.path.join(dirpath, filename))
> >> >>> +    for file in filelist:
> >> >>> +        fd = open(file, 'rb')
> >> >>> +        content = fd.read()
> >> >>> +        fd.close()
> >> >>> +        # Convert the line endings to CRLF
> >> >>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> >> >>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> >> >>> +        # Add a new empty line if the file is not end with one
> >> >>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> >> >>> +        # Remove trailing white spaces
> >> >>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> >> >>> +        # Replace '\t' with two spaces
> >> >>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
> >> >>> +        fd = open(file, 'wb')
> >> >>> +        fd.write(content)
> >> >>> +        fd.close()
> >> >>> +        print(file)
> >> >>> +
> >> >>> +if __name__ == "__main__":
> >> >>> +    sys.exit(FormatFiles())
> >> >>> \ No newline at end of file
> >> >>> --
> >> >>> 2.8.0.windows.1
> >> >>>
> >> >>> _______________________________________________
> >> >>> edk2-devel mailing list
> >> >>> edk2-devel@lists.01.org
> >> >>> https://lists.01.org/mailman/listinfo/edk2-devel
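
To make the append-or-override idea from the top of this thread a bit more
concrete, here is a rough sketch. Nearly everything in it is an assumption for
illustration: the '+' prefix syntax, the names DEFAULT_EXTENSIONS,
resolve_extensions and build_parser, and the version and help strings. Only the
extension list (from the patch description), the copyright line (from the
patch header) and the flag names Mike listed come from the thread. It does not
try to mirror BinToPcd.py.

```python
import argparse

# Extensions the patch description lists as DOS format.
DEFAULT_EXTENSIONS = {
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
}


def resolve_extensions(requested):
    """Hypothetical syntax: '+.ext' appends to the defaults, '.ext' overrides them."""
    appended = {ext[1:] for ext in requested if ext.startswith('+')}
    explicit = {ext for ext in requested if not ext.startswith('+')}
    # Any plain extension replaces the default set; '+' entries are always added.
    base = explicit if explicit else set(DEFAULT_EXTENSIONS)
    return base | appended


def build_parser():
    """Parser carrying the standard flags named in the review (placeholder texts)."""
    parser = argparse.ArgumentParser(
        prog='FormatDosFiles',
        description='Convert source files to DOS format.',
        epilog='Copyright (c) 2018, Intel Corporation. All rights reserved.',
    )
    # --help is added by argparse automatically.
    parser.add_argument('--version', action='version',
                        version='%(prog)s 0.1 (placeholder)')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Increase output messages')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Reduce output messages')
    parser.add_argument('--debug', type=int, default=0,
                        help='Set debug level (assumed to be an integer)')
    parser.add_argument('path', help='The path for files to be converted.')
    parser.add_argument('extensions', nargs='*',
                        help="Extensions; '.c' overrides the defaults, '+.c' appends")
    return parser
```

With this sketch, passing only a path uses the default set, passing ".c .h"
restricts the run to those two extensions, and passing "+.md" keeps the defaults
and adds one more.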