From: "Gao, Liming"
To: "Kinney, Michael D", "Carsey, Jaben"
CC: "edk2-devel@lists.01.org"
Subject: Re: [RFC] Formalize source files to follow DOS format
Date: Thu, 24 May 2018 08:35:01 +0000
Message-ID: <4A89E2EF3DFEDB4C8BFDE51014F606A14E230CAA@SHSMSX104.ccr.corp.intel.com>
References: <1526878301-13892-1-git-send-email-liming.gao@intel.com>, <05BFD767-DC75-401E-B651-6333815FDDFF@intel.com>
List-Id: EDK II Development

Mike:
  I agree with your comments. For the default file set, the script can provide default extensions. Users can then specify additional extensions, which are appended to the default set instead of overriding it.

Thanks
Liming
>-----Original Message-----
>From: Kinney, Michael D
>Sent: Tuesday, May 22, 2018 6:59 AM
>To: Carsey, Jaben; Kinney, Michael D
>Cc: Gao, Liming; edk2-devel@lists.01.org
>Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
>
>Jaben,
>
>Yes. The default behavior is to use the default set, and
>specifying one or more extensions overrides the
>default set.
>
>Mike
>
>> -----Original Message-----
>> From: Carsey, Jaben
>> Sent: Monday, May 21, 2018 3:43 PM
>> To: Kinney, Michael D
>> Cc: Gao, Liming; edk2-devel@lists.01.org
>> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
>>
>> Mike,
>>
>> Perhaps a default set of file extensions that can be
>> overridden?
>>
>> -Jaben
>>
>>
>> > On May 21, 2018, at 3:41 PM, Kinney, Michael D wrote:
>> >
>> > Liming,
>> >
>> > We have a set of standard flags for tools that
>> > should always be present.
>> >
>> > --help
>> > -v
>> > -q
>> > --debug
>> >
>> > We should also always have the program name,
>> > description, version, and copyright.
>> >
>> > Please see BaseTools/Scripts/BinToPcd.py as
>> > an example.
>> >
>> > It might be useful to have a way to run this tool
>> > on a single file when BaseTools/Scripts/PatchCheck.py
>> > reports an issue.
>> >
>> > Do you think it would be good to have one option to
>> > scan a path for the file extensions that are documented as
>> > using DOS line endings, so the extensions do not have to be
>> > entered?
>> >
>> > Mike
>> >
>> >
>> >> -----Original Message-----
>> >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Carsey, Jaben
>> >> Sent: Monday, May 21, 2018 7:50 AM
>> >> To: Gao, Liming; edk2-devel@lists.01.org
>> >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
>> >>
>> >> Liming,
>> >>
>> >> One PEP 8 thing.
>> >> Can you change to use the with statement for the file read/write?
>> >>
>> >> Other small thoughts.
>> >> I think that FileList should be changed to a set, as order is not important.
>> >> Maybe wrap the re.sub function with your own so all the .encode() calls are in
>> >> one location? As we move to Python 3 we will have fewer changes to make.
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Liming Gao
>> >>> Sent: Sunday, May 20, 2018 9:52 PM
>> >>> To: edk2-devel@lists.01.org
>> >>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
>> >>>
>> >>> FormatDosFiles.py is added to clean up DOS source files. It is based on
>> >>> the rules defined in the EDKII C Coding Standards Specification.
>> >>> 5.1.2 Do not use tab characters
>> >>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
>> >>> 5.1.7 All files must end with CRLF
>> >>> No trailing white space in one line. (To be added in spec)
>> >>>
>> >>> The source files in the edk2 project with the extensions below are in DOS format.
>> >>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
>> >>> .txt .bat .py
>> >>>
>> >>> The package maintainer can use this script to clean up all files in his
>> >>> package.
>> >>> The preferred way is to create one patch per package.
>> >>>
>> >>> Contributed-under: TianoCore Contribution Agreement 1.1
>> >>> Signed-off-by: Liming Gao
>> >>> ---
>> >>> BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
>> >>> 1 file changed, 93 insertions(+)
>> >>> create mode 100644 BaseTools/Scripts/FormatDosFiles.py
>> >>>
>> >>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
>> >>> new file mode 100644
>> >>> index 0000000..c3a5476
>> >>> --- /dev/null
>> >>> +++ b/BaseTools/Scripts/FormatDosFiles.py
>> >>> @@ -0,0 +1,93 @@
>> >>> +# @file FormatDosFiles.py
>> >>> +# This script formats the source files to follow DOS style.
>> >>> +# It supports both Python 2.x and Python 3.x.
>> >>> +#
>> >>> +# Copyright (c) 2018, Intel Corporation. All rights reserved.
>> >>> +#
>> >>> +# This program and the accompanying materials
>> >>> +# are licensed and made available under the terms and conditions of the BSD License
>> >>> +# which accompanies this distribution. The full text of the license may be found at
>> >>> +# http://opensource.org/licenses/bsd-license.php
>> >>> +#
>> >>> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
>> >>> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
>> >>> +#
>> >>> +
>> >>> +#
>> >>> +# Import Modules
>> >>> +#
>> >>> +import argparse
>> >>> +import os
>> >>> +import os.path
>> >>> +import re
>> >>> +import sys
>> >>> +
>> >>> +"""
>> >>> +Difference of strings between Python 2 and Python 3:
>> >>> +
>> >>> +Strings differ significantly between Python 2 and Python 3.
>> >>> +
>> >>> +In Python 2, there are two string types: unicode string (unicode type) and 8-bit string (str type).
>> >>> +    us = u"abcd",
>> >>> +    unicode string, which is internally stored as unicode code points.
>> >>> +    s = "abcd", s = b"abcd", s = r"abcd",
>> >>> +    all of them are 8-bit strings, which are internally stored as bytes.
>> >>> +
>> >>> +In Python 3, a new type called bytes replaces the 8-bit string, and the str type is regarded as a unicode string.
>> >>> +    s = "abcd", s = u"abcd", s = r"abcd",
>> >>> +    all of them are str type, which is internally stored as unicode code points.
>> >>> +    bs = b"abcd",
>> >>> +    bytes type, which is internally stored as bytes.
>> >>> +
>> >>> +In Python 2, the two string types can be mixed, but in Python 3 they cannot,
>> >>> +which means the pattern and the content in an re match must be the same type in Python 3.
>> >>> +The function FormatFiles reads each file in binary mode, so the content is bytes type and the pattern must also be bytes type.
>> >>> +As a result, encode() is added to make the script compatible with both Python 2 and Python 3.
>> >>> +
>> >>> +Difference of encode and decode between Python 2 and Python 3:
>> >>> +The encode(encoding) and decode(encoding) methods convert between 8-bit strings and unicode strings.
>> >>> +
>> >>> +In Python 2:
>> >>> +    encode converts unicode type to str type; decode does the reverse. The default encoding is ascii.
>> >>> +    For example: s = us.encode()
>> >>> +    If us is str type, the code also works: it is first converted to unicode type,
>> >>> +    so in this situation the call equals s = us.decode().encode().
>> >>> +
>> >>> +In Python 3:
>> >>> +    encode converts str type to bytes type; decode does the reverse. The default encoding is utf8.
>> >>> +    For example:
>> >>> +    bs = s.encode(). Only str type has the encode method, so it cannot be misused. The same applies to decode.
>> >>> +
>> >>> +In conclusion:
>> >>> +    this code works the same in Python 2.7 and Python 3.6 environments as long as the re patterns contain only ASCII characters.
>> >>> +
>> >>> +"""
>> >>> +def FormatFiles():
>> >>> +    parser = argparse.ArgumentParser()
>> >>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
>> >>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
>> >>> +    args = parser.parse_args()
>> >>> +    filelist = []
>> >>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
>> >>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
>> >>> +            filelist.append(os.path.join(dirpath, filename))
>> >>> +    for file in filelist:
>> >>> +        fd = open(file, 'rb')
>> >>> +        content = fd.read()
>> >>> +        fd.close()
>> >>> +        # Convert the line endings to CRLF
>> >>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
>> >>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
>> >>> +        # Add a CRLF at the end of the file if it does not end with one
>> >>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
>> >>> +        # Remove trailing white spaces
>> >>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
>> >>> +        # Replace '\t' with two spaces
>> >>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
>> >>> +        fd = open(file, 'wb')
>> >>> +        fd.write(content)
>> >>> +        fd.close()
>> >>> +        print(file)
>> >>> +
>> >>> +if __name__ == "__main__":
>> >>> +    sys.exit(FormatFiles())
>> >>> \ No newline at end of file
>> >>> --
>> >>> 2.8.0.windows.1
>> >>>
>> >>> _______________________________________________
>> >>> edk2-devel mailing list
>> >>> edk2-devel@lists.01.org
>> >>> https://lists.01.org/mailman/listinfo/edk2-devel
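
The review comments in this thread could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the posted patch and not a revised version from the author. It uses the with statement for file read/write, keeps the file list in a set, puts every .encode() call in one wrapper around re.sub, accepts a single file as well as a directory, and appends extensions given on the command line to a default set taken from the RFC text. The names bytes_sub, format_content, collect_files, build_parser and main, the -e/--extension option, and the long spellings --verbose and --quiet are hypothetical; the thread only names --help, -v, -q and --debug, and does not say how they should behave.

```python
import argparse
import os
import re

# Extensions the RFC lists as DOS-format files in edk2.
DEFAULT_EXTENSIONS = (
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
)


def bytes_sub(pattern, repl, content, flags=0):
    # Hypothetical wrapper: the only place that converts patterns to bytes.
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def format_content(content):
    # Same substitutions, in the same order, as the posted patch.
    # 1. LF that is not preceded by CR becomes CRLF.
    content = bytes_sub(r'([^\r])\n', r'\1\r\n', content)
    # 2. LF standing alone at the start of a line becomes CRLF.
    content = bytes_sub(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # 3. Make sure the buffer ends with CRLF.
    content = bytes_sub(r'([^\r\n])$', r'\1\r\n', content)
    # 4. Drop spaces and tabs in front of a line ending.
    content = bytes_sub(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # 5. Replace each remaining tab with two spaces.
    content = bytes_sub('\t', '  ', content)
    return content


def collect_files(path, extensions):
    # A single file is accepted as-is; a directory is walked recursively.
    if os.path.isfile(path):
        return {path}
    suffixes = tuple(extensions)
    return {os.path.join(dirpath, name)
            for dirpath, _dirnames, filenames in os.walk(path)
            for name in filenames if name.endswith(suffixes)}


def build_parser():
    parser = argparse.ArgumentParser(
        prog='FormatDosFiles',
        description='Convert source files to DOS format.')
    parser.add_argument('path', help='File or directory to convert.')
    # Assumed option name; repeatable, values are added to the defaults.
    parser.add_argument('-e', '--extension', dest='extensions',
                        action='append', default=[],
                        help='Extra file extension, for example .md')
    # -v and --debug are accepted but unused in this sketch.
    parser.add_argument('-v', '--verbose', action='store_true')
    parser.add_argument('-q', '--quiet', action='store_true')
    parser.add_argument('--debug', action='store_true')
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    extensions = set(DEFAULT_EXTENSIONS) | set(args.extensions)
    for path in sorted(collect_files(args.path, extensions)):
        with open(path, 'rb') as fd:
            content = fd.read()
        with open(path, 'wb') as fd:
            fd.write(format_content(content))
        if not args.quiet:
            print(path)
    return 0
```

For example, main(['SomePkg', '-e', '.md']) would convert files with the default extensions plus .md under a directory named SomePkg (a placeholder path), and main(['SomePkg/Foo.c']) would convert one file. Keeping the substitutions in format_content, separate from file I/O, also makes them easy to check on small byte strings.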