public inbox for devel@edk2.groups.io
From: Liming Gao <liming.gao@intel.com>
To: edk2-devel@lists.01.org
Subject: [RFC] Formalize source files to follow DOS format
Date: Mon, 21 May 2018 12:51:41 +0800
Message-ID: <1526878301-13892-1-git-send-email-liming.gao@intel.com>

FormatDosFiles.py is added to clean up DOS-format source files. It is based
on the rules defined in the EDKII C Coding Standards Specification:
5.1.2 Do not use tab characters
5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
5.1.7 All files must end with CRLF
No trailing white space on any line. (To be added in spec)
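
As a rough illustration of the four rules above (not part of this patch), a
read-only check on a file's raw bytes could look like the sketch below. The
function name find_violations and the returned labels are hypothetical and
chosen only for this example.

```python
import re

def find_violations(content):
    """Return labels for the rules that the given file content (bytes) breaks."""
    violations = []
    # 5.1.2: no tab characters anywhere in the file.
    if b'\t' in content:
        violations.append('5.1.2 tab character')
    # 5.1.6: every line ending must be CRLF, so flag an LF that is not
    # preceded by CR, or a CR that is not followed by LF.
    if re.search(br'(?<!\r)\n|\r(?!\n)', content):
        violations.append('5.1.6 non-CRLF line ending')
    # 5.1.7: a non-empty file must end with CRLF.
    if content and not content.endswith(b'\r\n'):
        violations.append('5.1.7 missing final CRLF')
    # Proposed rule: no spaces or tabs right before a line ending or EOF.
    if re.search(br'[ \t]+\r?\n', content) or re.search(br'[ \t]+\Z', content):
        violations.append('trailing white space')
    return violations

print(find_violations(b'\tint a; \n'))
```

A check like this could be used to preview which files the script would
change before running it on a whole package.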

Source files in the edk2 project with the following extensions use DOS format:
.h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
.txt .bat .py

A package maintainer can use this script to clean up all files in their
package. The preferred way is to create one patch per package.

Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Liming Gao <liming.gao@intel.com>
---
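
For reviewers who want to try the substitutions without touching any files,
here is a minimal sketch that applies the same five re.sub steps as the patch
to an in-memory bytes value. The helper name to_dos is hypothetical, and the
br'' literals stand in for the r'...'.encode() calls used in the script; the
two forms should be equivalent for ASCII-only patterns.

```python
import re

def to_dos(content):
    """Apply the patch's substitutions to file content given as bytes."""
    # Convert bare LF to CRLF. An LF that directly follows another converted
    # LF is missed by the first pattern and picked up by the second one.
    content = re.sub(br'([^\r])\n', br'\1\r\n', content)
    content = re.sub(br'^\n', br'\r\n', content, flags=re.MULTILINE)
    # Append CRLF if the content does not already end with a line ending.
    content = re.sub(br'([^\r\n])$', br'\1\r\n', content)
    # Remove spaces and tabs that sit directly before a CRLF.
    content = re.sub(br'[ \t]+(\r\n)', br'\1', content, flags=re.MULTILINE)
    # Replace each remaining tab with two spaces.
    content = re.sub(b'\t', b'  ', content)
    return content

sample = b'a\n\n\tb  '
print(repr(to_dos(sample)))
```

Running the function a second time on its own output should leave the
content unchanged, which is a quick way to sanity-check the patterns.
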
 BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)
 create mode 100644 BaseTools/Scripts/FormatDosFiles.py

diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
new file mode 100644
index 0000000..c3a5476
--- /dev/null
+++ b/BaseTools/Scripts/FormatDosFiles.py
@@ -0,0 +1,93 @@
+# @file FormatDosFiles.py
+# This script formats source files to follow DOS style.
+# It supports both Python 2.x and Python 3.x.
+#
+#  Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
+#
+#  This program and the accompanying materials
+#  are licensed and made available under the terms and conditions of the BSD License
+#  which accompanies this distribution.  The full text of the license may be found at
+#  http://opensource.org/licenses/bsd-license.php
+#
+#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
+#
+
+#
+# Import Modules
+#
+import argparse
+import os
+import os.path
+import re
+import sys
+
+"""
+String differences between Python 2 and Python 3:
+
+String handling differs significantly between Python 2 and Python 3.
+
+Python 2 has two string types: unicode strings (unicode type) and 8-bit strings (str type).
+    us = u"abcd"
+    is a unicode string, stored internally as Unicode code points.
+    s = "abcd", s = b"abcd", s = r"abcd"
+    are all 8-bit strings, stored internally as bytes.
+
+In Python 3, a new bytes type replaces the 8-bit string, and the str type is a unicode string.
+    s = "abcd", s = u"abcd", s = r"abcd"
+    are all of str type, stored internally as Unicode code points.
+    bs = b"abcd"
+    is of bytes type, stored internally as bytes.
+
+In Python 2 the two string types can be mixed, but in Python 3 they cannot,
+which means the pattern and the content passed to re must be of the same type in Python 3.
+FormatFiles reads each file in binary mode, so the content is of bytes type and the pattern must be bytes as well.
+As a result, encode() is applied to each pattern to keep the code compatible with both Python 2 and Python 3.
+
+Differences in encode and decode between Python 2 and Python 3:
+The methods encode(encoding) and decode(encoding) convert between 8-bit strings and unicode strings.
+
+In Python 2:
+    encode converts the unicode type to the str type, and decode does the reverse. The default encoding is ascii.
+    For example: s = us.encode()
+    If us is of str type the call still works: it is first implicitly converted to the unicode type,
+    so the call is equivalent to s = us.decode().encode().
+
+In Python 3:
+    encode converts the str type to the bytes type, and decode does the reverse. The default encoding is utf8.
+    For example:
+    bs = s.encode(). Only the str type has an encode method, so it cannot be misused; likewise only bytes has decode.
+
+In conclusion:
+    This code works the same in Python 2.7 and Python 3.6 as long as the re patterns contain only ASCII characters.
+
+"""
+def FormatFiles():
+    parser = argparse.ArgumentParser()
+    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
+    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
+    args = parser.parse_args()
+    filelist = []
+    for dirpath, dirnames, filenames in os.walk(args.path[0]):
+        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
+            filelist.append(os.path.join(dirpath, filename))
+    for file in filelist:
+        fd = open(file, 'rb')
+        content = fd.read()
+        fd.close()
+        # Convert the line endings to CRLF
+        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
+        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
+        # Append CRLF if the file does not already end with a line ending
+        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
+        # Remove trailing white spaces
+        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
+        # Replace '\t' with two spaces
+        content = re.sub('\t'.encode(), '  '.encode(), content)
+        fd = open(file, 'wb')
+        fd.write(content)
+        fd.close()
+        print(file)
+
+if __name__ == "__main__":
+    sys.exit(FormatFiles())
\ No newline at end of file
-- 
2.8.0.windows.1




Thread overview: 10+ messages
2018-05-21  4:51 Liming Gao [this message]
2018-05-21 14:50 ` [RFC] Formalize source files to follow DOS format Carsey, Jaben
2018-05-21 22:41   ` Kinney, Michael D
2018-05-21 22:43     ` Carsey, Jaben
2018-05-21 22:58       ` Kinney, Michael D
2018-05-24  8:35         ` Gao, Liming
2018-05-24 14:13           ` Carsey, Jaben
2018-05-25  2:24             ` Gao, Liming
2018-05-24  8:31   ` Gao, Liming
2018-05-24 14:13     ` Carsey, Jaben
