From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-1.mimecast.com (us-smtp-1.mimecast.com [205.139.110.120]) by mx.groups.io with SMTP id smtpd.web09.42128.1578936708598071010 for ; Mon, 13 Jan 2020 09:31:48 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Ga9VogyV; spf=pass (domain: redhat.com, ip: 205.139.110.120, mailfrom: lersek@redhat.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1578936707; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Bvyr7PcEIV+i6nSMpmQkQT+yegS90o4s5XWhPq0ljF8=; b=Ga9VogyVbfod3WnfkwQ8KJws4zN/Xyi3WA1/Wj+c6xcFwbvq91JW+XRXNgo0T0nrjopVJQ +dirWpuar9lq138f1yCISv7L9usFYXpCeATtxScWCNzjsIpUmVj3Vs38UiozIAS7sgpv0a 7kG2ZBA6VDMeDEYGOFIRj312nX561WU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-18-qSBm6_JDMEiWwMavCHOi1Q-1; Mon, 13 Jan 2020 12:31:44 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6038118B9FC2; Mon, 13 Jan 2020 17:31:43 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-116-224.ams2.redhat.com [10.36.116.224]) by smtp.corp.redhat.com (Postfix) with ESMTP id D69BD19C5B; Mon, 13 Jan 2020 17:31:41 +0000 (UTC) Subject: Re: [Patch] BaseTools/Scripts/PatchCheck: Address false error conditions To: Michael D Kinney , devel@edk2.groups.io Cc: Bob Feng , Liming Gao , Jordan Justen References: <20200110225358.8204-1-michael.d.kinney@intel.com> From: "Laszlo Ersek" Message-ID: <2f7d61eb-7acf-6f6b-fc89-c784a45dd2a6@redhat.com> Date: Mon, 13 Jan 2020 18:31:40 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <20200110225358.8204-1-michael.d.kinney@intel.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-MC-Unique: qSBm6_JDMEiWwMavCHOi1Q-1 X-Mimecast-Spam-Score: 0 Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hello Mike, thanks a lot for this patch! Comments below: On 01/10/20 23:53, Michael D Kinney wrote: > https://bugzilla.tianocore.org/show_bug.cgi?id=3D2406 > > * Always print subject line after the git commit id to make > it easier to know the context of warnings or errors. > * Allow UTF-8 characters in subject line > * Error if subject line length > 75 without CVE-xxx-xxxxx present > * Error if subject line length > 92 with CVE-xxxx-xxxxx present > * If body line length is > 75, then print warning instead of error. > > Cc: Bob Feng > Cc: Liming Gao > Cc: Laszlo Ersek > Cc: Jordan Justen > Signed-off-by: Michael D Kinney > --- > BaseTools/Scripts/PatchCheck.py | 57 +++++++++++++++++++++++++++++---- > 1 file changed, 51 insertions(+), 6 deletions(-) > > diff --git a/BaseTools/Scripts/PatchCheck.py b/BaseTools/Scripts/PatchChe= ck.py > index 9668025798..ff4572bb7c 100755 > --- a/BaseTools/Scripts/PatchCheck.py > +++ b/BaseTools/Scripts/PatchCheck.py > @@ -1,7 +1,7 @@ > ## @file > # Check a patch for various format issues > # > -# Copyright (c) 2015 - 2019, Intel Corporation. All rights reserved. > +# Copyright (c) 2015 - 2020, Intel Corporation. All rights reserved. > # > # SPDX-License-Identifier: BSD-2-Clause-Patent > # > @@ -35,6 +35,8 @@ class CommitMessageCheck: > self.subject =3D subject > self.msg =3D message > > + print (subject) > + > self.check_contributed_under() > self.check_signed_off_by() > self.check_misc_signatures() > @@ -179,6 +181,8 @@ class CommitMessageCheck: > for sig in self.sig_types: > self.find_signatures(sig) > > + cve_re =3D re.compile('CVE-[0-9]{4}-[0-9]{5}[^0-9]') > + (1) I suggest we permit CVE IDs with 4 arbitrary digits too. That is: CVE-[0-9]{4}-[0-9]{4,5}[^0-9] ^^^^^ Because of: https://cve.mitre.org/cve/identifiers/syntaxchange.html#new For example, "CVE-2020-0001" is an existent CVE: https://cve.mitre.org/cgi-bin/cvename.cgi?name=3DCVE-2020-0001 and edk2 could get a CVE assigned for which only 4 "arbitrary digits" were necessary. (In that case, the CNA would not pad that sequence to 5 digits.) Now, I wouldn't like to overcomplicate this, therefore I'm not requesting that in such cases, we also lower the inclusive limit from 92 to 91 characters, in the subject line. (I think that this suggested update to the script/patch does not require a reposting. Another comment below.) > def check_overall_format(self): > lines =3D self.msg.splitlines() > > @@ -196,9 +200,26 @@ class CommitMessageCheck: > self.error('Empty commit message!') > return > > - if count >=3D 1 and len(lines[0].rstrip()) >=3D 72: > - self.error('First line of commit message (subject line) ' + > - 'is too long.') > + if count >=3D 1 and re.search(self.cve_re, lines[0]): > + # > + # If CVE-xxxx-xxxxx is present in subject line, then limit l= ength of > + # subject line to 92 characters > + # > + if len(lines[0].rstrip()) >=3D 93: > + self.error( > + 'First line of commit message (subject line) is too = long (%d >=3D 93).' % > + (len(lines[0].rstrip())) > + ) > + else: > + # > + # If CVE-xxxx-xxxxx is not present in subject line, then lim= it > + # length of subject line to 75 characters > + # > + if len(lines[0].rstrip()) >=3D 76: > + self.error( > + 'First line of commit message (subject line) is too = long (%d >=3D 76).' % > + (len(lines[0].rstrip())) > + ) > > if count >=3D 1 and len(lines[0].strip()) =3D=3D 0: > self.error('First line of commit message (subject line) ' + > @@ -212,7 +233,14 @@ class CommitMessageCheck: > if (len(lines[i]) >=3D 76 and > len(lines[i].split()) > 1 and > not lines[i].startswith('git-svn-id:')): > - self.error('Line %d of commit message is too long.' % (i= + 1)) > + # > + # Print a warning if body line is longer than 75 charact= ers > + # > + print( > + 'WARNING - Line %d of commit message is too long (%d= >=3D 76).' % > + (i + 1, len(lines[i])) > + ) > + print(lines[i]) > > last_sig_line =3D None > for i in range(count - 1, 0, -1): > @@ -503,8 +531,25 @@ class CheckOnePatch: > else: > self.stat =3D mo.group('stat') > self.commit_msg =3D mo.group('commit_message') > + # > + # Parse subject line from email header. The subject line may be > + # composed of multiple parts with different encodings. Decode a= nd > + # combine all the parts to produce a single string with the cont= ents of > + # the decoded subject line. > + # > + parts =3D email.header.decode_header(pmail.get('subject')) > + subject =3D '' > + for (part, encoding) in parts: > + if encoding: > + part =3D part.decode(encoding) > + else: > + try: > + part =3D part.decode() > + except: > + pass > + subject =3D subject + part > > - self.commit_subject =3D pmail['subject'].replace('\r\n', '') > + self.commit_subject =3D subject.replace('\r\n', '') > self.commit_subject =3D self.commit_subject.replace('\n', '') > self.commit_subject =3D self.subject_prefix_re.sub('', self.comm= it_subject, 1) > > So I've applied this patch locally, such that the most recent prior patch for "BaseTools/Scripts/PatchCheck.py" has been commit 2649a735b249 ("BaseTools/PatchCheck.py: Ignore CR and LF characters in subject length", 2020-01-09). And I've attempted to test it against two patch sets I have pending on the list: [edk2-devel] [PATCH 0/2] UefiBootManagerLib, HttpDxe: tweaks for large HTTP(S) downloads [edk2-devel] [PATCH] UefiCpuPkg/PiSmmCpuDxeSmm: fix 2M->4K page splitting regression for PDEs (2) The update seems to break python2 compatibility: > Traceback (most recent call last): > File "BaseTools/Scripts/PatchCheck.py", line 688, in > sys.exit(PatchCheckApp().retval) > File "BaseTools/Scripts/PatchCheck.py", line 648, in __init__ > self.process_one_arg(patch) > File "BaseTools/Scripts/PatchCheck.py", line 665, in process_one_arg > self.ok &=3D CheckOneArg(arg, self.count).ok > File "BaseTools/Scripts/PatchCheck.py", line 632, in __init__ > checker =3D CheckGitCommits(param, max_count) > File "BaseTools/Scripts/PatchCheck.py", line 576, in __init__ > self.ok &=3D CheckOnePatch(commit, patch).ok > File "BaseTools/Scripts/PatchCheck.py", line 451, in __init__ > self.find_patch_pieces() > File "BaseTools/Scripts/PatchCheck.py", line 540, in find_patch_pieces > parts =3D email.header.decode_header(pmail.get('subject')) > AttributeError: 'module' object has no attribute 'header' However, the patch works very well with python3. I've also checked a local one-off commit with a bunch of UTF-8 sequences in the subject line. (In Hungarian we have the nonsensical phrase "=C3=A1rv=C3=ADzt=C5=B1r=C5=91 t=C3=BCk=C3=B6rf=C3=BAr=C3=B3g=C3=A9p =C3=81= RV=C3=8DZT=C5=B0R=C5=90 T=C3=9CK=C3=96RF=C3=9AR=C3=93G=C3=89P" for easily t= esting all the Hungarian diacritical marks, which are part of latin2. Verbatim, it means "flood tolerant mirror drill" :) ) In order to unblock some pending patches wrt. the edk2 CI system, I'd suggest pushing this patch at once. For that: Tested-by: Laszlo Ersek Then, the regex improvement for (1) could be a separate, future patch. (Normally I'd propose implementing that small change right before pushing, but then my Tested-by would not be valid any longer. "Reviewed-by" can survive small changes, but "Tested-by" can't.) Similarly, python2 support (2) (*if* we care -- not saying we should necessarily care) could be added as a subsequent patch. In summary I'd be thankful if the patch could be pushed as-is, with my Tested-by (and with an R-b from a BaseTools maintainer, of course). Thank you! Laszlo