From: "Leif Lindholm" <leif.lindholm@linaro.org>
To: Laszlo Ersek <lersek@redhat.com>
Cc: devel@edk2.groups.io, "Feng, Bob C" <bob.c.feng@intel.com>,
Liming Gao <liming.gao@intel.com>, Andrew Fish <afish@apple.com>,
Michael D Kinney <michael.d.kinney@intel.com>
Subject: Re: [BIKESHEDDING-TARGET] BaseTools: add GetMaintainer.py helper script
Date: Wed, 12 Jun 2019 20:47:41 +0100 [thread overview]
Message-ID: <20190612194741.btevnozrb7riqu6p@bivouac.eciton.net> (raw)
In-Reply-To: <673ee087-2f93-7667-8d05-cda310610737@redhat.com>
On Wed, Jun 12, 2019 at 03:27:15PM +0200, Laszlo Ersek wrote:
> On 06/12/19 13:19, Leif Lindholm wrote:
> > This is work-in-progress, but I figured it would be a good time to start
> > having a conversation about how we want it to work.
> >
> > The script currently does nothing clever whatsoever with regards to
> > looking at historical contributors to the modified code - it only
> > performs a lookup in Maintainers.txt.
>
> I think that's good enough.
>
> > Known shortcomings:
> > - Does not provide any metainformation about why certain people
> > or mailing lists were picked.
> > - Does not reason about precedence of sections - all maintainers
> > for all sections that match a wildcard are returned.
> > - Does not substantially update Maintainers.txt.
> > - Does not implement the X: tag, to explicitly exclude subpaths.
> > (But scans for it.)
>
> All of the above should be fine, for a start.
>
> > - The splitting of Maintainers.txt into sections is based on
> > directly adjacent lines starting with a valid tag (so I expect
> > Star is left out from MdeModulePkg hits, and OvmfPkg loses most
> > of its Reviewers. (My feeling is this is valid, and Maintainers.txt
> > should change - certainly all the problematic lines would become
> > redundant with this format change.)
>
> I agree -- but then, Maintainers.txt should be updated in the same
> *series* (not patch) just enough to preserve all those R: roles.
Yes. I just didn't want to juggle a series before I even get to RFC.
> > Supported wildcards are '*' and '?' - there is no special treatment of
> > trailing '/' at this point.
>
> That should be fine. QEMU supports:
>
> F: Files and directories with wildcard patterns.
> A trailing slash includes all files and subdirectory files.
> F: drivers/net/ all files in and below drivers/net
> F: drivers/net/* all files in drivers/net, but not below
> F: */net/* all files in "any top level directory"/net
> One pattern per line. Multiple F: lines acceptable.
>
> and I don't think we *need* to distinguish the first two cases from
> each other (I don't see much use for "but not below" ATM).
>
> However, that distinction might work out of the box, due to another
> comment I'd like to make below.
>
> >
> > Oh, and like 'SetupGit.py', it requires the gitpython module.
> >
> > Laszlo: if you could provide some english language description of how
> > the path (F:) format is supposed to work, and how you think section
> > precedence should happen, I can implement that.
>
> Section precedence would be a much later feature (if we cared at all).
Right. After your explanation below, I'm starting to suspect that the
reason I couldn't figure out where QEMU's get_maintainer.pl did that
is that it doesn't. I had misunderstood the format. This simplifies
the remaining work.
> English language description for "F:" -- I can only quote QEMU's description.
>
> > The qemu MAINTAINERS file does not contain this information, and my
> > perl knowledge is rustier than my python knowledge is incomplete.
> >
> > Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
> > ---
> > BaseTools/Scripts/GetMaintainer.py | 131 +++++++++++++++++++++++++++++++++++++
> > Maintainers.txt | 38 +++++++++++
> > 2 files changed, 169 insertions(+)
> > create mode 100644 BaseTools/Scripts/GetMaintainer.py
> >
> > diff --git a/BaseTools/Scripts/GetMaintainer.py b/BaseTools/Scripts/GetMaintainer.py
> > new file mode 100644
> > index 0000000000..939930b052
> > --- /dev/null
> > +++ b/BaseTools/Scripts/GetMaintainer.py
> > @@ -0,0 +1,131 @@
> > +## @file
> > +# Retrieves the people to request review from on submission of a commit.
> > +#
> > +# Copyright (c) 2019, Linaro Ltd. All rights reserved.<BR>
> > +#
> > +# SPDX-License-Identifier: BSD-2-Clause-Patent
> > +#
> > +
> > +from __future__ import print_function
> > +from collections import defaultdict
> > +from collections import OrderedDict
> > +import argparse
> > +import os
> > +import re
> > +import SetupGit
> > +
> > +EXPRESSIONS = {
> > + 'exclude': re.compile(r'^X:\s*(?P<exclude>.*?)\r*$'),
> > + 'file': re.compile(r'^F:\s*(?P<file>.*?)\r*$'),
> > + 'list': re.compile(r'^L:\s*(?P<list>.*?)\r*$'),
> > + 'maintainer': re.compile(r'^M:\s*(?P<maintainer>.*<.*?>)\r*$'),
> > + 'reviewer': re.compile(r'^R:\s*(?P<reviewer>.*?)\r*$'),
> > + 'status': re.compile(r'^S:\s*(?P<status>.*?)\r*$'),
> > + 'tree': re.compile(r'^T:\s*(?P<tree>.*?)\r*$'),
> > + 'webpage': re.compile(r'^W:\s*(?P<webpage>.*?)\r*$')
> > +}
> > +
> > +def printsection(section):
> > + """Prints out the dictionary describing a Maintainers.txt section."""
> > + print('===')
> > + for key in section.keys():
> > + print("Key: %s" % key)
> > + for item in section[key]:
> > + print(' %s' % item)
> > +
> > +def pattern_to_regex(pattern):
> > + """Takes a string containing regular UNIX path wildcards
> > + and returns a string suitable for matching with regex."""
> > + pattern = pattern.replace('.', r'\.')
> > + pattern = pattern.replace('*', r'.*')
> > + pattern = pattern.replace('?', r'.')
> > +
> > + return pattern
>
> So, this is what I disagree with, at first sight anyway. For
> example, this seems to cause a bare "*" to match all pathnames (with
> multiple pathname components, that is) -- normally we don't expect
> "*" to cross pathname separators.
Aaah. This is the missing piece of the puzzle.
> Consider the pattern you added under "Tianocore Stewards" -- the
> intent is to CC them on patches that touch the project root
> directory's contents. However, I think the above translation will
> cause them to be CC'd on everything. Is that right?
Correct - due to aforementioned misunderstanding.
The Stewards section should have no F: tags at all, and we should have
a separate one with
F: *
F: */
L: devel@edk2.groups.io
> - I was about to suggest "fnmatch" instead:
>
> https://docs.python.org/3/library/fnmatch.html
>
> However, according to the docs, "fnmatch" doesn't consider "/" a special character. :(
>
> (I probably remembered "fnmatch" from POSIX, where FNM_PATHNAME can make fnmatch() distinguish "/":
>
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html
>
> Too bad the same flag is not exposed in Python.)
>
> NB: "git" itself uses fnmatch() -- see "pathspec" in gitglossary(7).
>
> - "glob" is not helpful either, as it generates pathnames based on the pattern and the filesystem:
>
> https://docs.python.org/3/library/glob.html#module-glob
>
> - It seems that "pathlib" could *maybe* match "*" against just filenames (= pathname components):
>
> https://docs.python.org/3/library/pathlib.html
Also, it's only part of the standard library from 3.4 onwards.
But I'm actually fairly convinced I can untangle this now.
I'll experiment a bit and resubmit something I'm happy with, now I'm
actually working towards the correct goal.
Thanks!
/
Leif
> but it seems too heavy-weight either way.
>
> - so maybe we should stick with regular expressions, but exclude
> path separators from the translations of "?" and "*" (use negative
> character classes such as [^/] rather than just ".")?
>
> - should we anchor the expressions to the start of the pathnames
> being matched?
>
> Unfortunately we seem to have uses for both cases. E.g., stewards
> should be CC'd for files directly in the project root dir. That
> suggests both excluding "/" from "*", and anchoring the pattern to
> the pathname start. However, you and Ard would like to be CC'd on
> anything AARCH64 and ARM, and */AARCH64/* would not work as intended
> if "*" excluded "/", and the pattern was anchored to the left.
>
> The "rdiff-backup" utility solves this problem by introducing
> another wildcard, "**", which matches pathname separators
> explicitly. (And "*" doesn't match "/".)
>
> Under that approach, we'd spell the pattern for the stewards as just
> "*", and the AARCH64 pattern as **/AARCH64/**. I think this could be
> implemented in pattern_to_regex(), if you processed "**" first, and
> "*" second.
>
> Thanks
> Laszlo
>
>
> > +
> > +def get_section_maintainers(path, section):
> > + """Returns a list with email addresses to any M: and R: entries
> > + matching the provided path in the provided section."""
> > + maintainers = []
> > +
> > + for pattern in section['file']:
> > + regex = pattern_to_regex(pattern)
> > + match = re.match(regex, path)
> > + if match:
> > +# print('path: "%s" pattern: "%s" regex: "%s"' % (path, pattern, regex))
> > + for address in section['maintainer'], section['reviewer']:
> > + maintainers += address
> > +
> > + return maintainers
> > +
> > +def get_maintainers(path, sections):
> > + """For 'path', iterates over all sections, returning maintainers
> > + for matching ones."""
> > + maintainers = []
> > + for section in sections:
> > + addresses = get_section_maintainers(path, section)
> > + if addresses:
> > + maintainers += addresses
> > +
> > + # Remove any duplicates
> > + return list(OrderedDict.fromkeys(maintainers))
> > +
> > +def parse_maintainers_line(line):
> > + """Parse one line of Maintainers.txt, returning any match group and its key."""
> > + for key, expression in EXPRESSIONS.items():
> > + match = expression.match(line)
> > + if match:
> > + return key, match.group(key)
> > + return None, None
> > +
> > +def parse_maintainers_file(filename):
> > + """Parse the Maintainers.txt from top-level of repo and
> > + return a list containing dictionaries of all sections."""
> > + with open(filename, 'r') as text:
> > + line = text.readline()
> > + sectionlist = []
> > + section = defaultdict(list)
> > + while line:
> > + key, value = parse_maintainers_line(line)
> > + if key and value:
> > + section[key].append(value)
> > +
> > + line = text.readline()
> > + # If end of section (end of file, or non-tag line encountered)...
> > + if not key or not value or not line:
> > + # ...if non-empty, append area to list.
> > + if section:
> > + sectionlist.append(section.copy())
> > + section.clear()
> > +
> > + return sectionlist
> > +
> > +def get_modified_files(repo, args):
> > + """Returns a list of the files modified by the commit specified in 'args'."""
> > + commit = repo.commit(args.commit)
> > + return commit.stats.files
> > +
> > +if __name__ == '__main__':
> > + PARSER = argparse.ArgumentParser(
> > + description='Retrieves information on who to cc for review on a given commit')
> > + PARSER.add_argument('commit',
> > + action="store",
> > + help='git revision to examine (default: HEAD)',
> > + nargs='?',
> > + default='HEAD')
> > + ARGS = PARSER.parse_args()
> > +
> > + REPO = SetupGit.locate_repo()
> > +
> > + FILES = get_modified_files(REPO, ARGS)
> > + CONFIG_FILE = os.path.join(REPO.working_dir, 'Maintainers.txt')
> > +
> > + SECTIONS = parse_maintainers_file(CONFIG_FILE)
> > +
> > + ADDRESSES = []
> > +
> > + for file in FILES:
> > + print(file)
> > + ADDRESSES += get_maintainers(file, SECTIONS)
> > +
> > + for address in list(OrderedDict.fromkeys(ADDRESSES)):
> > + print(' %s' % address)
> > diff --git a/Maintainers.txt b/Maintainers.txt
> > index 762a684659..f4c851536e 100644
> > --- a/Maintainers.txt
> > +++ b/Maintainers.txt
> > @@ -47,6 +47,7 @@ T: svn (read-only, deprecated) - https://svn.code.sf.net/p/edk2/code/trunk/edk2
> >
> > Tianocore Stewards
> > ------------------
> > +F: *
> > M: Andrew Fish <afish@apple.com>
> > M: Laszlo Ersek <lersek@redhat.com>
> > M: Leif Lindholm <leif.lindholm@linaro.org>
> > @@ -61,19 +62,30 @@ EDK II Releases:
> > W: https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Release-Planning
> > M: Liming Gao <liming.gao@intel.com>
> >
> > +EDK II Architectures:
> > +---------------------
> > +ARM, AARCH64
> > +F: */AARCH64/*
> > +F: */ARM/*
> > +M: Leif Lindholm <leif.lindholm@linaro.org>
> > +M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > +
> > EDK II Packages:
> > ----------------
> > ArmPkg
> > +F: ArmPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/ArmPkg
> > M: Leif Lindholm <leif.lindholm@linaro.org>
> > M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >
> > ArmPlatformPkg
> > +F: ArmPlatformPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/ArmPlatformPkg
> > M: Leif Lindholm <leif.lindholm@linaro.org>
> > M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >
> > ArmVirtPkg
> > +F: ArmVirtPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/ArmVirtPkg
> > M: Laszlo Ersek <lersek@redhat.com>
> > M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > @@ -81,26 +93,31 @@ R: Julien Grall <julien.grall@arm.com>
> > (Xen modules)
> >
> > BaseTools
> > +F: BaseTools/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/BaseTools
> > M: Bob Feng <bob.c.feng@intel.com>
> > M: Liming Gao <liming.gao@intel.com>
> >
> > CryptoPkg
> > +F: CryptoPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/CryptoPkg
> > M: Jian Wang <jian.j.wang@intel.com>
> > R: Ting Ye <ting.ye@intel.com>
> >
> > DynamicTablesPkg
> > +F: DynamicTablesPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/DynamicTablesPkg
> > M: Sami Mujawar <Sami.Mujawar@arm.com>
> > M: Alexei Fedorov <Alexei.Fedorov@arm.com>
> >
> > EmbeddedPkg
> > +F: EmbeddedPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/EmbeddedPkg
> > M: Leif Lindholm <leif.lindholm@linaro.org>
> > M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >
> > EmulatorPkg
> > +F: EmulatorPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/EmulatorPkg
> > M: Jordan Justen <jordan.l.justen@intel.com>
> > M: Andrew Fish <afish@apple.com>
> > @@ -108,55 +125,65 @@ M: Ray Ni <ray.ni@intel.com>
> > S: Maintained
> >
> > FatPkg
> > +F: FatPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/Edk2-fat-driver
> > M: Ray Ni <ray.ni@intel.com>
> > T: svn - https://svn.code.sf.net/p/edk2-fatdriver2/code/trunk/EnhancedFat
> > T: git - https://github.com/tianocore/edk2-FatPkg.git
> >
> > FmpDevicePkg
> > +F: FmpDevicePkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/FmpDevicePkg
> > M: Liming Gao <liming.gao@intel.com>
> > M: Michael D Kinney <michael.d.kinney@intel.com>
> >
> > IntelFrameworkModulePkg
> > +F: IntelFrameworkModulePkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFrameworkModulePkg
> > M: Liming Gao <liming.gao@intel.com>
> >
> > IntelFrameworkPkg
> > +F: IntelFrameworkPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFrameworkPkg
> > M: Michael D Kinney <michael.d.kinney@intel.com>
> > M: Liming Gao <liming.gao@intel.com>
> >
> > IntelFsp2Pkg
> > +F: IntelFsp2Pkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFsp2Pkg
> > M: Chasel Chiu <chasel.chiu@intel.com>
> > R: Nate DeSimone <nathaniel.l.desimone@intel.com>
> > R: Star Zeng <star.zeng@intel.com>
> >
> > IntelFsp2WrapperPkg
> > +F: IntelFsp2WrapperPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFsp2WrapperPkg
> > M: Chasel Chiu <chasel.chiu@intel.com>
> > R: Nate DeSimone <nathaniel.l.desimone@intel.com>
> > R: Star Zeng <star.zeng@intel.com>
> >
> > IntelFspPkg
> > +F: IntelFspPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFspPkg
> > M: Chasel Chiu <chasel.chiu@intel.com>
> > R: Nate DeSimone <nathaniel.l.desimone@intel.com>
> > R: Star Zeng <star.zeng@intel.com>
> >
> > IntelFspWrapperPkg
> > +F: IntelFspWrapperPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFspWrapperPkg
> > M: Chasel Chiu <chasel.chiu@intel.com>
> > R: Nate DeSimone <nathaniel.l.desimone@intel.com>
> > R: Star Zeng <star.zeng@intel.com>
> >
> > IntelSiliconPkg
> > +F: IntelSiliconPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelSiliconPkg
> > M: Ray Ni <ray.ni@intel.com>
> > M: Rangasai V Chaganty <rangasai.v.chaganty@intel.com>
> >
> > MdeModulePkg
> > +F: MdeModulePkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/MdeModulePkg
> > M: Jian J Wang <jian.j.wang@intel.com>
> > M: Hao A Wu <hao.a.wu@intel.com>
> > @@ -166,16 +193,19 @@ R: Ray Ni <ray.ni@intel.com>
> > R: Star Zeng <star.zeng@intel.com>
> >
> > MdePkg
> > +F: MdePkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/MdePkg
> > M: Michael D Kinney <michael.d.kinney@intel.com>
> > M: Liming Gao <liming.gao@intel.com>
> >
> > NetworkPkg
> > +F: NetworkPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/NetworkPkg
> > M: Siyuan Fu <siyuan.fu@intel.com>
> > M: Jiaxin Wu <jiaxin.wu@intel.com>
> >
> > OvmfPkg
> > +F: OvmfPkg/*
> > W: http://www.tianocore.org/ovmf/
> > M: Jordan Justen <jordan.l.justen@intel.com>
> > M: Laszlo Ersek <lersek@redhat.com>
> > @@ -191,16 +221,19 @@ R: Stefan Berger <stefanb@linux.ibm.com>
> > S: Maintained
> >
> > PcAtChipsetPkg
> > +F: PcAtChipsetPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/PcAtChipsetPkg
> > M: Ray Ni <ray.ni@intel.com>
> >
> > SecurityPkg
> > +F: SecurityPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/SecurityPkg
> > M: Chao Zhang <chao.b.zhang@intel.com>
> > M: Jiewen Yao <jiewen.yao@intel.com>
> > M: Jian Wang <jian.j.wang@intel.com>
> >
> > ShellPkg
> > +F: ShellPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/ShellPkg
> > M: Jaben Carsey <jaben.carsey@intel.com>
> > M: Ray Ni <ray.ni@intel.com>
> > @@ -214,21 +247,25 @@ M: Leif Lindholm <leif.lindholm@linaro.org> (ARM/AArch64)
> > M: Ard Biesheuvel <ard.biesheuvel@linaro.org> (ARM/AArch64)
> >
> > SignedCapsulePkg
> > +F: SignedCapsulePkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/SignedCapsulePkg
> > M: Jiewen Yao <jiewen.yao@intel.com>
> > M: Chao Zhang <chao.b.zhang@intel.com>
> >
> > SourceLevelDebugPkg
> > +F: SourceLevelDebugPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/SourceLevelDebugPkg
> > M: Hao A Wu <hao.a.wu@intel.com>
> >
> > UefiCpuPkg
> > +F: UefiCpuPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/UefiCpuPkg
> > M: Eric Dong <eric.dong@intel.com>
> > M: Ray Ni <ray.ni@intel.com>
> > R: Laszlo Ersek <lersek@redhat.com>
> >
> > UefiPayloadPkg
> > +F: UefiPayloadPkg/*
> > W: https://github.com/tianocore/tianocore.github.io/wiki/UefiPayloadPkg
> > M: Maurice Ma <maurice.ma@intel.com>
> > M: Guo Dong <guo.dong@intel.com>
> > @@ -236,6 +273,7 @@ M: Benjamin You <benjamin.you@intel.com>
> > S: Maintained
> >
> > StandaloneMmPkg
> > +F: StandaloneMmPkg/*
> > M: Achin Gupta <achin.gupta@arm.com>
> > M: Jiewen Yao <jiewen.yao@intel.com>
> > R: Supreeth Venkatesh <supreeth.venkatesh@arm.com>
> >
>
prev parent reply other threads:[~2019-06-12 19:47 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-12 11:19 [BIKESHEDDING-TARGET] BaseTools: add GetMaintainer.py helper script Leif Lindholm
2019-06-12 13:27 ` Laszlo Ersek
2019-06-12 19:47 ` Leif Lindholm [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190612194741.btevnozrb7riqu6p@bivouac.eciton.net \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox