From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mx.groups.io; dkim=pass header.i=@linaro.org header.s=google header.b=mHJ5x58g; spf=pass (domain: linaro.org, ip: 209.85.128.67, mailfrom: leif.lindholm@linaro.org) Received: from mail-wm1-f67.google.com (mail-wm1-f67.google.com [209.85.128.67]) by groups.io with SMTP; Wed, 12 Jun 2019 12:47:45 -0700 Received: by mail-wm1-f67.google.com with SMTP id g135so7771964wme.4 for ; Wed, 12 Jun 2019 12:47:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=voqUP5ZN33bfvINbhr6fyy+p56qlQpVgXTrfBwVI3oY=; b=mHJ5x58gtzs8veC30whl4KJxxk8dErj/nRlELveJ6FZSCPnHZ4eT5TVI0E3vcPCK+n QeRPesmbidIF0L9EYEYxLVJFgj14SCUzygCQfMrZYOL8RdWSjOcdo5yA0p8W28IhZfiL ExlURUFc5O4pmhrE+VDMjkUW2EkEQChC+SiR2Qz8I1TvVqlBwKQXxm5UTT71v/OfH1ZX Xdy468XnDB7HUrngd1MSHSWRsFA/gec4G3hNRYPdQ0x1FyKoWYiPmq/0vHg2EO//D2Vh zTikLnDyUIJYSRtpYD8vEKi2BVg2KChmhCXY/vOoNJloy8s5MoLWaWQ3lK1ea1PwrwzS Wmwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=voqUP5ZN33bfvINbhr6fyy+p56qlQpVgXTrfBwVI3oY=; b=rWRyHUNAtheX/FMpXZDYORmURsU8S6uTRUMXviDicohwD/RoAP8ucsBQJ19iVwchnH 7j6KnFUgnN8EfPi8nFD4aIDPltLbHAP3y5WMHLxW8b514qHH4wn3iqpaB+Fv1NwWFV3L GE0Ky7TvFoWZKVWeyASiyiUpOEK4008bw2/l2Z/IpA1BdX8ge+W39lOsMVrHr+FEJUgu BOXVxYgUGM+NhPoQonolNVWvl5d+hU2rb/LYmjuwBDlZ43V/+I+tyhFkf5Wsi8XWmCiy RF36opt4lQ5UJydDgxPVz7lEEuNMVjUEIyv67ndFPYAdlO2luPTTvm6Y4sFRKk3U2qN4 Bpzw== X-Gm-Message-State: APjAAAXwMZBDWg4joxAMtk4X0Dtn7MEs/AadPn4XTxYOIueFbg8+xsDR sBeoubnHxArB+i9JYUlMyHbgMw== X-Google-Smtp-Source: APXvYqwl0NJiB74et+bZsemt2KVckMAkP95/OnjuzLc+8tpEq4jmiJ/k3YDFWHOkIDIdaJXiJB/7/g== X-Received: by 2002:a1c:ab42:: with SMTP id u63mr605662wme.130.1560368863144; Wed, 12 Jun 2019 12:47:43 -0700 (PDT) Return-Path: Received: from bivouac.eciton.net (bivouac.eciton.net. [2a00:1098:0:86:1000:23:0:2]) by smtp.gmail.com with ESMTPSA id l17sm1098264wrq.37.2019.06.12.12.47.42 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 12 Jun 2019 12:47:42 -0700 (PDT) Date: Wed, 12 Jun 2019 20:47:41 +0100 From: "Leif Lindholm" To: Laszlo Ersek Cc: devel@edk2.groups.io, "Feng, Bob C" , Liming Gao , Andrew Fish , Michael D Kinney Subject: Re: [BIKESHEDDING-TARGET] BaseTools: add GetMaintainer.py helper script Message-ID: <20190612194741.btevnozrb7riqu6p@bivouac.eciton.net> References: <20190612111951.2429-1-leif.lindholm@linaro.org> <673ee087-2f93-7667-8d05-cda310610737@redhat.com> MIME-Version: 1.0 In-Reply-To: <673ee087-2f93-7667-8d05-cda310610737@redhat.com> User-Agent: NeoMutt/20170113 (1.7.2) Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Wed, Jun 12, 2019 at 03:27:15PM +0200, Laszlo Ersek wrote: > On 06/12/19 13:19, Leif Lindholm wrote: > > This is work-in-progress, but I figured it would be a good time to start > > having a conversation about how we want it to work. > > > > The script currently does nothing clever whatsoever with regards to > > looking at historical contributors to the modified code - it only > > performs a lookup in Maintainers.txt. > > I think that's good enough. > > > Known shortcomings: > > - Does not provide any metainformation about why certain people > > or mailing lists were picked. > > - Does not reason about precedence of sections - all maintainers > > for all sections that match a wildcard are returned. > > - Does not substantially update Maintainers.txt. > > - Does not implement the X: tag, to explicitly exclude subpaths. > > (But scans for it.) > > All of the above should be fine, for a start. > > > - The splitting of Maintainers.txt into sections is based on > > directly adjacent lines starting with a valid tag (so I expect > > Star is left out from MdeModulePkg hits, and OvmfPkg loses most > > of its Reviewers. (My feeling is this is valid, and Maintainers.txt > > should change - certainly all the problematic lines would become > > redundant with this format change.) > > I agree -- but then, Maintainers.txt should be updated in the same > *series* (not patch) just enough to preserve all those R: roles. Yes. I just didn't want to juggle a series before I even get to RFC. > > Supported wildcards are '*' and '?' - there is no special treatment of > > trailing '/' at this point. > > That should be fine. QEMU supports: > > F: Files and directories with wildcard patterns. > A trailing slash includes all files and subdirectory files. > F: drivers/net/ all files in and below drivers/net > F: drivers/net/* all files in drivers/net, but not below > F: */net/* all files in "any top level directory"/net > One pattern per line. Multiple F: lines acceptable. > > and I don't think we *need* to distinguish the first two cases from > each other (I don't see much use for "but not below" ATM). > > However, that distinction might work out of the box, due to another > comment I'd like to make below. > > > > > Oh, and like 'SetupGit.py', it requires the gitpython module. > > > > Laszlo: if you could provide some english language description of how > > the path (F:) format is supposed to work, and how you think section > > precedence should happen, I can implement that. > > Section precedence would be a much later feature (if we cared at all). Right. After your explanation below, I'm starting to suspect that the reason I couldn't figure out where QEMU's get_maintainer.pl did that is that it doesn't. I had misunderstood the format. This simplifies the remaining work. > English language description for "F:" -- I can only quote QEMU's description. > > > The qemu MAINTAINERS file does not contain this information, and my > > perl knowledge is rustier than my python knowledge is incomplete. > > > > Signed-off-by: Leif Lindholm > > --- > > BaseTools/Scripts/GetMaintainer.py | 131 +++++++++++++++++++++++++++++++++++++ > > Maintainers.txt | 38 +++++++++++ > > 2 files changed, 169 insertions(+) > > create mode 100644 BaseTools/Scripts/GetMaintainer.py > > > > diff --git a/BaseTools/Scripts/GetMaintainer.py b/BaseTools/Scripts/GetMaintainer.py > > new file mode 100644 > > index 0000000000..939930b052 > > --- /dev/null > > +++ b/BaseTools/Scripts/GetMaintainer.py > > @@ -0,0 +1,131 @@ > > +## @file > > +# Retrieves the people to request review from on submission of a commit. > > +# > > +# Copyright (c) 2019, Linaro Ltd. All rights reserved.
> > +# > > +# SPDX-License-Identifier: BSD-2-Clause-Patent > > +# > > + > > +from __future__ import print_function > > +from collections import defaultdict > > +from collections import OrderedDict > > +import argparse > > +import os > > +import re > > +import SetupGit > > + > > +EXPRESSIONS = { > > + 'exclude': re.compile(r'^X:\s*(?P.*?)\r*$'), > > + 'file': re.compile(r'^F:\s*(?P.*?)\r*$'), > > + 'list': re.compile(r'^L:\s*(?P.*?)\r*$'), > > + 'maintainer': re.compile(r'^M:\s*(?P.*<.*?>)\r*$'), > > + 'reviewer': re.compile(r'^R:\s*(?P.*?)\r*$'), > > + 'status': re.compile(r'^S:\s*(?P.*?)\r*$'), > > + 'tree': re.compile(r'^T:\s*(?P.*?)\r*$'), > > + 'webpage': re.compile(r'^W:\s*(?P.*?)\r*$') > > +} > > + > > +def printsection(section): > > + """Prints out the dictionary describing a Maintainers.txt section.""" > > + print('===') > > + for key in section.keys(): > > + print("Key: %s" % key) > > + for item in section[key]: > > + print(' %s' % item) > > + > > +def pattern_to_regex(pattern): > > + """Takes a string containing regular UNIX path wildcards > > + and returns a string suitable for matching with regex.""" > > + pattern = pattern.replace('.', r'\.') > > + pattern = pattern.replace('*', r'.*') > > + pattern = pattern.replace('?', r'.') > > + > > + return pattern > > So, this is what I disagree with, at first sight anyway. For > example, this seems to cause a bare "*" to match all pathnames (with > multiple pathname components, that is) -- normally we don't expect > "*" to cross pathname separators. Aaah. This is the missing piece of the puzzle. > Consider the pattern you added under "Tianocore Stewards" -- the > intent is to CC them on patches that touch the project root > directory's contents. However, I think the above translation will > cause them to be CC'd on everything. Is that right? Correct - due to aforementioned misunderstanding. The Stewards section should have no F: tags at all, and we should have a separate one with F: * F: */ L: devel@edk2.groups.io > - I was about to suggest "fnmatch" instead: > > https://docs.python.org/3/library/fnmatch.html > > However, according to the docs, "fnmatch" doesn't consider "/" a special character. :( > > (I probably remembered "fnmatch" from POSIX, where FNM_PATHNAME can make fnmatch() distinguish "/": > > http://pubs.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html > > Too bad the same flag is not exposed in Python.) > > NB: "git" itself uses fnmatch() -- see "pathspec" in gitglossary(7). > > - "glob" is not helpful either, as it generates pathnames based on the pattern and the filesystem: > > https://docs.python.org/3/library/glob.html#module-glob > > - It seems that "pathlib" could *maybe* match "*" against just filenames (= pathname components): > > https://docs.python.org/3/library/pathlib.html Also, it's only part of the standard library from 3.4 onwards. But I'm actually fairly convinced I can untangle this now. I'll experiment a bit and resubmit something I'm happy with, now I'm actually working towards the correct goal. Thanks! / Leif > but it seems too heavy-weight either way. > > - so maybe we should stick with regular expressions, but exclude > path separators from the translations of "?" and "*" (use negative > character classes such as [^/] rather than just ".")? > > - should we anchor the expressions to the start of the pathnames > being matched? > > Unfortunately we seem to have uses for both cases. E.g., stewards > should be CC'd for files directly in the project root dir. That > suggests both excluding "/" from "*", and anchoring the pattern to > the pathname start. However, you and Ard would like to be CC'd on > anything AARCH64 and ARM, and */AARCH64/* would not work as intended > if "*" excluded "/", and the pattern was anchored to the left. > > The "rdiff-backup" utility solves this problem by introducing > another wildcard, "**", which matches pathname separators > explicitly. (And "*" doesn't match "/".) > > Under that approach, we'd spell the pattern for the stewards as just > "*", and the AARCH64 pattern as **/AARCH64/**. I think this could be > implemented in pattern_to_regex(), if you processed "**" first, and > "*" second. > > Thanks > Laszlo > > > > + > > +def get_section_maintainers(path, section): > > + """Returns a list with email addresses to any M: and R: entries > > + matching the provided path in the provided section.""" > > + maintainers = [] > > + > > + for pattern in section['file']: > > + regex = pattern_to_regex(pattern) > > + match = re.match(regex, path) > > + if match: > > +# print('path: "%s" pattern: "%s" regex: "%s"' % (path, pattern, regex)) > > + for address in section['maintainer'], section['reviewer']: > > + maintainers += address > > + > > + return maintainers > > + > > +def get_maintainers(path, sections): > > + """For 'path', iterates over all sections, returning maintainers > > + for matching ones.""" > > + maintainers = [] > > + for section in sections: > > + addresses = get_section_maintainers(path, section) > > + if addresses: > > + maintainers += addresses > > + > > + # Remove any duplicates > > + return list(OrderedDict.fromkeys(maintainers)) > > + > > +def parse_maintainers_line(line): > > + """Parse one line of Maintainers.txt, returning any match group and its key.""" > > + for key, expression in EXPRESSIONS.items(): > > + match = expression.match(line) > > + if match: > > + return key, match.group(key) > > + return None, None > > + > > +def parse_maintainers_file(filename): > > + """Parse the Maintainers.txt from top-level of repo and > > + return a list containing dictionaries of all sections.""" > > + with open(filename, 'r') as text: > > + line = text.readline() > > + sectionlist = [] > > + section = defaultdict(list) > > + while line: > > + key, value = parse_maintainers_line(line) > > + if key and value: > > + section[key].append(value) > > + > > + line = text.readline() > > + # If end of section (end of file, or non-tag line encountered)... > > + if not key or not value or not line: > > + # ...if non-empty, append area to list. > > + if section: > > + sectionlist.append(section.copy()) > > + section.clear() > > + > > + return sectionlist > > + > > +def get_modified_files(repo, args): > > + """Returns a list of the files modified by the commit specified in 'args'.""" > > + commit = repo.commit(args.commit) > > + return commit.stats.files > > + > > +if __name__ == '__main__': > > + PARSER = argparse.ArgumentParser( > > + description='Retrieves information on who to cc for review on a given commit') > > + PARSER.add_argument('commit', > > + action="store", > > + help='git revision to examine (default: HEAD)', > > + nargs='?', > > + default='HEAD') > > + ARGS = PARSER.parse_args() > > + > > + REPO = SetupGit.locate_repo() > > + > > + FILES = get_modified_files(REPO, ARGS) > > + CONFIG_FILE = os.path.join(REPO.working_dir, 'Maintainers.txt') > > + > > + SECTIONS = parse_maintainers_file(CONFIG_FILE) > > + > > + ADDRESSES = [] > > + > > + for file in FILES: > > + print(file) > > + ADDRESSES += get_maintainers(file, SECTIONS) > > + > > + for address in list(OrderedDict.fromkeys(ADDRESSES)): > > + print(' %s' % address) > > diff --git a/Maintainers.txt b/Maintainers.txt > > index 762a684659..f4c851536e 100644 > > --- a/Maintainers.txt > > +++ b/Maintainers.txt > > @@ -47,6 +47,7 @@ T: svn (read-only, deprecated) - https://svn.code.sf.net/p/edk2/code/trunk/edk2 > > > > Tianocore Stewards > > ------------------ > > +F: * > > M: Andrew Fish > > M: Laszlo Ersek > > M: Leif Lindholm > > @@ -61,19 +62,30 @@ EDK II Releases: > > W: https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Release-Planning > > M: Liming Gao > > > > +EDK II Architectures: > > +--------------------- > > +ARM, AARCH64 > > +F: */AARCH64/* > > +F: */ARM/* > > +M: Leif Lindholm > > +M: Ard Biesheuvel > > + > > EDK II Packages: > > ---------------- > > ArmPkg > > +F: ArmPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/ArmPkg > > M: Leif Lindholm > > M: Ard Biesheuvel > > > > ArmPlatformPkg > > +F: ArmPlatformPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/ArmPlatformPkg > > M: Leif Lindholm > > M: Ard Biesheuvel > > > > ArmVirtPkg > > +F: ArmVirtPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/ArmVirtPkg > > M: Laszlo Ersek > > M: Ard Biesheuvel > > @@ -81,26 +93,31 @@ R: Julien Grall > > (Xen modules) > > > > BaseTools > > +F: BaseTools/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/BaseTools > > M: Bob Feng > > M: Liming Gao > > > > CryptoPkg > > +F: CryptoPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/CryptoPkg > > M: Jian Wang > > R: Ting Ye > > > > DynamicTablesPkg > > +F: DynamicTablesPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/DynamicTablesPkg > > M: Sami Mujawar > > M: Alexei Fedorov > > > > EmbeddedPkg > > +F: EmbeddedPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/EmbeddedPkg > > M: Leif Lindholm > > M: Ard Biesheuvel > > > > EmulatorPkg > > +F: EmulatorPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/EmulatorPkg > > M: Jordan Justen > > M: Andrew Fish > > @@ -108,55 +125,65 @@ M: Ray Ni > > S: Maintained > > > > FatPkg > > +F: FatPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/Edk2-fat-driver > > M: Ray Ni > > T: svn - https://svn.code.sf.net/p/edk2-fatdriver2/code/trunk/EnhancedFat > > T: git - https://github.com/tianocore/edk2-FatPkg.git > > > > FmpDevicePkg > > +F: FmpDevicePkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/FmpDevicePkg > > M: Liming Gao > > M: Michael D Kinney > > > > IntelFrameworkModulePkg > > +F: IntelFrameworkModulePkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFrameworkModulePkg > > M: Liming Gao > > > > IntelFrameworkPkg > > +F: IntelFrameworkPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFrameworkPkg > > M: Michael D Kinney > > M: Liming Gao > > > > IntelFsp2Pkg > > +F: IntelFsp2Pkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFsp2Pkg > > M: Chasel Chiu > > R: Nate DeSimone > > R: Star Zeng > > > > IntelFsp2WrapperPkg > > +F: IntelFsp2WrapperPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFsp2WrapperPkg > > M: Chasel Chiu > > R: Nate DeSimone > > R: Star Zeng > > > > IntelFspPkg > > +F: IntelFspPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFspPkg > > M: Chasel Chiu > > R: Nate DeSimone > > R: Star Zeng > > > > IntelFspWrapperPkg > > +F: IntelFspWrapperPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFspWrapperPkg > > M: Chasel Chiu > > R: Nate DeSimone > > R: Star Zeng > > > > IntelSiliconPkg > > +F: IntelSiliconPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/IntelSiliconPkg > > M: Ray Ni > > M: Rangasai V Chaganty > > > > MdeModulePkg > > +F: MdeModulePkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/MdeModulePkg > > M: Jian J Wang > > M: Hao A Wu > > @@ -166,16 +193,19 @@ R: Ray Ni > > R: Star Zeng > > > > MdePkg > > +F: MdePkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/MdePkg > > M: Michael D Kinney > > M: Liming Gao > > > > NetworkPkg > > +F: NetworkPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/NetworkPkg > > M: Siyuan Fu > > M: Jiaxin Wu > > > > OvmfPkg > > +F: OvmfPkg/* > > W: http://www.tianocore.org/ovmf/ > > M: Jordan Justen > > M: Laszlo Ersek > > @@ -191,16 +221,19 @@ R: Stefan Berger > > S: Maintained > > > > PcAtChipsetPkg > > +F: PcAtChipsetPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/PcAtChipsetPkg > > M: Ray Ni > > > > SecurityPkg > > +F: SecurityPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/SecurityPkg > > M: Chao Zhang > > M: Jiewen Yao > > M: Jian Wang > > > > ShellPkg > > +F: ShellPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/ShellPkg > > M: Jaben Carsey > > M: Ray Ni > > @@ -214,21 +247,25 @@ M: Leif Lindholm (ARM/AArch64) > > M: Ard Biesheuvel (ARM/AArch64) > > > > SignedCapsulePkg > > +F: SignedCapsulePkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/SignedCapsulePkg > > M: Jiewen Yao > > M: Chao Zhang > > > > SourceLevelDebugPkg > > +F: SourceLevelDebugPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/SourceLevelDebugPkg > > M: Hao A Wu > > > > UefiCpuPkg > > +F: UefiCpuPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/UefiCpuPkg > > M: Eric Dong > > M: Ray Ni > > R: Laszlo Ersek > > > > UefiPayloadPkg > > +F: UefiPayloadPkg/* > > W: https://github.com/tianocore/tianocore.github.io/wiki/UefiPayloadPkg > > M: Maurice Ma > > M: Guo Dong > > @@ -236,6 +273,7 @@ M: Benjamin You > > S: Maintained > > > > StandaloneMmPkg > > +F: StandaloneMmPkg/* > > M: Achin Gupta > > M: Jiewen Yao > > R: Supreeth Venkatesh > > >