public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Laszlo Ersek" <lersek@redhat.com>
To: Leif Lindholm <leif.lindholm@linaro.org>, devel@edk2.groups.io
Cc: "Feng, Bob C" <bob.c.feng@intel.com>,
	Liming Gao <liming.gao@intel.com>, Andrew Fish <afish@apple.com>,
	Michael D Kinney <michael.d.kinney@intel.com>
Subject: Re: [BIKESHEDDING-TARGET] BaseTools: add GetMaintainer.py helper script
Date: Wed, 12 Jun 2019 15:27:15 +0200	[thread overview]
Message-ID: <673ee087-2f93-7667-8d05-cda310610737@redhat.com> (raw)
In-Reply-To: <20190612111951.2429-1-leif.lindholm@linaro.org>

On 06/12/19 13:19, Leif Lindholm wrote:
> This is work-in-progress, but I figured it would be a good time to start
> having a conversation about how we want it to work.
> 
> The script currently does nothing clever whatsoever with regards to
> looking at historical contributors to the modified code - it only
> performs a lookup in Maintainers.txt.

I think that's good enough.

> Known shortcomings:
> - Does not provide any metainformation about why certain people
>   or mailing lists were picked.
> - Does not reason about precedence of sections - all maintainers
>   for all sections that match a wildcard are returned.
> - Does not substantially update Maintainers.txt.
> - Does not implement the X: tag, to explicitly exclude subpaths.
>   (But scans for it.)

All of the above should be fine, for a start.

> - The splitting of Maintainers.txt into sections is based on
>   directly adjacent lines starting with a valid tag (so I expect
>   Star is left out from MdeModulePkg hits, and OvmfPkg loses most
>   of its Reviewers. (My feeling is this is valid, and Maintainers.txt
>   should change - certainly all the problematic lines would become
>   redundant with this format change.)

I agree -- but then, Maintainers.txt should be updated in the same *series* (not patch) just enough to preserve all those R: roles.

> Supported wildcards are '*' and '?' - there is no special treatment of
> trailing '/' at this point.

That should be fine. QEMU supports:

        F: Files and directories with wildcard patterns.
           A trailing slash includes all files and subdirectory files.
           F:   drivers/net/    all files in and below drivers/net
           F:   drivers/net/*   all files in drivers/net, but not below
           F:   */net/*         all files in "any top level directory"/net
           One pattern per line.  Multiple F: lines acceptable.

and I don't think we *need* to distinguish the first two cases from each other (I don't see much use for "but not below" ATM).

However, that distinction might work out of the box, due to another comment I'd like to make below.

> 
> Oh, and like 'SetupGit.py', it requires the gitpython module.
> 
> Laszlo: if you could provide some english language description of how
> the path (F:) format is supposed to work, and how you think section
> precedence should happen, I can implement that.

Section precedence would be a much later feature (if we cared at all).

English language description for "F:" -- I can only quote QEMU's description.

> The qemu MAINTAINERS file does not contain this information, and my
> perl knowledge is rustier than my python knowledge is incomplete.
> 
> Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
> ---
>  BaseTools/Scripts/GetMaintainer.py | 131 +++++++++++++++++++++++++++++++++++++
>  Maintainers.txt                    |  38 +++++++++++
>  2 files changed, 169 insertions(+)
>  create mode 100644 BaseTools/Scripts/GetMaintainer.py
> 
> diff --git a/BaseTools/Scripts/GetMaintainer.py b/BaseTools/Scripts/GetMaintainer.py
> new file mode 100644
> index 0000000000..939930b052
> --- /dev/null
> +++ b/BaseTools/Scripts/GetMaintainer.py
> @@ -0,0 +1,131 @@
> +## @file
> +#  Retrieves the people to request review from on submission of a commit.
> +#
> +#  Copyright (c) 2019, Linaro Ltd. All rights reserved.<BR>
> +#
> +#  SPDX-License-Identifier: BSD-2-Clause-Patent
> +#
> +
> +from __future__ import print_function
> +from collections import defaultdict
> +from collections import OrderedDict
> +import argparse
> +import os
> +import re
> +import SetupGit
> +
> +EXPRESSIONS = {
> +    'exclude':    re.compile(r'^X:\s*(?P<exclude>.*?)\r*$'),
> +    'file':       re.compile(r'^F:\s*(?P<file>.*?)\r*$'),
> +    'list':       re.compile(r'^L:\s*(?P<list>.*?)\r*$'),
> +    'maintainer': re.compile(r'^M:\s*(?P<maintainer>.*<.*?>)\r*$'),
> +    'reviewer':   re.compile(r'^R:\s*(?P<reviewer>.*?)\r*$'),
> +    'status':     re.compile(r'^S:\s*(?P<status>.*?)\r*$'),
> +    'tree':       re.compile(r'^T:\s*(?P<tree>.*?)\r*$'),
> +    'webpage':    re.compile(r'^W:\s*(?P<webpage>.*?)\r*$')
> +}
> +
> +def printsection(section):
> +    """Prints out the dictionary describing a Maintainers.txt section."""
> +    print('===')
> +    for key in section.keys():
> +        print("Key: %s" % key)
> +        for item in section[key]:
> +            print('  %s' % item)
> +
> +def pattern_to_regex(pattern):
> +    """Takes a string containing regular UNIX path wildcards
> +       and returns a string suitable for matching with regex."""
> +    pattern = pattern.replace('.', r'\.')
> +    pattern = pattern.replace('*', r'.*')
> +    pattern = pattern.replace('?', r'.')
> +
> +    return pattern

So, this is what I disagree with, at first sight anyway. For example, this seems to cause a bare "*" to match all pathnames (with multiple pathname components, that is) -- normally we don't expect "*" to cross pathname separators.

Consider the pattern you added under "Tianocore Stewards" -- the intent is to CC them on patches that touch the project root directory's contents. However, I think the above translation will cause them to be CC'd on everything. Is that right?


- I was about to suggest "fnmatch" instead:

  https://docs.python.org/3/library/fnmatch.html

However, according to the docs, "fnmatch" doesn't consider "/" a special character. :(

(I probably remembered "fnmatch" from POSIX, where FNM_PATHNAME can make fnmatch() distinguish "/":

  http://pubs.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html

Too bad the same flag is not exposed in Python.)

NB: "git" itself uses fnmatch() -- see "pathspec" in gitglossary(7).


- "glob" is not helpful either, as it generates pathnames based on the pattern and the filesystem:

  https://docs.python.org/3/library/glob.html#module-glob


- It seems that "pathlib" could *maybe* match "*" against just filenames (= pathname components):

  https://docs.python.org/3/library/pathlib.html

but it seems too heavy-weight either way.


- so maybe we should stick with regular expressions, but exclude path separators from the translations of "?" and "*" (use negative character classes such as [^/] rather than just ".")?

- should we anchor the expressions to the start of the pathnames being matched?

Unfortunately we seem to have uses for both cases. E.g., stewards should be CC'd for files directly in the project root dir. That suggests both excluding "/" from "*", and anchoring the pattern to the pathname start. However, you and Ard would like to be CC'd on anything AARCH64 and ARM, and */AARCH64/* would not work as intended if "*" excluded "/", and the pattern was anchored to the left.

The "rdiff-backup" utility solves this problem by introducing another wildcard, "**", which matches pathname separators explicitly. (And "*" doesn't match "/".)

Under that approach, we'd spell the pattern for the stewards as just "*", and the AARCH64 pattern as **/AARCH64/**. I think this could be implemented in pattern_to_regex(), if you processed "**" first, and "*" second.

Thanks
Laszlo


> +
> +def get_section_maintainers(path, section):
> +    """Returns a list with email addresses to any M: and R: entries
> +       matching the provided path in the provided section."""
> +    maintainers = []
> +
> +    for pattern in section['file']:
> +        regex = pattern_to_regex(pattern)
> +        match = re.match(regex, path)
> +        if match:
> +#            print('path: "%s" pattern: "%s" regex: "%s"' % (path, pattern, regex))
> +            for address in section['maintainer'], section['reviewer']:
> +                maintainers += address
> +
> +    return maintainers
> +
> +def get_maintainers(path, sections):
> +    """For 'path', iterates over all sections, returning maintainers
> +       for matching ones."""
> +    maintainers = []
> +    for section in sections:
> +        addresses = get_section_maintainers(path, section)
> +        if addresses:
> +            maintainers += addresses
> +
> +    # Remove any duplicates
> +    return list(OrderedDict.fromkeys(maintainers))
> +
> +def parse_maintainers_line(line):
> +    """Parse one line of Maintainers.txt, returning any match group and its key."""
> +    for key, expression in EXPRESSIONS.items():
> +        match = expression.match(line)
> +        if match:
> +            return key, match.group(key)
> +    return None, None
> +
> +def parse_maintainers_file(filename):
> +    """Parse the Maintainers.txt from top-level of repo and
> +       return a list containing dictionaries of all sections."""
> +    with open(filename, 'r') as text:
> +        line = text.readline()
> +        sectionlist = []
> +        section = defaultdict(list)
> +        while line:
> +            key, value = parse_maintainers_line(line)
> +            if key and value:
> +                section[key].append(value)
> +
> +            line = text.readline()
> +            # If end of section (end of file, or non-tag line encountered)...
> +            if not key or not value or not line:
> +                # ...if non-empty, append area to list.
> +                if section:
> +                    sectionlist.append(section.copy())
> +                    section.clear()
> +
> +        return sectionlist
> +
> +def get_modified_files(repo, args):
> +    """Returns a list of the files modified by the commit specified in 'args'."""
> +    commit = repo.commit(args.commit)
> +    return commit.stats.files
> +
> +if __name__ == '__main__':
> +    PARSER = argparse.ArgumentParser(
> +        description='Retrieves information on who to cc for review on a given commit')
> +    PARSER.add_argument('commit',
> +                        action="store",
> +                        help='git revision to examine (default: HEAD)',
> +                        nargs='?',
> +                        default='HEAD')
> +    ARGS = PARSER.parse_args()
> +
> +    REPO = SetupGit.locate_repo()
> +
> +    FILES = get_modified_files(REPO, ARGS)
> +    CONFIG_FILE = os.path.join(REPO.working_dir, 'Maintainers.txt')
> +
> +    SECTIONS = parse_maintainers_file(CONFIG_FILE)
> +
> +    ADDRESSES = []
> +
> +    for file in FILES:
> +        print(file)
> +        ADDRESSES += get_maintainers(file, SECTIONS)
> +
> +    for address in list(OrderedDict.fromkeys(ADDRESSES)):
> +        print('  %s' % address)
> diff --git a/Maintainers.txt b/Maintainers.txt
> index 762a684659..f4c851536e 100644
> --- a/Maintainers.txt
> +++ b/Maintainers.txt
> @@ -47,6 +47,7 @@ T: svn (read-only, deprecated) - https://svn.code.sf.net/p/edk2/code/trunk/edk2
>  
>  Tianocore Stewards
>  ------------------
> +F: *
>  M: Andrew Fish <afish@apple.com>
>  M: Laszlo Ersek <lersek@redhat.com>
>  M: Leif Lindholm <leif.lindholm@linaro.org>
> @@ -61,19 +62,30 @@ EDK II Releases:
>  W: https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Release-Planning
>  M: Liming Gao <liming.gao@intel.com>
>  
> +EDK II Architectures:
> +---------------------
> +ARM, AARCH64
> +F: */AARCH64/*
> +F: */ARM/*
> +M: Leif Lindholm <leif.lindholm@linaro.org>
> +M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> +
>  EDK II Packages:
>  ----------------
>  ArmPkg
> +F: ArmPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/ArmPkg
>  M: Leif Lindholm <leif.lindholm@linaro.org>
>  M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>  
>  ArmPlatformPkg
> +F: ArmPlatformPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/ArmPlatformPkg
>  M: Leif Lindholm <leif.lindholm@linaro.org>
>  M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>  
>  ArmVirtPkg
> +F: ArmVirtPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/ArmVirtPkg
>  M: Laszlo Ersek <lersek@redhat.com>
>  M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> @@ -81,26 +93,31 @@ R: Julien Grall <julien.grall@arm.com>
>     (Xen modules)
>  
>  BaseTools
> +F: BaseTools/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/BaseTools
>  M: Bob Feng <bob.c.feng@intel.com>
>  M: Liming Gao <liming.gao@intel.com>
>  
>  CryptoPkg
> +F: CryptoPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/CryptoPkg
>  M: Jian Wang <jian.j.wang@intel.com>
>  R: Ting Ye <ting.ye@intel.com>
>  
>  DynamicTablesPkg
> +F: DynamicTablesPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/DynamicTablesPkg
>  M: Sami Mujawar <Sami.Mujawar@arm.com>
>  M: Alexei Fedorov <Alexei.Fedorov@arm.com>
>  
>  EmbeddedPkg
> +F: EmbeddedPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/EmbeddedPkg
>  M: Leif Lindholm <leif.lindholm@linaro.org>
>  M: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>  
>  EmulatorPkg
> +F: EmulatorPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/EmulatorPkg
>  M: Jordan Justen <jordan.l.justen@intel.com>
>  M: Andrew Fish <afish@apple.com>
> @@ -108,55 +125,65 @@ M: Ray Ni <ray.ni@intel.com>
>  S: Maintained
>  
>  FatPkg
> +F: FatPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/Edk2-fat-driver
>  M: Ray Ni <ray.ni@intel.com>
>  T: svn - https://svn.code.sf.net/p/edk2-fatdriver2/code/trunk/EnhancedFat
>  T: git - https://github.com/tianocore/edk2-FatPkg.git
>  
>  FmpDevicePkg
> +F: FmpDevicePkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/FmpDevicePkg
>  M: Liming Gao <liming.gao@intel.com>
>  M: Michael D Kinney <michael.d.kinney@intel.com>
>  
>  IntelFrameworkModulePkg
> +F: IntelFrameworkModulePkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFrameworkModulePkg
>  M: Liming Gao <liming.gao@intel.com>
>  
>  IntelFrameworkPkg
> +F: IntelFrameworkPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFrameworkPkg
>  M: Michael D Kinney <michael.d.kinney@intel.com>
>  M: Liming Gao <liming.gao@intel.com>
>  
>  IntelFsp2Pkg
> +F: IntelFsp2Pkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFsp2Pkg
>  M: Chasel Chiu <chasel.chiu@intel.com>
>  R: Nate DeSimone <nathaniel.l.desimone@intel.com>
>  R: Star Zeng <star.zeng@intel.com>
>  
>  IntelFsp2WrapperPkg
> +F: IntelFsp2WrapperPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFsp2WrapperPkg
>  M: Chasel Chiu <chasel.chiu@intel.com>
>  R: Nate DeSimone <nathaniel.l.desimone@intel.com>
>  R: Star Zeng <star.zeng@intel.com>
>  
>  IntelFspPkg
> +F: IntelFspPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFspPkg
>  M: Chasel Chiu <chasel.chiu@intel.com>
>  R: Nate DeSimone <nathaniel.l.desimone@intel.com>
>  R: Star Zeng <star.zeng@intel.com>
>  
>  IntelFspWrapperPkg
> +F: IntelFspWrapperPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelFspWrapperPkg
>  M: Chasel Chiu <chasel.chiu@intel.com>
>  R: Nate DeSimone <nathaniel.l.desimone@intel.com>
>  R: Star Zeng <star.zeng@intel.com>
>  
>  IntelSiliconPkg
> +F: IntelSiliconPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/IntelSiliconPkg
>  M: Ray Ni <ray.ni@intel.com>
>  M: Rangasai V Chaganty <rangasai.v.chaganty@intel.com>
>  
>  MdeModulePkg
> +F: MdeModulePkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/MdeModulePkg
>  M: Jian J Wang <jian.j.wang@intel.com>
>  M: Hao A Wu <hao.a.wu@intel.com>
> @@ -166,16 +193,19 @@ R: Ray Ni <ray.ni@intel.com>
>  R: Star Zeng <star.zeng@intel.com>
>  
>  MdePkg
> +F: MdePkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/MdePkg
>  M: Michael D Kinney <michael.d.kinney@intel.com>
>  M: Liming Gao <liming.gao@intel.com>
>  
>  NetworkPkg
> +F: NetworkPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/NetworkPkg
>  M: Siyuan Fu <siyuan.fu@intel.com>
>  M: Jiaxin Wu <jiaxin.wu@intel.com>
>  
>  OvmfPkg
> +F: OvmfPkg/*
>  W: http://www.tianocore.org/ovmf/
>  M: Jordan Justen <jordan.l.justen@intel.com>
>  M: Laszlo Ersek <lersek@redhat.com>
> @@ -191,16 +221,19 @@ R: Stefan Berger <stefanb@linux.ibm.com>
>  S: Maintained
>  
>  PcAtChipsetPkg
> +F: PcAtChipsetPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/PcAtChipsetPkg
>  M: Ray Ni <ray.ni@intel.com>
>  
>  SecurityPkg
> +F: SecurityPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/SecurityPkg
>  M: Chao Zhang <chao.b.zhang@intel.com>
>  M: Jiewen Yao <jiewen.yao@intel.com>
>  M: Jian Wang <jian.j.wang@intel.com>
>  
>  ShellPkg
> +F: ShellPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/ShellPkg
>  M: Jaben Carsey <jaben.carsey@intel.com>
>  M: Ray Ni <ray.ni@intel.com>
> @@ -214,21 +247,25 @@ M: Leif Lindholm <leif.lindholm@linaro.org>   (ARM/AArch64)
>  M: Ard Biesheuvel <ard.biesheuvel@linaro.org> (ARM/AArch64)
>  
>  SignedCapsulePkg
> +F: SignedCapsulePkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/SignedCapsulePkg
>  M: Jiewen Yao <jiewen.yao@intel.com>
>  M: Chao Zhang <chao.b.zhang@intel.com>
>  
>  SourceLevelDebugPkg
> +F: SourceLevelDebugPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/SourceLevelDebugPkg
>  M: Hao A Wu <hao.a.wu@intel.com>
>  
>  UefiCpuPkg
> +F: UefiCpuPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/UefiCpuPkg
>  M: Eric Dong <eric.dong@intel.com>
>  M: Ray Ni <ray.ni@intel.com>
>  R: Laszlo Ersek <lersek@redhat.com>
>  
>  UefiPayloadPkg
> +F: UefiPayloadPkg/*
>  W: https://github.com/tianocore/tianocore.github.io/wiki/UefiPayloadPkg
>  M: Maurice Ma <maurice.ma@intel.com>
>  M: Guo Dong <guo.dong@intel.com>
> @@ -236,6 +273,7 @@ M: Benjamin You <benjamin.you@intel.com>
>  S: Maintained
>  
>  StandaloneMmPkg
> +F: StandaloneMmPkg/*
>  M: Achin Gupta <achin.gupta@arm.com>
>  M: Jiewen Yao <jiewen.yao@intel.com>
>  R: Supreeth Venkatesh <supreeth.venkatesh@arm.com>
> 


  reply	other threads:[~2019-06-12 13:27 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-12 11:19 [BIKESHEDDING-TARGET] BaseTools: add GetMaintainer.py helper script Leif Lindholm
2019-06-12 13:27 ` Laszlo Ersek [this message]
2019-06-12 19:47   ` Leif Lindholm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=673ee087-2f93-7667-8d05-cda310610737@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox