public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Laszlo Ersek" <lersek@redhat.com>
To: devel@edk2.groups.io, christian.rodriguez@intel.com
Cc: Bob Feng <bob.c.feng@intel.com>,
	Liming Gao <liming.gao@intel.com>,
	Yonghong Zhu <yonghong.zhu@intel.com>
Subject: Re: [edk2-devel] [PATCH] BaseTools: Include headers not mentioned in inf are not hashed
Date: Fri, 10 May 2019 01:53:22 +0200	[thread overview]
Message-ID: <f1b8499a-268a-ced9-09e4-01d5dc512b19@redhat.com> (raw)
In-Reply-To: <20190509212719.6156-1-christian.rodriguez@intel.com>

Hello Christian,

On 05/09/19 23:27, Christian Rodriguez wrote:
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1787
> 
> Get a list of local header files that are not present in the
> MetaFile for this module. Add those local header files into
> the hashing algorithm for a module. If a local header file is
> not present in the MetaFile, the module will still build correctly
> though the hashing system didn't know about it before.
> 
> Signed-off-by: Christian Rodriguez <christian.rodriguez@intel.com>
> Cc: Bob Feng <bob.c.feng@intel.com>
> Cc: Liming Gao <liming.gao@intel.com>
> Cc: Yonghong Zhu <yonghong.zhu@intel.com>
> ---
>  BaseTools/Source/Python/AutoGen/AutoGen.py | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)

I saw the BZ soon after it was reported. I almost commented, but
ultimately I couldn't decide what the real use case was.

With this particular use case (i.e. INF file is missing some
module-specific header files that it could easily list), I think I
disagree, mildly (not too strongly). E.g., we fixed such omissions in a
bunch of INF files, last March, in the series

  [PATCH 00/45]
  ArmVirtPkg, OvmfPkg: list module-internal headers in INF files

So my thinking is that this change is neither really required (people
can simply fix INF files), nor really sufficient.

Here's why the change is not sufficient (deduced purely from the commit
message -- because the commit message clarifies what the BZ report
didn't). Assume you are developing a module (driver or library) that
consumes a new EDKII extension protocol that your series *also*
introduces, under MdeModulePkg/Include/Protocol, in an earlier,
stand-alone patch. So one of your later patches, in the driver or lib
instance, says:

#include <Protocol/EdkiiAwesomeCoolFeature.h>

As you develop the driver, and perhaps even other code that consumes the
protocol produced by the new driver, you realize you need a new member
function, or one of the member functions needs a new parameter. You
change the protocol header file -- for example, you insert a new
function pointer at the *top* of the protocol structure -- and some
consumer code now breaks in testing. Because, the consumer code's hash
was never changed, and so it was never rebuilt, against the member
function field offsets from the updated protocol structure.

In other words, BaseTools doesn't do recursive dependency tracking (with
or without hashes). It's not a huge deal, it's just why I'm saying this
patch is not enough to address the real problem. And, if we had
recursive dependency tracking (i.e. a change in the hash of the *public*
<EdkiiAwesomeCoolFeature.h>  header triggered a rebuild of all dependent
modules), then we wouldn't have to look at module-internal unlisted
header files specifically.

Now, if the present change is being implemented because it's a small
patch, in comparison to updating hundreds of INF files in out-of-tree
platforms, I'm happy with that; just please state this motivation in the
commit message (and the BZ too, if possible).

One comment below, on the code:

> 
> diff --git a/BaseTools/Source/Python/AutoGen/AutoGen.py b/BaseTools/Source/Python/AutoGen/AutoGen.py
> index 31721a6f9f..bd282d3ec1 100644
> --- a/BaseTools/Source/Python/AutoGen/AutoGen.py
> +++ b/BaseTools/Source/Python/AutoGen/AutoGen.py
> @@ -4098,8 +4098,10 @@ class ModuleAutoGen(AutoGen):
>          if self.Name in GlobalData.gModuleHash[self.Arch] and GlobalData.gBinCacheSource and self.AttemptModuleCacheCopy():
>              return False
>          m = hashlib.md5()
> +
>          # Add Platform level hash
>          m.update(GlobalData.gPlatformHash.encode('utf-8'))
> +
>          # Add Package level hash
>          if self.DependentPackageList:
>              for Pkg in sorted(self.DependentPackageList, key=lambda x: x.PackageName):
> @@ -4118,14 +4120,36 @@ class ModuleAutoGen(AutoGen):
>          Content = f.read()
>          f.close()
>          m.update(Content)
> +
>          # Add Module's source files
> +        localSourceFileList = set()
>          if self.SourceFileList:
>              for File in sorted(self.SourceFileList, key=lambda x: str(x)):
> +                localSourceFileList.add(str(File))
>                  f = open(str(File), 'rb')
>                  Content = f.read()
>                  f.close()
>                  m.update(Content)
>  
> +        # Get a list of Module's local header files not included in metaFile
> +        localHeaderList = set()
> +        if self.SourceDir:
> +            for root, dirs, files in os.walk(self.SourceDir):
> +                for aFile in files:
> +                    filePath = os.path.join(self.WorkspaceDir, os.path.join(root, aFile))
> +                    if not filePath.endswith(('.h', '.H')):

".H" files should not be covered, in my opinion, as they stand for C++
header files under at least some toolchains, and C++ is not a supported
edk2 language.

... But, if there are hundreds of .H files out-of-tree, I guess I can be
convinced about this too. :)

Thanks
Laszlo

> +                        continue
> +                    if filePath not in localSourceFileList:
> +                        localHeaderList.add(filePath)
> +
> +        # Add Module's local header files
> +        localHeaderList = list(localHeaderList)
> +        for File in sorted(localHeaderList):
> +            f = open(str(File), 'rb')
> +            Content = f.read()
> +            f.close()
> +            m.update(Content)
> +
>          ModuleHashFile = path.join(self.BuildDir, self.Name + ".hash")
>          if self.Name not in GlobalData.gModuleHash[self.Arch]:
>              GlobalData.gModuleHash[self.Arch][self.Name] = m.hexdigest()
> 


  parent reply	other threads:[~2019-05-09 23:53 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-09 21:27 [PATCH] BaseTools: Include headers not mentioned in inf are not hashed Christian Rodriguez
2019-05-09 23:39 ` [edk2-devel] " Carsey, Jaben
2019-05-10 15:28   ` Christian Rodriguez
2019-05-10 16:14     ` Carsey, Jaben
2019-05-09 23:53 ` Laszlo Ersek [this message]
2019-05-10 13:41   ` Felix Polyudov
2019-05-10 19:13     ` Christian Rodriguez
2019-05-10 19:32       ` Felix Polyudov
2019-05-10 19:45         ` Christian Rodriguez
2019-05-13 11:39           ` Laszlo Ersek
2019-05-13 12:23             ` Bob Feng
2019-05-13 18:41               ` Christian Rodriguez
2019-05-14  2:52                 ` Bob Feng
2019-05-13 18:53             ` Christian Rodriguez
2019-05-13 20:19               ` Laszlo Ersek
2019-05-13 20:23                 ` Christian Rodriguez
     [not found]             ` <159E52DAFBF01090.24406@groups.io>
2019-05-13 19:39               ` Christian Rodriguez
2019-05-13 11:41           ` Bob Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f1b8499a-268a-ced9-09e4-01d5dc512b19@redhat.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox