From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: mx.groups.io; dkim=missing; spf=pass (domain: redhat.com, ip: 209.132.183.28, mailfrom: lersek@redhat.com) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by groups.io with SMTP; Thu, 09 May 2019 16:53:25 -0700 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 316CEC0495BB; Thu, 9 May 2019 23:53:25 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-120-234.rdu2.redhat.com [10.10.120.234]) by smtp.corp.redhat.com (Postfix) with ESMTP id C1D5C45A5; Thu, 9 May 2019 23:53:23 +0000 (UTC) Subject: Re: [edk2-devel] [PATCH] BaseTools: Include headers not mentioned in inf are not hashed To: devel@edk2.groups.io, christian.rodriguez@intel.com Cc: Bob Feng , Liming Gao , Yonghong Zhu References: <20190509212719.6156-1-christian.rodriguez@intel.com> From: "Laszlo Ersek" Message-ID: Date: Fri, 10 May 2019 01:53:22 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <20190509212719.6156-1-christian.rodriguez@intel.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Thu, 09 May 2019 23:53:25 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Hello Christian, On 05/09/19 23:27, Christian Rodriguez wrote: > BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1787 > > Get a list of local header files that are not present in the > MetaFile for this module. Add those local header files into > the hashing algorithm for a module. If a local header file is > not present in the MetaFile, the module will still build correctly > though the hashing system didn't know about it before. > > Signed-off-by: Christian Rodriguez > Cc: Bob Feng > Cc: Liming Gao > Cc: Yonghong Zhu > --- > BaseTools/Source/Python/AutoGen/AutoGen.py | 24 ++++++++++++++++++++++++ > 1 file changed, 24 insertions(+) I saw the BZ soon after it was reported. I almost commented, but ultimately I couldn't decide what the real use case was. With this particular use case (i.e. INF file is missing some module-specific header files that it could easily list), I think I disagree, mildly (not too strongly). E.g., we fixed such omissions in a bunch of INF files, last March, in the series [PATCH 00/45] ArmVirtPkg, OvmfPkg: list module-internal headers in INF files So my thinking is that this change is neither really required (people can simply fix INF files), nor really sufficient. Here's why the change is not sufficient (deduced purely from the commit message -- because the commit message clarifies what the BZ report didn't). Assume you are developing a module (driver or library) that consumes a new EDKII extension protocol that your series *also* introduces, under MdeModulePkg/Include/Protocol, in an earlier, stand-alone patch. So one of your later patches, in the driver or lib instance, says: #include As you develop the driver, and perhaps even other code that consumes the protocol produced by the new driver, you realize you need a new member function, or one of the member functions needs a new parameter. You change the protocol header file -- for example, you insert a new function pointer at the *top* of the protocol structure -- and some consumer code now breaks in testing. Because, the consumer code's hash was never changed, and so it was never rebuilt, against the member function field offsets from the updated protocol structure. In other words, BaseTools doesn't do recursive dependency tracking (with or without hashes). It's not a huge deal, it's just why I'm saying this patch is not enough to address the real problem. And, if we had recursive dependency tracking (i.e. a change in the hash of the *public* header triggered a rebuild of all dependent modules), then we wouldn't have to look at module-internal unlisted header files specifically. Now, if the present change is being implemented because it's a small patch, in comparison to updating hundreds of INF files in out-of-tree platforms, I'm happy with that; just please state this motivation in the commit message (and the BZ too, if possible). One comment below, on the code: > > diff --git a/BaseTools/Source/Python/AutoGen/AutoGen.py b/BaseTools/Source/Python/AutoGen/AutoGen.py > index 31721a6f9f..bd282d3ec1 100644 > --- a/BaseTools/Source/Python/AutoGen/AutoGen.py > +++ b/BaseTools/Source/Python/AutoGen/AutoGen.py > @@ -4098,8 +4098,10 @@ class ModuleAutoGen(AutoGen): > if self.Name in GlobalData.gModuleHash[self.Arch] and GlobalData.gBinCacheSource and self.AttemptModuleCacheCopy(): > return False > m = hashlib.md5() > + > # Add Platform level hash > m.update(GlobalData.gPlatformHash.encode('utf-8')) > + > # Add Package level hash > if self.DependentPackageList: > for Pkg in sorted(self.DependentPackageList, key=lambda x: x.PackageName): > @@ -4118,14 +4120,36 @@ class ModuleAutoGen(AutoGen): > Content = f.read() > f.close() > m.update(Content) > + > # Add Module's source files > + localSourceFileList = set() > if self.SourceFileList: > for File in sorted(self.SourceFileList, key=lambda x: str(x)): > + localSourceFileList.add(str(File)) > f = open(str(File), 'rb') > Content = f.read() > f.close() > m.update(Content) > > + # Get a list of Module's local header files not included in metaFile > + localHeaderList = set() > + if self.SourceDir: > + for root, dirs, files in os.walk(self.SourceDir): > + for aFile in files: > + filePath = os.path.join(self.WorkspaceDir, os.path.join(root, aFile)) > + if not filePath.endswith(('.h', '.H')): ".H" files should not be covered, in my opinion, as they stand for C++ header files under at least some toolchains, and C++ is not a supported edk2 language. ... But, if there are hundreds of .H files out-of-tree, I guess I can be convinced about this too. :) Thanks Laszlo > + continue > + if filePath not in localSourceFileList: > + localHeaderList.add(filePath) > + > + # Add Module's local header files > + localHeaderList = list(localHeaderList) > + for File in sorted(localHeaderList): > + f = open(str(File), 'rb') > + Content = f.read() > + f.close() > + m.update(Content) > + > ModuleHashFile = path.join(self.BuildDir, self.Name + ".hash") > if self.Name not in GlobalData.gModuleHash[self.Arch]: > GlobalData.gModuleHash[self.Arch][self.Name] = m.hexdigest() >