public inbox for devel@edk2.groups.io
* TianoCore Community Design Meeting Minutes
@ 2019-04-19  5:55 Ni, Ray
  2019-04-23 18:22 ` [edk2-devel] " Laszlo Ersek
  0 siblings, 1 reply; 8+ messages in thread
From: Ni, Ray @ 2019-04-19  5:55 UTC (permalink / raw)
  To: devel@edk2.groups.io


Hi everyone,

In the first design meeting, Matthew and Sean from Microsoft presented the Mu tools.

Below are some notes Mike and I captured from the meeting.

Please reply to this mail for any questions and comments.



Matthew Carlson / Sean Brogan - Microsoft - Project Mu Tools https://edk2.groups.io/g/devel/files/Designs/2019/0418/2019-04-18%20Microsoft%20-%20Build%20Tools%20-%20Design%20Review%20.pdf

------------------------------------

EDK II tools

- use batch files/scripts wrapped around the build command

- do not provide a consistent recipe for building platforms



Background

- Mu organizes code to support building products.

- Bootstrapping refers to setting up the initial build environment for building a product or platform.

- Splitting the code: a platform only needs to see the code it uses to build.



Mu tools use Python

- Install modules/tools with pip.

- Makes modules/tools available globally.

- Simplifies Python sources: import statements and version management.

- Allows Python source to move within a repo or between repos without source changes.



3 layers of pip modules from top to bottom:

- Mu-build (support CI)

- Mu-environment (support platform build)

- Mu-python-library (common code)



pip modules are already released on PyPI (the Python 3 package index).

- Releases will be done on demand.

- Expected to work on Windows and Linux.

- Will be tried on Mac.



Build a platform through PlatformBuild.py

- Starts with ~1% of the platform code

- A dependency-resolving phase pulls the additional platform code

   * Multiple Git repos are needed by a platform. The dependency-resolving phase simplifies the code setup. The "setup" phase is isolated and can be skipped or replaced with other similar tools.
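A rough sketch of what such a dependency-resolving ("setup") phase amounts to, hedged heavily since the real Mu tools discover repos from declared ext_deps rather than a hard-coded list (the repo names and URLs below are placeholders created locally for the demo):

```shell
set -e
work=$(mktemp -d)

# Stand-ins for the platform's declared dependencies (assumption:
# the real tools read these from ext_deps declarations).
for name in CommonPkg SiliconPkg; do
  git init -q "$work/upstream/$name"
  ( cd "$work/upstream/$name" &&
    git config user.email demo@example.com &&
    git config user.name demo &&
    echo "$name" > README &&
    git add README && git commit -qm "seed $name" )
done

# "Setup" phase: resolve each declared dependency into the workspace.
mkdir -p "$work/platform"
for name in CommonPkg SiliconPkg; do
  git clone -q "$work/upstream/$name" "$work/platform/$name"
done

ls "$work/platform"
```

The point of isolating this phase is visible in the sketch: nothing after the clone loop cares how the code arrived, so the loop can be skipped or swapped for another tool.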



Plugin Manager

- Only accepts Python plugins.

- If a tool is an exe, a Python wrapper is required.

- This plugin manager is not standard; it is part of the Mu tool extensions.



Question: Checkpointing and reproducibility

- Checkpointing allows multiple builds, with each build starting from the same state.

- Checkpointing is limited to the system environment and environment variables.

- Platforms must be cleaned up so they do not generate output files in source directories.
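A minimal shell analogy for the checkpoint idea, assuming the goal is simply that environment changes made for one build do not leak into the next (the real tools track this in Python; the variable name here is a hypothetical per-build setting):

```shell
# Mutations made inside the ( ... ) subshell vanish when it exits,
# so the next build starts from the same (checkpointed) state.
(
  export TOOL_CHAIN_TAG=VS2017   # hypothetical per-build setting
  echo "build 1 sees: $TOOL_CHAIN_TAG"
)
echo "next build sees: ${TOOL_CHAIN_TAG:-unset}"
```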



Developer experience

- Can switch from one platform environment to another easily.

- Developers working on common code can test across multiple platforms by switching the platform environment.



Example of an override (a pointer will be sent):

- #Override : 00000001 | MdeModulePkg/Universal/DisplayEngineDxe/DisplayEngineDxe.inf | c02f12780a5b035346be539c80ccd0e5 | 2018-10-05T21-43-51

- https://github.com/Microsoft/mu_plus/blob/release/201903/MsGraphicsPkg/DisplayEngineDxe/DisplayEngineDxe.inf

- Files go here?  https://edk2.groups.io/g/devel/files/Designs/2019/0418
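The 32-hex-character field in the override line looks like an MD5 of the overridden INF; treating that as an assumption about the Mu format, a line of that shape could be generated like so (the file content below is a stand-in, not the real INF):

```shell
# Assumption: the third field is an MD5 of the overridden INF file.
printf 'hello\n' > /tmp/DisplayEngineDxe.inf   # stand-in content for the demo
hash=$(md5sum /tmp/DisplayEngineDxe.inf | cut -d' ' -f1)
stamp=$(date -u +%Y-%m-%dT%H-%M-%S)
echo "#Override : 00000001 | MdeModulePkg/Universal/DisplayEngineDxe/DisplayEngineDxe.inf | $hash | $stamp"
```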



VS2017 - VSWhere

- VS2017 supports multiple installed versions.

- The most recent version is selected by default.

- A platform can select a specific version.

- This is a gap in EDK II's tools_def.txt today: it requires a different tool chain tag for each version of VS2017.



Documentation

- Links in presentation

- An RFC will be sent out in mid-May.

- More coming



Question: CI tests - are they build-only?

- They can do builds and run tests, and can also do pre-checks based on plugins.



IMX is an example platform build that anyone can pull and try (https://github.com/ms-iot/MU_PLATFORM_NXP)



Thanks,

Ray




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-04-19  5:55 TianoCore Community Design Meeting Minutes Ni, Ray
@ 2019-04-23 18:22 ` Laszlo Ersek
  2019-04-23 20:37   ` Brian J. Johnson
  0 siblings, 1 reply; 8+ messages in thread
From: Laszlo Ersek @ 2019-04-23 18:22 UTC (permalink / raw)
  To: devel; +Cc: ray.ni

On 04/19/19 07:55, Ni, Ray wrote:
> Hi everyone,
> 
> In the first design meeting, Matthew and Sean from Microsoft presented the Mu tools.
> 
> Below are some notes Mike and I captured from the meeting.
> 
> Please reply to this mail for any questions and comments.
> 
> 
> 
> Matthew Carlson / Sean Brogan - Microsoft - Project Mu Tools https://edk2.groups.io/g/devel/files/Designs/2019/0418/2019-04-18%20Microsoft%20-%20Build%20Tools%20-%20Design%20Review%20.pdf

I've checked the slides; I'd like to comment on / ask about one
particular topic. The following three items relate to that topic:

(1):

> Background
> 
> [...]
> 
> - Splitting the code: A platform only needs to see the code the platform uses to build.

(2):

> Build a platform through PlatformBuild.py
> 
> - Starts with ~1% of platform code
> 
> - Dependencies resolving phase pulls additional platform code
> 
>    * Multiple GIT repos are needed by platform. The dep resolving phase simplifies the code setup. "setup" phase is isolated and can be skipped or replaced with other similar tools.

(3): slide 25 / 34:

> How do you discover what repos you need?
> Platforms define what they need to build and SDE finds it

and "SDE" is explained earlier on slide 22 / 34, "Self Describing
Environment":

> Verifies dependencies declared thru ext_deps and updates as needed


While I agree that a platform need not "see" more code than it requires
for being built, the platform is also not *hurt* by seeing more than it
strictly requires.

On the other hand, under a split repos approach, how are
inter-dependencies (between sub-repos) tracked, and navigated? Assume
there is a regression (encountered in platform testing) -- how do you
narrow down the issue to a specific commit of a specific sub-repo? And,
how do you automate said narrowing-down?

In a common git repository / with a common git history, the
inter-dependencies are tracked implicitly, and they aren't hard to
navigate, manually or automatically. Such navigation doesn't need
external tooling; it's all built into git (for example into "git
checkout" and "git bisect").

git supports submodules internally, but that feature exists to mirror
the boundaries that already exist between developer communities. For
example, OpenSSL's developer community and edk2's developer community,
are mostly distinct. Their workflows differ, their release criteria
differ, their testing expectations differ, so it makes sense for edk2 to
consume OpenSSL via a submodule.

But, I don't think the same applies to core modules in e.g. MdeModulePkg
and UefiCpuPkg, vs. *open* firmware platforms. Those development
communities overlap (or should overlap) to a good extent; we shouldn't
fragment them by splitting repos. (Separate subsystem repos and mailing
lists are fine as long as everything is git-merged ultimately into the
central repo.)

Note: I'm not arguing what Project Mu should do for its own sake. I'm
arguing against adopting some Project Mu workflow bits for upstream
edk2, at the level I currently understand those workflow bits. My
understanding of Project Mu could be very lacking. (I missed the design
meeting due to an unresolvable, permanent conflict.) Slide 12/34 says,
"Next Steps: Propose RFC to TianoCore community: Create 3 git
repositories". I hope I can check that out in more detail.

Thanks,
Laszlo


* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-04-23 18:22 ` [edk2-devel] " Laszlo Ersek
@ 2019-04-23 20:37   ` Brian J. Johnson
  2019-05-02 19:33     ` Sean
  0 siblings, 1 reply; 8+ messages in thread
From: Brian J. Johnson @ 2019-04-23 20:37 UTC (permalink / raw)
  To: devel, lersek; +Cc: ray.ni

On 4/23/19 1:22 PM, Laszlo Ersek wrote:
> On 04/19/19 07:55, Ni, Ray wrote:
>> Hi everyone,
>>
>> In the first design meeting, Matthew and Sean from Microsoft presented the Mu tools.
>>
>> Below are some notes Mike and I captured from the meeting.
>>
>> Please reply to this mail for any questions and comments.
>>
>>
>>
>> Matthew Carlson / Sean Brogan - Microsoft - Project Mu Tools https://edk2.groups.io/g/devel/files/Designs/2019/0418/2019-04-18%20Microsoft%20-%20Build%20Tools%20-%20Design%20Review%20.pdf
> 
> I've checked the slides; I'd like to comment on / ask about one
> particular topic. The following three items relate to that topic:
> 
> (1):
> 
>> Background
>>
>> [...]
>>
>> - Splitting the code: A platform only needs to see the code the platform uses to build.
> 
> (2):
> 
>> Build a platform through PlatformBuild.py
>>
>> - Starts with ~1% of platform code
>>
>> - Dependencies resolving phase pulls additional platform code
>>
>>     * Multiple GIT repos are needed by platform. The dep resolving phase simplifies the code setup. "setup" phase is isolated and can be skipped or replaced with other similar tools.
> 
> (3): slide 25 / 34:
> 
>> How do you discover what repos you need?
>> Platforms define what they need to build and SDE finds it
> 
> and "SDE" is explained earlier on slide 22 / 34, "Self Describing
> Environment":
> 
>> Verifies dependencies declared thru ext_deps and updates as needed
> 
> 
> While I agree that a platform need not "see" more code than it requires
> for being built, the platform is also not *hurt* by seeing more than it
> strictly requires.
> 
> On the other hand, under a split repos approach, how are
> inter-dependencies (between sub-repos) tracked, and navigated? Assume
> there is a regression (encountered in platform testing) -- how do you
> narrow down the issue to a specific commit of a specific sub-repo? And,
> how do you automate said narrowing-down?
> 
> In a common git repository / with a common git history, the
> inter-dependencies are tracked implicitly, and they aren't hard to
> navigate, manually or automatically. Such navigation doesn't need
> external tooling; it's all built into git (for example into "git
> checkout" and "git bisect").
> 
> git supports submodules internally, but that feature exists to mirror
> the boundaries that already exist between developer communities. For
> example, OpenSSL's developer community and edk2's developer community,
> are mostly distinct. Their workflows differ, their release criteria
> differ, their testing expectations differ, so it makes sense for edk2 to
> consume OpenSSL via a submodule.
> 
> But, I don't think the same applies to core modules in e.g. MdeModulePkg
> and UefiCpuPkg, vs. *open* firmware platforms. Those development
> communities overlap (or should overlap) to a good extent; we shouldn't
> fragment them by splitting repos. (Separate subsystem repos and mailing
> lists are fine as long as everything is git-merged ultimately into the
> central repo.)
> 
> Note: I'm not arguing what Project Mu should do for its own sake. I'm
> arguing against adopting some Project Mu workflow bits for upstream
> edk2, at the level I currently understand those workflow bits. My
> understanding of Project Mu could be very lacking. (I missed the design
> meeting due to an unresolvable, permanent conflict.) Slide 12/34 says,
> "Next Steps: Propose RFC to TianoCore community: Create 3 git
> repositories". I hope I can check that out in more detail.
> 
> Thanks,
> Laszlo

I noticed similar things, and agree with Laszlo's points.  My group has 
attempted to develop a complex edk2-based project using multiple repos 
and some external tooling in the past, and found it completely 
unworkable.  Perhaps Project Mu's tooling is better than ours was.  But 
for modules which are developed together by the same group of people, 
keeping all the code in a single git repo lets you make the best use of 
git, and removes a lot of room for errors when committing code across 
multiple modules.
-- 
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise



* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-04-23 20:37   ` Brian J. Johnson
@ 2019-05-02 19:33     ` Sean
  2019-05-03  8:45       ` Laszlo Ersek
  2019-05-03 21:41       ` Brian J. Johnson
  0 siblings, 2 replies; 8+ messages in thread
From: Sean @ 2019-05-02 19:33 UTC (permalink / raw)
  To: Brian J. Johnson, devel


Laszlo,

Except for a very few platforms that are in the current edk2 repo today, everyone building products has to deal with this "split repo" reality.  It gets even harder when you account for different business units, different silicon partners, IBVs, ODMs, open source, closed source, secret projects, and so on.  The reason I want to bring the Project Mu tools to edk2 is that everyone is forced to reinvent this process and tooling every time, and it doesn't work very well.  If edk2 continues to ignore the challenge facing most downstream consumers, it will continue to produce the results we see today: very high integration cost, very low integration quality, and very few integrations into products (meaning products don't get updates).  The last couple of years has brought significant light to the challenges of updating platform firmware and has shown that the majority of products don't/can't get updates, leaving customers not fully protected.

Regarding submodules and boundaries:
I completely agree, except that I believe there are numerous boundaries within a UEFI code base.  As mentioned above, one of our goals with splitting the code repositories is to have all code within a repository owned/supported by a single entity.  Another point of splitting is to separate code with business logic from core/common code.  Business logic often differs across products, and if intermixed with core logic it adds significantly to the cost of maintaining the product.  Along your same thinking, these different repositories do have different development models.  Many are closed source and have proprietary change-request processes.  They all release on different cadences, with different dependencies and very different testing expectations, so without a framework that provides some support, this leads to challenging and complicated integration processes.

 

Single repo:

It is not possible for most products.  Again, when integrating large amounts of code from numerous places, all with different restrictions, it is not practical to have a single bisectable repository with good history tracking.  Some entities still deliver by zip files with absolutely no source-control history.  Many entities mix open and closed source code and make hundreds/thousands of in-line changes.  I just don't see a path where a product can have one git-merged repo and still be able to efficiently integrate from its upstream sources and track history.

 

 

These tools are just a first step down a path to reshaping TianoCore edk2 to be easier to consume (and update) by the ecosystem that depends on edk2 for products.  These tools also have solutions for CI builds, binary dependencies, plugins, testing, and other features that edk2 will need for some of the practical next steps.

 

Brian,

I would really like to hear about the challenges your team faced and issues that caused those solutions to be unworkable.  Project Mu has and continues to invest a lot in testing capabilities, build automation, and finding ways to improve quality that scale.

 

Thanks

Sean



* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-05-02 19:33     ` Sean
@ 2019-05-03  8:45       ` Laszlo Ersek
  2019-05-03 21:41       ` Brian J. Johnson
  1 sibling, 0 replies; 8+ messages in thread
From: Laszlo Ersek @ 2019-05-03  8:45 UTC (permalink / raw)
  To: devel, sean.brogan, Brian J. Johnson

On 05/02/19 21:33, Sean via Groups.Io wrote:
> Laszlo,

(please add me to Cc: or To: directly when the email is directed at me
(as well); I could have easily missed this message, as it is only in my
list folder)

> Except for a very few platforms that are in the current edk2 repo
> today, everyone building products has to deal with this "split repo"
> reality.  It gets even harder when you account for different business
> units, different silicon partners, IBVs, ODMs, open source, closed
> source, secret projects, and so on.

I don't contest the problem statement.

We can look at what other high-profile open source projects do with this
problem.

For one example, the Linux kernel says, if it's not upstream, you're on
your own. And Linux carries loads of platform code.

I'm making an effort not to take such a hard-liner stance. We need to
find a balance as to how much we want to harm the *truly* open source
platforms, for the sake of proprietary downstreams of edk2.

My understanding is that the maintainers of edk2-platforms (the open
source ones anyway) would prefer an opposite code movement; i.e.
integrate everything in one upstream repository, for better
bisectability & tracking.

> The reason I want to bring forth the Project Mu tools to Edk2 is that
> everyone is forced to reinvent this process and tooling every time and
> it doesn't work very well.  If edk2 continues to ignore the challenge
> of most downstream consumers it continues to lead to the results of
> today. Very high cost of integration.  Very low quality of
> integrations and very few integrations into products (meaning products
> don't get updates).  The last couple of years has brought significant
> light to the challenges of updating platform firmware and has shown
> that the majority of products don't/can't get updates resulting in
> customers not being fully protected.
>
> Regarding submodules and boundaries. I completely agree except that I
> believe there are numerous boundaries within a UEFI code base.  As
> mentioned above one of our goals with splitting the code repositories
> is to have all code within a repository owned/supported by a single
> entity.

This is 100% counter to Open Source / Open Development / Distributed
Development.

I didn't invent the idea that edk2 *should* follow Open Development,
i.e. that it should be a good Open Source citizen. That decision had not
been made by me, it was in place when I started contributing.

I'm just saying that the direction you describe is *incompatible* with
Open Source / Open Development. A real open source project is
community-oriented, and the community has shared ownership, as a whole.
If tomorrow we decide that FooPkg is now officially owned/supported by
the "single entity" called Company Bar (which happens to be a competitor
of Company Baz), will that help Baz's contributions to FooPkg?

What you describe is a real problem of course. The open development
answer is that, if one prefers to keep things downstream, then the
rebase and integration burden stays with them, downstream, as well.

Maybe edk2 doesn't *want* to practice Open Development that much.

If we adopted "single entity ownership per repo", would that stay
aligned with the current mission statement?

    https://www.tianocore.org/

    Welcome to TianoCore, the community supporting an open source
    implementation of the Unified Extensible Firmware Interface (UEFI).
    EDK II is a modern, feature-rich, cross-platform firmware
    development environment for the UEFI and UEFI Platform
    Initialization (PI) specifications. [...]

> Another point to splitting is attempting to get code with business
> logic separated from core/common code.  Business logic often is
> different across products and if intermixed with core logic it adds
> significantly to the cost of maintaining the product.  Along your same
> thinking these different repositories do have different development
> models.  Many are closed source and have proprietary change request
> process.  They all release on different cadences, different
> dependencies and very different testing expectations so without a
> framework that provides some support this leads to challenging and
> complicated integration processes.

I certainly don't intend to dismiss these use cases. In my opinion, the
"upstream first" principle would help here too. Basically, don't ship
anything until at least the critical parts of it are merged upstream.

But, I don't want to get accused of being out of touch with reality. I
guess I'll have to defer to others with more experience in such
environments, and I should comment on specific edk2 proposals then.

> Single repo:
>
> It is not possible for most products.  Again when integrating large
> amounts of code from numerous places all with different restrictions
> it is not practical to have a single bisectable repository with good
> history tracking.  Some entities still deliver by zip files with
> absolutely no src control history.

Wouldn't you say that that is *their* problem? Absolutely deplorable
source code management?

Should we really destroy bisectability in upstream edk2 for their sake,
especially given that you can always extract sub-repositories
(mechanically, at that) from a unified repo, but not the other way
around?

> Many entities mix open and closed source code and make
> hundreds/thousands of in-line changes.  I just don't see a path where
> a product can have 1 git-merged repo and still be able to efficiently
> integrate from their upstream sources and track history.

Normally this is why someone tries to push as much code upstream as
possible; then downstream rebases are manageable.

> These tools are just a first step down a path to reshaping tianocore
> edk2 to be easier to consume (and update) by the ecosystem that
> depends on Edk2 for products.

Put differently, the first step to abandon the Open Development model,
for the sake of proprietary downstreams.

But, I can see myself becoming non-constructive here. I guess I should
withdraw from the general discussion, and comment on specific proposals
for edk2, when they appear.

(Speaking for Red Hat in this sentence: we certainly depend on edk2 for
products.)

> These tools also have solutions for Ci builds, binary dependencies,
> plugins, testing, and other features that edk2 will need for some of
> the practical next steps.

A side question here: what development model would you follow with these
tools themselves? (Because... the "willingness" we've witnessed, for
extending the email notifications sent by GitHub.com, is not
reassuring.)

Laszlo


* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-05-02 19:33     ` Sean
  2019-05-03  8:45       ` Laszlo Ersek
@ 2019-05-03 21:41       ` Brian J. Johnson
  2019-05-06 16:06         ` Laszlo Ersek
  1 sibling, 1 reply; 8+ messages in thread
From: Brian J. Johnson @ 2019-05-03 21:41 UTC (permalink / raw)
  To: Sean, devel

On 5/2/19 2:33 PM, sean.brogan via groups.io wrote:
> Brian,
> 
> I would really like to hear about the challenges your team faced and 
> issues that caused those solutions to be unworkable.  Project Mu has and 
> continues to invest a lot in testing capabilities, build automation, and 
> finding ways to improve quality that scale.
> 

Our products depend on a reference BIOS tree provided to us by a major 
processor vendor.  That tree includes portions of Edk2, plus numerous 
proprietary additions.  Each new platform starts with a new drop of 
vendor code.  They provide additional drops throughout the platform's 
life.  In the past these were distributed as zip files, but more 
recently they have transitioned to git.  We end up having to make 
extensive changes to their code to port it to our platform.  In 
addition, we maintain internally several packages of code used on all 
our platforms, designed to be platform-independent, plus a 
platform-dependent package which is intended to be modified for each 
platform.

When we first started using git, we looked for a way to share our 
all-platform code among platforms, and move our platform-dependent code 
easily to new platforms, while making it easy to integrate new drops 
from our vendor.  We considered using git submodules, but decided that 
would be too awkward.  Modifying code in a submodule involves committing 
in the submodule, then committing in the module containing it.  This 
seemed like too much trouble for our developers, who were all new to 
git.  Plus, it didn't interact well at all with our internal bug 
tracking system.  Basically, there was no good way to tie commits in 
various sub- and super-modules together in a straightforward, trackable way.
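The two-step commit dance described above can be reproduced with a throwaway pair of repositories (the `protocol.file.allow` override is only needed because the demo submodule lives on the local filesystem):

```shell
set -e
tmp=$(mktemp -d)

# A repo that will serve as the submodule.
git init -q "$tmp/sub"
( cd "$tmp/sub" &&
  git config user.email demo@example.com && git config user.name demo &&
  echo v1 > code.c && git add code.c && git commit -qm "sub: v1" )

# The superproject consuming it.
git init -q "$tmp/super"
cd "$tmp/super"
git config user.email demo@example.com
git config user.name demo
git -c protocol.file.allow=always submodule add "$tmp/sub" sub > /dev/null
git commit -qm "super: add submodule"

# A single logical change now takes two commits:
echo v2 > sub/code.c
( cd sub && git commit -qam "sub: v2" )      # commit 1, in the submodule
git commit -qam "super: bump sub pointer"    # commit 2, in the superproject
```

Nothing in either commit message ties the pair together automatically, which is the tracking problem being described.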

We tried a package called gitslave (http://gitslave.sourceforge.net/), 
which automates running git commands across a super-repo and various 
sub-repos, with some sugar for aggregating the results into a readable 
whole.  It's a bit more transparent than submodules.  But at the end of 
the day, you're still trying to coordinate multiple git repositories. 
We gave it a try for a month or two, but having to manage multiple 
repositories for day-to-day work, and the lack of a single commit 
history spanning the entire tree doomed that scheme.  Developers rebelled.

Ever since, we've used a single git repo per platform.  We keep the 
vendor code in a "base" branch, which we update as they provide drops, 
then merge into our master branch.  When we start a new platform, we use 
git filter-branch to extract our all-platform and platform-dependent 
code into a new branch, which we move to the new platform's repo and 
merge into master.  It's possible to re-extract the code if we need to 
pick up updates.  This doesn't provide total flexibility... for 
instance, backporting a fix in our all-platform code back to a previous 
platform involves manual cherrypicking.  But for day-to-day development, 
it lets us work in a single git tree, with a bisectable history, working 
git-blame, commit IDs which tie directly to our bug tracker, and no 
external tooling.  It's a bit of a pain to merge a new drop (shell 
scripts are our friends), but we're optimizing for ease of local 
development.  That seems like the best use of our resources.
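The extraction step described above can be sketched with `git filter-branch`, with a toy tree standing in for a platform repo (package names here are made up; modern git recommends `git filter-repo` for the same job, but filter-branch is what is named above):

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's 10s warning pause
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo

mkdir -p AllPlatformPkg VendorPkg
echo shared > AllPlatformPkg/lib.c
echo vendor > VendorPkg/init.c
git add -A && git commit -qm "platform 1 tree"

# Extract only the all-platform code, with history, onto its own branch;
# that branch can then be fetched into the next platform's repo and merged.
git branch all-platform-code
git filter-branch -f --subdirectory-filter AllPlatformPkg \
    all-platform-code > /dev/null 2>&1

git ls-tree --name-only all-platform-code   # prints "lib.c"
```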

So I'm leery of any scheme which involves multiple repos managed by an 
external tool.  It sounds like a difficult way to do day-to-day 
development.  If Edk2 does move to split repos, we could filter-branch 
and merge them all together into a single branch for internal use, I 
suppose.  But that does make it harder to push fixes upstream.  (Not 
that we end up doing a lot of that... we're not developing an 
open-source BIOS, just making use of open-source upstream components. 
So our use case is quite a bit different from Laszlo's.)  We're also 
generally focusing on one platform at a time, not trying to update 
shared code across many at once.  So our use case may be different from 
Sean's.

This got rather long... I hope it helps explain where we're coming from.
-- 
Brian J. Johnson
Enterprise X86 Lab
Hewlett Packard Enterprise
brian.johnson@hpe.com


* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-05-03 21:41       ` Brian J. Johnson
@ 2019-05-06 16:06         ` Laszlo Ersek
  2019-05-07 17:23           ` Brian J. Johnson
  0 siblings, 1 reply; 8+ messages in thread
From: Laszlo Ersek @ 2019-05-06 16:06 UTC (permalink / raw)
  To: devel, brian.johnson, Sean

On 05/03/19 23:41, Brian J. Johnson wrote:
> On 5/2/19 2:33 PM, sean.brogan via groups.io wrote:
>> Brian,
>>
>> I would really like to hear about the challenges your team faced and
>> issues that caused those solutions to be unworkable.  Project Mu has
>> and continues to invest a lot in testing capabilities, build
>> automation, and finding ways to improve quality that scale.
>>
> 
> Our products depend on a reference BIOS tree provided to us by a major
> processor vendor.  That tree includes portions of Edk2, plus numerous
> proprietary additions.  Each new platform starts with a new drop of
> vendor code.  They provide additional drops throughout the platform's
> life.  In the past these were distributed as zip files, but more
> recently they have transitioned to git.  We end up having to make
> extensive changes to their code to port it to our platform.  In
> addition, we maintain internally several packages of code used on all
> our platforms, designed to be platform-independent, plus a
> platform-dependent package which is intended to be modified for each
> platform.
> 
> When we first started using git, we looked for a way to share our
> all-platform code among platforms, and move our platform-dependent code
> easily to new platforms, while making it easy to integrate new drops
> from our vendor.  We considered using git submodules, but decided that
> would be too awkward.  Modifying code in a submodule involves committing
> in the submodule, then committing in the module containing it.  This
> seemed like too much trouble for our developers, who were all new to
> git.  Plus, it didn't interact well at all with our internal bug
> tracking system.  Basically, there was no good way to tie commits in
> various sub- and super-modules together in a straightforward, trackable
> way.
> 
> We tried a package called gitslave (http://gitslave.sourceforge.net/),
> which automates running git commands across a super-repo and various
> sub-repos, with some sugar for aggregating the results into a readable
> whole.  It's a bit more transparent than submodules.  But at the end of
> the day, you're still trying to coordinate multiple git repositories. We
> gave it a try for a month or two, but having to manage multiple
> repositories for day-to-day work, and the lack of a single commit
> history spanning the entire tree doomed that scheme.  Developers rebelled.
> 
> Ever since, we've used a single git repo per platform.  We keep the
> vendor code in a "base" branch, which we update as they provide drops,
> then merge into our master branch.  When we start a new platform, we use
> git filter-branch to extract our all-platform and platform-dependent
> code into a new branch, which we move to the new platform's repo and
> merge into master.  It's possible to re-extract the code if we need to
> pick up updates.  This doesn't provide total flexibility... for
> instance, backporting a fix in our all-platform code back to a previous
> platform involves manual cherrypicking.

Good point -- and cherry-picking is a first class citizen in the git
toolset. Upstream projects use it all the time, between their master and
stable branches. And we (RH) happen to use it all the time too. "git
cherry-pick -s -x" (possibly "-e" too) is the main tool for backporting
upstream patches to downstream branches.
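A minimal illustration of that backporting flow, with the default branch standing in for upstream and a `stable` branch as the downstream (`-x` records the origin commit hash in the message; `-s` adds a Signed-off-by trailer):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo

echo base > file && git add file && git commit -qm "base"
git branch stable                  # downstream forks off here
echo fix >> file && git commit -qam "fix: important bug"
fix=$(git rev-parse HEAD)

git checkout -q stable
git cherry-pick -x -s "$fix" > /dev/null   # backport onto the stable branch
git log -1 --format=%B                      # message now carries the
                                            # "(cherry picked from commit ...)" line
```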

> But for day-to-day development,
> it lets us work in a single git tree, with a bisectable history, working
> git-blame, commit IDs which tie directly to our bug tracker, and no
> external tooling.  It's a bit of a pain to merge a new drop (shell
> scripts are our friends), but we're optimizing for ease of local
> development.  That seems like the best use of our resources.
> 
> So I'm leery of any scheme which involves multiple repos managed by an
> external tool.  It sounds like a difficult way to do day-to-day
> development.  If Edk2 does move to split repos, we could filter-branch
> and merge them all together into a single branch for internal use, I
> suppose.  But that does make it harder to push fixes upstream.

Even if that re-merging worked in practice, and even if two consumers of
edk2 followed the exact same procedure for re-unifying the repo, they
would still end up with different commit hashes -- and that would make
it more difficult to reference the same commits in upstream discussion.

> (Not that we end up doing a lot of that... we're not developing an
> open-source BIOS, just making use of open-source upstream components. So
> our use case is quite a bit different from Laszlo's.)  We're also
> generally focusing on one platform at a time, not trying to update
> shared code across many at once.  So our use case may be different from
> Sean's.
> 
> This got rather long... I hope it helps explain where we're coming from.

It's very educational to me -- I don't have to deal with "ZIP drops"
from vendors, and I'm impressed by the "commit vendor drop on side
branch, merge into master separately" workflow.
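
For readers unfamiliar with that workflow, the core of it might look roughly like this (branch names and file contents are made up; the real merges would of course involve conflict resolution):

```shell
#!/bin/sh
# Demo: keep vendor drops on a "base" branch, merge them into master.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q drop-demo && cd drop-demo
git config user.name "Demo" && git config user.email "demo@example.com"

# First vendor drop becomes the "base" branch.
echo "vendor v1" > vendor.c
git add vendor.c && git commit -qm "vendor drop v1"
git branch -m base

# Local porting work happens on master, branched from base.
git checkout -qb master
echo "local port" > port.c
git add port.c && git commit -qm "port to our platform"

# The next drop is committed on base, then merged into master.
git checkout -q base
echo "vendor v2" > vendor.c
git commit -qam "vendor drop v2"

git checkout -q master
git merge -q --no-edit base        # conflicts between drop and port resolve here
```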

How difficult have your git-merges been? (You mention shell scripts.)
Have you found a correlation between merge difficulty and vendor drop
frequency? (I'd expect the less frequently new code is dropped, the
harder the merge is.)

At RH, we generally rebase our product branches on new upstream fork-off
points (typically stable releases), instead of merging. (And, this
strategy applies to more projects than just edk2.)

Downstream, we don't create merge commits -- the downstream branches
(consisting of a handful of downstream-only commits, and a large number
of upstream backports, as time passes) have a linear history. The
"web-like" git history is inherited from upstream up to the new fork-off
point (= an upstream stable tag). The linear nature of the downstream
branches is very suitable for "RPM", where you have a base tarball (a
flat source tree generated at the upstream tag), plus a list of
downstream patches that can be applied in strict (linear) sequence, for
binary package building.
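
Mechanically, that rebase-and-export cycle could be sketched like this (tag and branch names invented; the real downstream branches carry far more commits):

```shell
#!/bin/sh
# Demo: replay downstream-only commits onto a new upstream stable tag,
# then export them as a linear patch queue (as an RPM spec would consume).
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q rebase-demo && cd rebase-demo
git config user.name "Demo" && git config user.email "demo@example.com"

# Upstream history with two stable tags.
echo u1 > upstream.c && git add upstream.c && git commit -qm "upstream 1"
git tag stable-1
echo u2 > upstream.c && git commit -qam "upstream 2"
git tag stable-2

# Downstream branch: linear patches on top of the old fork-off point.
git checkout -qb downstream stable-1
echo d1 > downstream.c && git add downstream.c
git commit -qm "downstream-only fix"

# New release: replay only the downstream commits onto the new tag.
git rebase -q --onto stable-2 stable-1 downstream

# Export the linear patch queue relative to the new fork-off point.
git format-patch stable-2 >/dev/null
ls *.patch
```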

Thanks!
Laszlo


* Re: [edk2-devel] TianoCore Community Design Meeting Minutes
  2019-05-06 16:06         ` Laszlo Ersek
@ 2019-05-07 17:23           ` Brian J. Johnson
  0 siblings, 0 replies; 8+ messages in thread
From: Brian J. Johnson @ 2019-05-07 17:23 UTC (permalink / raw)
  To: devel, lersek, Sean

On 5/6/19 11:06 AM, Laszlo Ersek wrote:
> On 05/03/19 23:41, Brian J. Johnson wrote:
>> On 5/2/19 2:33 PM, sean.brogan via groups.io wrote:
>>> Brian,
>>>
>>> I would really like to hear about the challenges your team faced and
>>> issues that caused those solutions to be unworkable.  Project Mu has
>>> and continues to invest a lot in testing capabilities, build
>>> automation, and finding ways to improve quality that scale.
>>>
>>
>> Our products depend on a reference BIOS tree provided to us by a major
>> processor vendor.  That tree includes portions of Edk2, plus numerous
>> proprietary additions.  Each new platform starts with a new drop of
>> vendor code.  They provide additional drops throughout the platform's
>> life.  In the past these were distributed as zip files, but more
>> recently they have transitioned to git.  We end up having to make
>> extensive changes to their code to port it to our platform.  In
>> addition, we maintain internally several packages of code used on all
>> our platforms, designed to be platform-independent, plus a
>> platform-dependent package which is intended to be modified for each
>> platform.
>>
>> When we first started using git, we looked for a way to share our
>> all-platform code among platforms, and move our platform-dependent code
>> easily to new platforms, while making it easy to integrate new drops
>> from our vendor.  We considered using git submodules, but decided that
>> would be too awkward.  Modifying code in a submodule involves committing
>> in the submodule, then committing in the module containing it.  This
>> seemed like too much trouble for our developers, who were all new to
>> git.  Plus, it didn't interact well at all with our internal bug
>> tracking system.  Basically, there was no good way to tie commits in
>> various sub- and super-modules together in a straightforward, trackable
>> way.
>>
>> We tried a package called gitslave (http://gitslave.sourceforge.net/),
>> which automates running git commands across a super-repo and various
>> sub-repos, with some sugar for aggregating the results into a readable
>> whole.  It's a bit more transparent than submodules.  But at the end of
>> the day, you're still trying to coordinate multiple git repositories. We
>> gave it a try for a month or two, but having to manage multiple
>> repositories for day-to-day work, and the lack of a single commit
>> history spanning the entire tree doomed that scheme.  Developers rebelled.
>>
>> Ever since, we've used a single git repo per platform.  We keep the
>> vendor code in a "base" branch, which we update as they provide drops,
>> then merge into our master branch.  When we start a new platform, we use
>> git filter-branch to extract our all-platform and platform-dependent
>> code into a new branch, which we move to the new platform's repo and
>> merge into master.  It's possible to re-extract the code if we need to
>> pick up updates.  This doesn't provide total flexibility... for
>> instance, backporting a fix in our all-platform code back to a previous
>> platform involves manual cherrypicking.
> 
> Good point -- and cherry-picking is a first class citizen in the git
> toolset. Upstream projects use it all the time, between their master and
> stable branches. And we (RH) happen to use it all the time too. "git
> cherry-pick -s -x" (possibly "-e" too) is the main tool for backporting
> upstream patches to downstream branches.
> 
>> But for day-to-day development,
>> it lets us work in a single git tree, with a bisectable history, working
>> git-blame, commit IDs which tie directly to our bug tracker, and no
>> external tooling.  It's a bit of a pain to merge a new drop (shell
>> scripts are our friends), but we're optimizing for ease of local
>> development.  That seems like the best use of our resources.
>>
>> So I'm leery of any scheme which involves multiple repos managed by an
>> external tool.  It sounds like a difficult way to do day-to-day
>> development.  If Edk2 does move to split repos, we could filter-branch
>> and merge them all together into a single branch for internal use, I
>> suppose.  But that does make it harder to push fixes upstream.
> 
> Even if that re-merging worked in practice, and even if two consumers of
> edk2 followed the exact same procedure for re-unifying the repo, they
> would still end up with different commit hashes -- and that would make
> it more difficult to reference the same commits in upstream discussion.
> 

Yes, we end up having to cherry pick (or more likely, outright port) any 
changes we want to send upstream back onto the upstream branch(es).  One 
reason we don't do a lot of that....

>> (Not that we end up doing a lot of that... we're not developing an
>> open-source BIOS, just making use of open-source upstream components. So
>> our use case is quite a bit different from Laszlo's.)  We're also
>> generally focusing on one platform at a time, not trying to update
>> shared code across many at once.  So our use case may be different from
>> Sean's.
>>
>> This got rather long... I hope it helps explain where we're coming from.
> 
> It's very educational to me -- I don't have to deal with "ZIP drops"
> from vendors, and I'm impressed by the "commit vendor drop on side
> branch, merge into master separately" workflow.
> 
> How difficult have your git-merges been? (You mention shell scripts.)
> Have you found a correlation between merge difficulty and vendor drop
> frequency? (I'd expect the less frequently new code is dropped, the
> harder the merge is.)
> 

In general, yes, the less frequently code is dropped, the greater the 
merge effort, and the greater the likelihood of merge mistakes.  Our 
vendor has begun releasing much more frequently than they used to, which 
is generally a good thing.  But there tends to be a minimum level of 
effort required for any drop, so if the drops are very frequent, we end 
up with someone doing merges pretty much full time.

One project I'm working on involves four separate upstream repos, which 
require individual filter-branch scripts to extract and reorganize code 
into staging repos, plus an additional script to pull all the results 
together into the final base branch.  Then we can merge that to our 
master.  Sigh... git isn't supposed to be this complicated.  But at 
least it gives us the machinery to do what we need to.  And most of our 
developers don't need to worry about all the merge hassles.
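
The extraction step at the heart of such a script might look something like this (directory and repo names are invented; note that filter-branch is deprecated in newer git, which recommends git-filter-repo instead):

```shell
#!/bin/sh
# Demo: extract one package's history out of a larger tree with
# git filter-branch --subdirectory-filter, as a staging script might.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q mono && cd mono
git config user.name "Demo" && git config user.email "demo@example.com"

# A tree mixing shared and platform-specific code.
mkdir -p SharedPkg PlatformPkg
echo shared > SharedPkg/lib.c
echo plat > PlatformPkg/board.c
git add . && git commit -qm "initial tree"

# Rewrite history so only SharedPkg remains, promoted to the root.
export FILTER_BRANCH_SQUELCH_WARNING=1   # filter-branch is deprecated;
                                         # git-filter-repo is the replacement
git filter-branch -f --subdirectory-filter SharedPkg HEAD
git reset -q --hard                      # refresh checkout to rewritten history
ls                                       # only SharedPkg's contents remain
```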

> At RH, we generally rebase our product branches on new upstream fork-off
> points (typically stable releases), instead of merging. (And, this
> strategy applies to more projects than just edk2.)
> 
> Downstream, we don't create merge commits -- the downstream branches
> (consisting of a handful of downstream-only commits, and a large number
> of upstream backports, as time passes) have a linear history. The
> "web-like" git history is inherited from upstream up to the new fork-off
> point (= an upstream stable tag). The linear nature of the downstream
> branches is very suitable for "RPM", where you have a base tarball (a
> flat source tree generated at the upstream tag), plus a list of
> downstream patches that can be applied in strict (linear) sequence, for
> binary package building.
> 

Unfortunately, our downstreams end up with many (probably thousands, I 
haven't counted) changes to the base code, not even counting the new 
code we add.  So rebasing isn't an attractive option for us, and a 
patch-based development process just isn't feasible.

I guess the takeaway is that Edk2 is used in many ways by many different 
people.  So it's good to keep everyone in the discussion.

> Thanks!
> Laszlo


-- 
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.johnson@hpe.com



end of thread, other threads:[~2019-05-07 17:23 UTC | newest]

Thread overview: 8+ messages
2019-04-19  5:55 TianoCore Community Design Meeting Minutes Ni, Ray
2019-04-23 18:22 ` [edk2-devel] " Laszlo Ersek
2019-04-23 20:37   ` Brian J. Johnson
2019-05-02 19:33     ` Sean
2019-05-03  8:45       ` Laszlo Ersek
2019-05-03 21:41       ` Brian J. Johnson
2019-05-06 16:06         ` Laszlo Ersek
2019-05-07 17:23           ` Brian J. Johnson
