From mboxrd@z Thu Jan  1 00:00:00 1970
Authentication-Results: mx.groups.io;
 dkim=missing; spf=pass (domain: hpe.com, ip: 148.163.147.86, mailfrom: brian.johnson@hpe.com)
Received: from mx0a-002e3701.pphosted.com (mx0a-002e3701.pphosted.com [148.163.147.86])
 by groups.io with SMTP; Tue, 07 May 2019 10:23:17 -0700
Received: from pps.filterd (m0150242.ppops.net [127.0.0.1])
	by mx0a-002e3701.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x47HH7Ix001233;
	Tue, 7 May 2019 17:23:17 GMT
Received: from g2t2353.austin.hpe.com (g2t2353.austin.hpe.com [15.233.44.26])
	by mx0a-002e3701.pphosted.com with ESMTP id 2sbdgegeg4-1
	(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
	Tue, 07 May 2019 17:23:17 +0000
Received: from g2t2360.austin.hpecorp.net (g2t2360.austin.hpecorp.net [16.196.225.135])
	by g2t2353.austin.hpe.com (Postfix) with ESMTP id A7A1E89;
	Tue,  7 May 2019 17:23:16 +0000 (UTC)
Received: from [10.33.152.19] (bjj-laptop2.americas.hpqcorp.net [10.33.152.19])
	by g2t2360.austin.hpecorp.net (Postfix) with ESMTP id 43E9C36;
	Tue,  7 May 2019 17:23:16 +0000 (UTC)
Subject: Re: [edk2-devel] TianoCore Community Design Meeting Minutes
To: devel@edk2.groups.io, lersek@redhat.com, Sean <sean.brogan@microsoft.com>
References: <70d2f499-bc28-058a-8675-069beee5835e@hpe.com>
 <31264.1556825609503060272@groups.io>
 <ce7c4a31-0ed1-2228-7d19-6d69abb30c7c@hpe.com>
 <d6f4852f-bc4d-3f17-8b5c-ce3415f1e44d@redhat.com>
From: "Brian J. Johnson" <brian.johnson@hpe.com>
Message-ID: <ab3c01ce-e852-69d4-f766-caaa930343c8@hpe.com>
Date: Tue, 7 May 2019 12:23:16 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.6.1
MIME-Version: 1.0
In-Reply-To: <d6f4852f-bc4d-3f17-8b5c-ce3415f1e44d@redhat.com>
X-MIME-Autoconverted: from 8bit to quoted-printable by mx0a-002e3701.pphosted.com id x47HH7Ix001233
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

On 5/6/19 11:06 AM, Laszlo Ersek wrote:
> On 05/03/19 23:41, Brian J. Johnson wrote:
>> On 5/2/19 2:33 PM, sean.brogan via groups.io wrote:
>>> Brian,
>>>
>>> I would really like to hear about the challenges your team faced and
>>> issues that caused those solutions to be unworkable.  Project Mu has
>>> and continues to invest a lot in testing capabilities, build
>>> automation, and finding ways to improve quality that scale.
>>>
>>
>> Our products depend on a reference BIOS tree provided to us by a major
>> processor vendor.  That tree includes portions of Edk2, plus numerous
>> proprietary additions.  Each new platform starts with a new drop of
>> vendor code.  They provide additional drops throughout the platform's
>> life.  In the past these were distributed as zip files, but more
>> recently they have transitioned to git.  We end up having to make
>> extensive changes to their code to port it to our platform.  In
>> addition, we maintain internally several packages of code used on all
>> our platforms, designed to be platform-independent, plus a
>> platform-dependent package which is intended to be modified for each
>> platform.
>>
>> When we first started using git, we looked for a way to share our
>> all-platform code among platforms, and move our platform-dependent code
>> easily to new platforms, while making it easy to integrate new drops
>> from our vendor.  We considered using git submodules, but decided that
>> would be too awkward.  Modifying code in a submodule involves committing
>> in the submodule, then committing in the module containing it.  This
>> seemed like too much trouble for our developers, who were all new to
>> git.  Plus, it didn't interact well at all with our internal bug
>> tracking system.  Basically, there was no good way to tie commits in
>> various sub- and super-modules together in a straightforward, trackable
>> way.
>>
>> We tried a package called gitslave (http://gitslave.sourceforge.net/),
>> which automates running git commands across a super-repo and various
>> sub-repos, with some sugar for aggregating the results into a readable
>> whole.  It's a bit more transparent than submodules.  But at the end of
>> the day, you're still trying to coordinate multiple git repositories. We
>> gave it a try for a month or two, but having to manage multiple
>> repositories for day-to-day work, and the lack of a single commit
>> history spanning the entire tree doomed that scheme.  Developers rebelled.
>>
>> Ever since, we've used a single git repo per platform.  We keep the
>> vendor code in a "base" branch, which we update as they provide drops,
>> then merge into our master branch.  When we start a new platform, we use
>> git filter-branch to extract our all-platform and platform-dependent
>> code into a new branch, which we move to the new platform's repo and
>> merge into master.  It's possible to re-extract the code if we need to
>> pick up updates.  This doesn't provide total flexibility... for
>> instance, backporting a fix in our all-platform code back to a previous
>> platform involves manual cherrypicking.
>
> Good point -- and cherry-picking is a first class citizen in the git
> toolset. Upstream projects use it all the time, between their master and
> stable branches. And we (RH) happen to use it all the time too. "git
> cherry-pick -s -x" (possibly "-e" too) is the main tool for backporting
> upstream patches to downstream branches.
>
>> But for day-to-day development,
>> it lets us work in a single git tree, with a bisectable history, working
>> git-blame, commit IDs which tie directly to our bug tracker, and no
>> external tooling.  It's a bit of a pain to merge a new drop (shell
>> scripts are our friends), but we're optimizing for ease of local
>> development.  That seems like the best use of our resources.
>>
>> So I'm leery of any scheme which involves multiple repos managed by an
>> external tool.  It sounds like a difficult way to do day-to-day
>> development.  If Edk2 does move to split repos, we could filter-branch
>> and merge them all together into a single branch for internal use, I
>> suppose.  But that does make it harder to push fixes upstream.
>
> Even if that re-merging worked in practice, and even if two consumers of
> edk2 followed the exact same procedure for re-unifying the repo, they
> would still end up with different commit hashes -- and that would make
> it more difficult to reference the same commits in upstream discussion.
>
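
To illustrate the point about hashes: a commit ID is a hash over the
commit's own text, which names its tree and its parents, so two
independently rebuilt histories diverge from the first differing commit
onward.  Here is a minimal Python sketch, assuming git's default SHA-1
object format.  The identity string, the commit messages, and the helper
names are made up for illustration; only the "<type> <size>\0<body>"
hashing rule is git's.

```python
import hashlib

# ID of the empty tree in a SHA-1 repository (a well-known constant).
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way git does: '<kind> <size>\\0' followed by the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parents: list, who: str, message: str) -> str:
    """Build the text of a commit object and return its ID (hypothetical helper)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())


# Illustrative identity and timestamp, not taken from any real repository.
who = "A Developer <dev@example.com> 1557249796 -0500"

# Two consumers re-unify the split repos; their import commits differ slightly.
root_a = commit_id(EMPTY_TREE, [], who, "Import edk2 (consumer A)")
root_b = commit_id(EMPTY_TREE, [], who, "Import edk2 (consumer B)")

# The "same" fix on top: identical tree, author, and message, different parent.
fix_a = commit_id(EMPTY_TREE, [root_a], who, "Fix a bug")
fix_b = commit_id(EMPTY_TREE, [root_b], who, "Fix a bug")

print(fix_a == fix_b)  # the two IDs differ because the parents differ
```

Because each ID covers the parent IDs, every later commit in the two
histories differs as well, even when the content is identical.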

Yes, we end up having to cherry pick (or more likely, outright port) any
changes we want to send upstream back onto the upstream branch(es).  One
reason we don't do a lot of that....

>> (Not that we end up doing a lot of that... we're not developing an
>> open-source BIOS, just making use of open-source upstream components. So
>> our use case is quite a bit different from Laszlo's.)  We're also
>> generally focusing on one platform at a time, not trying to update
>> shared code across many at once.  So our use case may be different from
>> Sean's.
>>
>> This got rather long... I hope it helps explain where we're coming from.
>
> It's very educational to me -- I don't have to deal with "ZIP drops"
> from vendors, and I'm impressed by the "commit vendor drop on side
> branch, merge into master separately" workflow.
>
> How difficult have your git-merges been? (You mention shell scripts.)
> Have you found a correlation between merge difficulty and vendor drop
> frequency? (I'd expect the less frequently new code is dropped, the
> harder the merge is.)
>

In general, yes, the less frequently code is dropped, the greater the
merge effort, and the greater the likelihood of merge mistakes.  Our
vendor has begun releasing much more frequently than they used to, which
is generally a good thing.  But there tends to be a minimum level of
effort required for any drop, so if the drops are very frequent, we end
up with someone doing merges pretty much full time.

One project I'm working on involves four separate upstream repos, which
require individual filter-branch scripts to extract and reorganize code
into staging repos, plus an additional script to pull all the results
together into the final base branch.  Then we can merge that to our
master.  Sigh... git isn't supposed to be this complicated.  But at
least it gives us the machinery to do what we need to.  And most of our
developers don't need to worry about all the merge hassles.
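
For anyone who hasn't seen the "vendor drop on a base branch, merge into
master" flow, here is a rough Python sketch that plays it out in a
throwaway repo.  It is not our actual tooling: the file names, commit
messages, identity, and helper functions are all made up, and it assumes
a git binary on the PATH.  Only the branch names "base" and "master"
come from the description above.

```python
import os
import pathlib
import subprocess
import tempfile

# Hypothetical identity so commits work without any global git config.
IDENT = {
    "GIT_AUTHOR_NAME": "Dev", "GIT_AUTHOR_EMAIL": "dev@example.com",
    "GIT_COMMITTER_NAME": "Dev", "GIT_COMMITTER_EMAIL": "dev@example.com",
}


def git(repo, *args):
    """Run one git command inside `repo` and return its trimmed stdout."""
    done = subprocess.run(["git", *args], cwd=repo, env={**os.environ, **IDENT},
                          check=True, capture_output=True, text=True)
    return done.stdout.strip()


def commit_file(repo, name, text, message):
    """Write one file and commit it on the current branch."""
    (pathlib.Path(repo) / name).write_text(text)
    git(repo, "add", name)
    git(repo, "commit", "-q", "-m", message)


def demo():
    """Build a scratch repo with two vendor drops and one merge; return its path."""
    repo = tempfile.mkdtemp()  # left on disk so it can be inspected afterwards
    git(repo, "init", "-q")
    # Start on "base", whatever the configured default branch name is.
    git(repo, "symbolic-ref", "HEAD", "refs/heads/base")
    commit_file(repo, "vendor.c", "/* vendor drop 1 */\n", "Vendor drop 1")

    # Our own work lives on master, which starts from the first drop.
    git(repo, "checkout", "-q", "-b", "master")
    commit_file(repo, "platform.c", "/* our port */\n", "Port to our platform")

    # A new drop is committed on base only, untouched by our changes...
    git(repo, "checkout", "-q", "base")
    commit_file(repo, "vendor.c", "/* vendor drop 2 */\n", "Vendor drop 2")

    # ...and then merged into master as a separate step.
    git(repo, "checkout", "-q", "master")
    git(repo, "merge", "-q", "--no-ff", "-m", "Merge vendor drop 2", "base")
    return repo


if __name__ == "__main__":
    print(git(demo(), "log", "--graph", "--oneline", "--all"))
```

The base branch stays a clean record of what the vendor shipped, which
is what makes each later merge tractable.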

> At RH, we generally rebase our product branches on new upstream fork-off
> points (typically stable releases), instead of merging. (And, this
> strategy applies to more projects than just edk2.)
>
> Downstream, we don't create merge commits -- the downstream branches
> (consisting of a handful of downstream-only commits, and a large number
> of upstream backports, as time passes) have a linear history. The
> "web-like" git history is inherited from upstream up to the new fork-off
> point (= an upstream stable tag). The linear nature of the downstream
> branches is very suitable for "RPM", where you have a base tarball (a
> flat source tree generated at the upstream tag), plus a list of
> downstream patches that can be applied in strict (linear) sequence, for
> binary package building.
>

Unfortunately, our downstreams end up with many (probably thousands, I
haven't counted) changes to the base code, not even counting the new
code we add.  So rebasing isn't an attractive option for us, and a
patch-based development process just isn't feasible.

I guess the takeaway is that Edk2 is used in many ways by many different
people.  So it's good to keep everyone in the discussion.

> Thanks!
> Laszlo


-- 
Brian J. Johnson
Enterprise X86 Lab

Hewlett Packard Enterprise

brian.johnson@hpe.com