Subject: Re: [edk2-devel] [PATCH v1 0/8] Measured SEV boot with kernel/initrd/cmdline
To: Dov Murik, devel@edk2.groups.io, Ard Biesheuvel
Cc: Tobin Feldman-Fitzthum, Tobin Feldman-Fitzthum, Jim Cadden, James Bottomley, Hubertus Franke, Jordan Justen, Ashish Kalra, Brijesh Singh, Erdem Aktas, Jiewen Yao, Min Xu, Tom Lendacky
References: <20210525053116.1533673-1-dovmurik@linux.ibm.com> <5d8c598e-31de-7973-df51-e913bba54587@redhat.com> <8ee75de9-9a62-c62b-fd98-810b5dae63b3@linux.ibm.com>
From: "Laszlo Ersek" <lersek@redhat.com>
Date: Mon, 7 Jun 2021 15:33:06 +0200
In-Reply-To: <8ee75de9-9a62-c62b-fd98-810b5dae63b3@linux.ibm.com>

On 06/06/21 15:21, Dov Murik wrote:
>
>
> On 04/06/2021 14:26, Laszlo Ersek wrote:
>> On 06/04/21 12:30, Dov Murik wrote:
>>
>>> So I argue to keep the existing approach with two separate areas:
>>> existing one for injected secrets, and new one for a table of approved
>>> hashes (filled by QEMU and uploaded as initial encrypted measured guest
>>> memory).
>>
>> OK.
>>
>>> If the issue is MEMFD space,
>>
>> Yes, that's it.
>>
>>> maybe we can do something like: use the
>>> existing secrets page (4KB) for two uses: first 3KB for secrets, and
>>> last 1KB for hashes. If this is not enough, the hashes are even
>>> smaller than 1KB; and we can even publish only one hash - the hash of
>>> all 3 hashes (need to think about edge cases when there's no
>>> cmdline/initrd). But all these "solutions" feel a bit hacky to me and
>>> might complicate the code.
>>
>> All these PCDs come in pairs -- base and size. (IIRC.) If there's no
>> architectural requirement to keep these two kinds of info in different
>> pages (such as different page protections or whatever), then packing
>> them into a single page is something I'd like. The above 3K+1K
>> subdivision sounds OK to me.
>>
>
> I'll go with 3KB secrets + 1KB hashes.
>
>
>>>
>>> I don't understand your suggestion: "I'd *really* like us to extend
>>> one of the existent structures. If necessary, introduce a new GUID,
>>> for a table that contains both previously injected data, and the new
>>> data."; does this mean to use a single MEMFD page for the injected
>>> secrets and the hashes?
>>
>> Yes, it's the same (say, 3K+1K) idea, just expressed differently. In one
>> case, you have two GUIDed structs in the (plaintext, not compressed)
>> reset vector in the pflash, and the base+size structures associated with
>> those two separate GUIDs happen to identify distinct ranges of the same
>> MEMFD page. In the other case, you have just one GUIDed structure (with
>> base+size contents), and the page pointed-to by this base+size pair is
>> subdivided by *internal* structuring -- such as internal GUIDs and so
>> on. Whichever is simpler to implement in both QEMU and edk2; I just want
>> to avoid wasting a full page for three hashes.
>>
>
> I'll go with the two GUIDed structures in the reset vector (which will
> point to distinct parts of a single 4KB page).
>
> That actually means shortening the existing secrets MEMFD area from 4KB
> to 3KB. Is that OK?

I don't know how that area is used in practice; from my perspective,
shortening it to 3KB is OK.
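For illustration, the packed page and the two reset-vector descriptors
could boil down to something like this (all names, sizes and offsets
below are made up, not a settled ABI -- the real values come from the
actual QEMU and edk2 patches):

  //
  // Illustration only; needs <Uefi/UefiBaseType.h> for the base types.
  //
  #define SEV_SECRETS_AREA_SIZE  0xC00   // first 3KB of the page: secrets
  #define SEV_HASHES_AREA_SIZE   0x400   // last 1KB of the page: hashes

  #pragma pack (1)
  typedef struct {
    UINT32    Base;                      // area start, inside MEMFD
    UINT32    Size;                      // area length
  } SEV_AREA_DESCRIPTOR;                 // one such struct per GUID in the
  #pragma pack ()                        // plaintext reset vector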
>
>
>
>>>
>>> Also, in general, I don't really understand the implications of
>>> running out of MEMFD space;
>>
>> Here's one implication of enlarging MEMFD. It pushes BS Code, BS Data,
>> Loader Code, Loader Data, perhaps some AcpiNVS and Reserved memory
>> allocations to higher addresses. Then when the kernel is loaded, its
>> load address may be higher too. I'm not worried about wasted guest
>> memory, but about various silent assumptions as to where the kernel
>> "should be". For example, after one round of enlarging DXEFV, the
>> "crash" utility stopped opening guest memory dumps, because it couldn't
>> find a kernel signature in the (low) address range that it used to scan.
>> The fix wasn't too difficult (the range to scan could be specified on
>> the "crash" command line, and then my colleague Dave Anderson just
>> modified "crash"), but it was a *surprise*. I don't like those.
>>
>>> maybe you have other ideas around this (for example,
>>> can we make MEMFD bigger only for AmdSevX64 platform?).
>>
>> Yes, experimenting with a larger MEMFD in just the AmdSevX64 platform is
>> fine.
>>
>
> But now I understand that failures can appear way later in userspace
> (the crash utility), so just testing that a modern AMD VM boots fine is
> not enough to get confidence here.

Indeed, if you expect the same userspace to work seamlessly, there is a
risk.

Cheers
Laszlo

>
>
>> NB reordering various PCDs between each other, so that their relative
>> relationships (orders) change, is a *lot* more risky than just enlarging
>> existing areas. The code in OVMF tends not to rely on actual bases and
>> sizes, but it may very well rely on a particular BasePCD + SizePCD sum
>> not exceeding another particular BasePCD.
>>
>
> Thanks for pointing this out. I'll avoid reordering.
>
>
>>>
>>>
>>>> - Modifying the QemuFwCfgLib class for this purpose is inappropriate.
>>>> Even if we do our own home-brewed verifier, none of it must go into
>>>> QemuFwCfgLib class. QemuFwCfgLib is for transport.
>>>>
>>>
>>> OK, we'll take the verifier out (as you suggested below - to a
>>> BlobVerifierLib with two implementations).
>>>
>>>
>>>> [Ard, please see this one question:]
>>>>
>>>> - A major complication for hashing all three of: kernel, initrd,
>>>> cmdline, is that the *fetching* of this triplet is split between two
>>>> places. (Well, it is split between *three* places in fact, but I'm
>>>> going to ignore LinuxInitrdDynamicShellCommand for now, because the
>>>> AmdSevX64 platform sets BUILD_SHELL to FALSE for production.)
>>>>
>>>> The kernel and the initrd are fetched in QemuKernelLoaderFsDxe, but
>>>> the command line is fetched in (both) QemuLoadImageLib instances.
>>>> This requires that all these modules be littered with hashing as
>>>> well, which I find *really bad*. Even if we factor out the actual
>>>> logic, I strongly dislike having *just hooks* for hashing in multiple
>>>> modules.
>>>>
>>>> Now, please refer to efc52d67e157 ("OvmfPkg/QemuKernelLoaderFsDxe:
>>>> don't expose kernel command line", 2020-03-05). If we first
>>>>
>>>> (a) reverted that commit, and
>>>>
>>>> (b) modified *both* QemuLoadImageLib instances, to load the kernel
>>>> command line from the *synthetic filesystem* (rather than directly
>>>> from fw_cfg),
>>>>
>>>> then we could centralize the hashing to just QemuKernelLoaderFsDxe.
>>>>
>>>> Ard -- what's your thought on this?
>>>>
>>>
>>> I understand there's agreement here, and that both this suggested
>>> change (use the synthetic filesystem) and my patch series (add hash
>>> verification) touch the same code areas. How do you envision this
>>> process on the mailing list? Separate patch series with a dependency?
>>> One long patch series with both changes? What goes first?
>>
>> Good point. I do have a kind of patch order laid out in my mind, but I
>> didn't think about whether we should have the patches in one patch
>> series, or in two "waves".
>>
>> OK, let's go with two patch sets.
>>
>> In the first set, we should just focus on the above steps (a) and (b).
>> Step (a) shouldn't be too hard. In step (b), you'd modify both
>> QemuLoadImageLib instances (two separate patches), replacing the
>> QemuFwCfgLib APIs for fetching the cmdline with
>> EFI_SIMPLE_FILE_SYSTEM_PROTOCOL and EFI_FILE_PROTOCOL APIs.
>>
>> Speaking from memory, the synthetic filesystem has a unique device path,
>> so the first step would be calling gBS->LocateDevicePath(), for finding
>> SimpleFs on the unique device path. Once you have the SimpleFs
>> interface, you can call OpenVolume, then open the "cmdline" file using
>> the EFI_FILE_PROTOCOL output by OpenVolume.
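A rough sketch of that sequence (also from memory -- error handling is
minimal, and the device path parameter is just assumed to be the
synthetic filesystem's, as published by QemuKernelLoaderFsDxe):

  #include <Uefi.h>
  #include <Library/UefiBootServicesTableLib.h>   // gBS
  #include <Protocol/SimpleFileSystem.h>          // SimpleFs, EFI_FILE_PROTOCOL

  STATIC
  EFI_STATUS
  OpenCmdlineFile (
    IN  EFI_DEVICE_PATH_PROTOCOL  *FsDevicePath,  // the synthetic FS's path
    OUT EFI_FILE_PROTOCOL         **CmdlineFile
    )
  {
    EFI_STATUS                       Status;
    EFI_HANDLE                       FsHandle;
    EFI_SIMPLE_FILE_SYSTEM_PROTOCOL  *SimpleFs;
    EFI_FILE_PROTOCOL                *Root;

    //
    // Find the handle that carries SimpleFs on the given device path.
    //
    Status = gBS->LocateDevicePath (&gEfiSimpleFileSystemProtocolGuid,
                    &FsDevicePath, &FsHandle);
    if (EFI_ERROR (Status)) {
      return Status;
    }
    Status = gBS->HandleProtocol (FsHandle,
                    &gEfiSimpleFileSystemProtocolGuid, (VOID **)&SimpleFs);
    if (EFI_ERROR (Status)) {
      return Status;
    }
    //
    // Open the volume's root directory, then the "cmdline" file in it.
    //
    Status = SimpleFs->OpenVolume (SimpleFs, &Root);
    if (EFI_ERROR (Status)) {
      return Status;
    }
    Status = Root->Open (Root, CmdlineFile, L"cmdline",
                    EFI_FILE_MODE_READ, 0);
    Root->Close (Root);
    return Status;
  }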
>>
>> Once we merge this series (basically just three patches), there is no
>> QemuFwCfgLib dependency left in either QemuLoadImageLib instance, I
>> reckon. Then you can post the second wave, in which:
>>
>> - a new "firmware config verifier" library class is introduced,
>>
>> - two library instances for that class are introduced (null, and the
>> real thing),
>>
>> - the AmdSevX64.dsc platform resolves the new lib class to the "real"
>> (hashing) instance,
>>
>> - all other platform DSCs using QemuKernelLoaderFsDxe resolve the new
>> lib class to the null instance,
>>
>> - QemuKernelLoaderFsDxe is extended with a dependency on the new class,
>> calling the proper APIs to (a) initialize the verifier, and (b) verify
>> every fw_cfg blob that is about to be exposed as a synthetic file.
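To make that last point concrete, the new library class's hook could be
as small as the following (class shape and names are made up here,
pending the actual patches):

  //
  // Hypothetical BlobVerifierLib class header -- illustration only.
  //

  /**
    Verify that Buf matches the approved hash recorded for BlobName.
    The null instance would simply return EFI_SUCCESS; the hashing
    instance would fail hard on any mismatch.
  **/
  EFI_STATUS
  EFIAPI
  VerifyBlob (
    IN CONST CHAR16  *BlobName,   // "kernel", "initrd" or "cmdline"
    IN CONST VOID    *Buf,
    IN UINT32        BufSize
    );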
>>
>> Then QemuLoadImageLib needs no changes, as it will not depend on fw_cfg,
>> and every synthetic file it may want to access will have been verified
>> by QemuKernelLoaderFsDxe already, according to the verifier lib instance
>> that's used in the respective platform DSC file.
>>
>> I would recommend only posting the first patch set initially. It has a
>> very well defined goal (--> hide the fw_cfg dependency in both
>> QemuLoadImageLib instances behind the synthetic filesystem); we can
>> validate / review that regardless of the ultimate crypto / security
>> goal. Using the SimpleFs / FILE protocol APIs is not trivial IMO, so
>> it's possible that just the first wave will require a v2.
>>
>
> OK, I'll try to follow this plan.
>
>>>
>>>
>>>>
>>>> And then, we could eliminate the dynamic callback registration, plus
>>>> the separate SevFwCfgVerifier, SevHashFinderLib, and
>>>> SevQemuLoadImageLib stuff.
>>>>
>>>> We'd only need one new lib class, with *statically linked* hooks for
>>>> QemuKernelLoaderFsDxe, and two instances of this new class, a Null
>>>> one, and an actual (SEV hash verifier) one. The latter instance would
>>>> locate the hash values, calculate the fresh hashes, and perform the
>>>> comparisons. Only the AmdSevX64 platform would use the non-Null
>>>> instance of this library class.
>>>
>>> OK, I'll refactor to static linking with two BlobVerifierLib
>>> implementations.
>>>
>>>
>>>>
>>>> (NB QemuKernelLoaderFsDxe is used by some ArmVirtPkg platforms, so
>>>> resolutions to the Null instance would be required there too.)
>>>
>>> I'll need to learn how to build edk2 for Arm to test this. Thanks for
>>> the heads-up.
>>
>> With regard to QemuKernelLoaderFsDxe specifically:
>>
>> build -b NOOPT -t GCC5 -p ArmVirtPkg/ArmVirtQemu.dsc -a AARCH64
>> build -b NOOPT -t GCC5 -p ArmVirtPkg/ArmVirtQemu.dsc -a ARM
>> build -b NOOPT -t GCC5 -p ArmVirtPkg/ArmVirtQemuKernel.dsc -a AARCH64
>> build -b NOOPT -t GCC5 -p ArmVirtPkg/ArmVirtQemuKernel.dsc -a ARM
>>
>> If you work on an x86_64 machine, you'll need cross-gcc and
>> cross-binutils for this. I have the following packages installed on my
>> laptop:
>>
>> binutils-aarch64-linux-gnu-2.31.1-3.el7.x86_64
>> binutils-arm-linux-gnu-2.31.1-3.el7.x86_64
>> cross-binutils-common-2.31.1-3.el7.noarch
>>
>> cross-gcc-common-9.2.1-3.el7.1.noarch
>> gcc-aarch64-linux-gnu-9.2.1-3.el7.1.x86_64
>> gcc-arm-linux-gnu-9.2.1-3.el7.1.x86_64
>> gcc-c++-aarch64-linux-gnu-9.2.1-3.el7.1.x86_64
>> gcc-c++-arm-linux-gnu-9.2.1-3.el7.1.x86_64
>>
>> (I don't remember why I have the c++ cross-compiler installed.)
>>
>
> Thanks for the details; I'll try it.
>
> -Dov
>
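For reference, the fresh-hash comparison mentioned above (what the
non-Null, SEV hash verifier instance would do per blob) might boil down
to something like this, assuming SHA-256 via BaseCryptLib; the GUIDed
lookup of the approved digest in the hashes area is elided and the
function name is made up:

  #include <Uefi.h>
  #include <Library/BaseCryptLib.h>      // Sha256HashAll(), SHA256_DIGEST_SIZE
  #include <Library/BaseMemoryLib.h>     // CompareMem()

  STATIC
  EFI_STATUS
  CheckBlobDigest (
    IN CONST VOID   *Buf,
    IN UINTN        BufSize,
    IN CONST UINT8  ApprovedDigest[SHA256_DIGEST_SIZE]  // from the hashes area
    )
  {
    UINT8  FreshDigest[SHA256_DIGEST_SIZE];

    //
    // Hash the downloaded fw_cfg blob, then compare against the digest
    // that QEMU placed in the measured (encrypted) hashes area.
    //
    if (!Sha256HashAll (Buf, BufSize, FreshDigest)) {
      return EFI_DEVICE_ERROR;
    }
    if (CompareMem (FreshDigest, ApprovedDigest, SHA256_DIGEST_SIZE) != 0) {
      return EFI_ACCESS_DENIED;          // blob does not match measurement
    }
    return EFI_SUCCESS;
  }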