From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Pedro Falcato"
Date: Wed, 10 May 2023 17:51:44 +0100
Subject: Re: Side effects of enabling PML5 in EFI
To: Ard Biesheuvel
Cc: edk2-devel-groups-io, Andrew Fish, "Kinney, Michael D", Ray Ni

On Wed, May 10, 2023 at 10:41 AM Ard Biesheuvel wrote:
>
> On Tue, 9 May 2023 at 19:24, Pedro Falcato wrote:
> >
> > Hi all,
> >
> > (+CC people vaguely related to the EFI spec, the PML5 implementation
> > and kernel EFI boot code)
> >
> > As a result of the latest 5-level paging patches, I've been looking
> > into how tiano supports PML5.
> > This raised a question: Doesn't enabling PML5 in-firmware break
> > compatibility with non-PML5-aware bootloaders and kernels?
> >
> > From an architectural point of view:
> > - PML5 is enabled in CR4.LA57, but may only be toggled when not in
> > IA32e mode (so, only in 32-bit)
> > - Trying to mindlessly write to CR4 will #GP, and loading a 4-level
> > page tables will crash with probable page faults or #GPs
> >
> > From an EFI spec point of view:
> > - Whereas other architectures (arm64 for instance) specify the MMU
> > state in detail, the x64 bits do not specify anything beyond "Paging
> > enabled" (see 2.3.4). Which pre-PML5, was obviously well defined.
>
> We actually have a related problem on ARM: the size of the virtual
> address space is not mandated by the spec, but it does require that
> all memory is mapped 1:1.
>

Yes, your ARM problem sounds similar, but you do have the advantage
that T0SZ (et al) have been there since forever, while for x86_64 this
is just a surprise CR4 bit that you had no idea existed for the past
20 years.

> This means that, if a system has any memory that is outside of the
> 48-bit physical range, it must enable 5 level paging to map it 1:1 in
> the 52-bit virtual range.
>
> Given that EDK2's page allocator allocates from top down from the end
> of the address space, we might end up with allocations for ACPI tables
> etc that cannot be mapped by kernels that do not implement support for
> 5 level paging.

Do we need to have all the memory mapped? I know the EFI spec demands
identity mapping, but it does sound a lot like we could reserve
everything from 2^(va_size) onwards with some (new or old) EFI memory
type and keep going? Right?
I imagine 128/256 TiB of DRAM will work just fine for EFI firmware...

Basically, EFI allocations and usage would be restricted to commonly
accessible ranges of memory (47-bit for x86, 48 for ARM64, etc...)

>
> I imagine a similar issue might exist on x86 as well, and this
> suggests that using 5 level paging in the firmware is only sensible if
> it is guaranteed that the OS and loader can deal with it (IOW, running
> the firmware with 5 level pages and switching back to 4 in
> ExitBootServices() may result in other issues)
>

Right. Sadly, I don't think you can even pull an EBS() trick here, as
in theory you *can* set up alternate translations, even when under
boot services (see 2.3.4.3).

> > - When under boot services, this is likely not a problem as page
> > tables are owned by boot services. Unless they touch them as defined
> > in "2.3.4.3. Enabling Paging or Alternate Translations in an
> > Application", which may run into problems.
> >
> > From an OS kernel/bootloader point of view:
> > - A PML5 aware kernel/bootloader will likely correctly identify the
> > PML5 capability and enable LA57, load 5-level page tables. As such,
> > this scenario always works.
> > - A non-PML5-aware one may incorrectly overwrite LA57 (and #GP), or
> > just load a 4-level paging structure into CR3, and thus disastrously
> > crash.
> >
> > So, how is any of this supposed to work?
> >
>
> I don't think the firmware should ever use 5 level paging unless it is
> strictly needed for a particular use case. And even then, it should
> avoid allocating memory from the region that is only 1:1 accessible
> when 5 level paging is enabled.

I agree. But I would like it if the more senior EFI "bigwigs" (Mike K,
Andrew, etc) could give a heads up as to what should happen. And then
possibly put wording into the spec.

-- 
Pedro
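
P.S. For illustration only, a rough C sketch of the two checks this
thread keeps coming back to: how many paging levels a given CR4 value
implies, and whether a firmware allocation would still be identity
mappable by an OS that only knows 4-level paging. This is not EDK2 or
kernel code. The function names are hypothetical, the CR4 value is
passed in as a plain integer (reading the real register needs ring 0),
and the 2^47 cutoff assumes the identity map lives in the lower
canonical half of the x86_64 address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR4.LA57 (bit 12): set when 5-level paging is active in long mode. */
#define CR4_LA57 (1ULL << 12)

/* Hypothetical helper: paging depth implied by a long-mode CR4 value. */
static inline unsigned paging_levels(uint64_t cr4)
{
    return (cr4 & CR4_LA57) ? 5u : 4u;
}

/*
 * Hypothetical helper: first address that can no longer be identity
 * mapped in the lower canonical half (2^47 with 4 levels, 2^56 with 5).
 */
static inline uint64_t low_half_limit(unsigned levels)
{
    assert(levels == 4 || levels == 5);
    return 1ULL << (levels == 5 ? 56 : 47);
}

/*
 * Hypothetical policy check: true if [base, base + size) lies entirely
 * below the 4-level limit, i.e. a non-PML5-aware OS could still map it
 * 1:1. Written so that base + size cannot overflow.
 */
static inline bool fits_below_4level_limit(uint64_t base, uint64_t size)
{
    uint64_t limit = low_half_limit(4);
    return size <= limit && base <= limit - size;
}
```

An allocator following the "reserve everything above the limit" idea
could call fits_below_4level_limit() on each candidate range and skip
the ones that fail, regardless of what paging_levels() reports for the
firmware's own page tables.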