From: "Wang, Jian J" <jian.j.wang@intel.com>
To: "Zeng, Star" <star.zeng@intel.com>, Laszlo Ersek <lersek@redhat.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>,
"edk2-devel@lists.01.org" <edk2-devel@lists.01.org>,
"Yao, Jiewen" <jiewen.yao@intel.com>,
"Dong, Eric" <eric.dong@intel.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: Re: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Date: Wed, 15 Nov 2017 07:36:35 +0000 [thread overview]
Message-ID: <D827630B58408649ACB04F44C510003624CAE82A@SHSMSX103.ccr.corp.intel.com> (raw)
In-Reply-To: <0C09AFA07DD0434D9E2A0C6AEB0483103B9B5921@shsmsx102.ccr.corp.intel.com>
Since OS never had chance to use the those capabilities before, I think it's feasible.
But it's just a workaround not solution because there's real gap between UEFI and
OS on how to interpret the memory capabilities.
> -----Original Message-----
> From: Zeng, Star
> Sent: Wednesday, November 15, 2017 2:53 PM
> To: Wang, Jian J <jian.j.wang@intel.com>; Laszlo Ersek <lersek@redhat.com>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>; edk2-devel@lists.01.org; Yao,
> Jiewen <jiewen.yao@intel.com>; Dong, Eric <eric.dong@intel.com>; Ard
> Biesheuvel <ard.biesheuvel@linaro.org>; Zeng, Star <star.zeng@intel.com>
> Subject: RE: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in
> memory map
>
> How about the code to filter out paging capabilities from UEFI memory map in
> Page.c CoreGetMemoryMap(), then with maximum compatibility, the UEFI
> memory map could be same with before
> 14dde9e903bb9a719ebb8f3381da72b19509bc36 [MdeModulePkg/Core: Fix out-
> of-sync issue in GCD].
>
> Thanks,
> Star
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Wang,
> Jian J
> Sent: Tuesday, November 14, 2017 10:36 PM
> To: Laszlo Ersek <lersek@redhat.com>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>; edk2-devel@lists.01.org; Yao,
> Jiewen <jiewen.yao@intel.com>; Dong, Eric <eric.dong@intel.com>; Ard
> Biesheuvel <ard.biesheuvel@linaro.org>
> Subject: Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of
> RT_CODE in memory map
>
> Hi Laszlo,
>
> I did some investigation works on this issue. I think I found the root cause and
> tend to believe this is a Fedora kernel issue. Here're proves:
>
> memmap output patterns in which Fedora 26 failed to boot:
> 1) BS_Data 00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000002600F
> RT_Data 00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000002600F
> RT_Code 00000000AE38F000-00000000AE48EFFF 0000000000000100
> 800000000002600F
> Reserved 00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000002600F
>
> 2) BS_Data 00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000000000F
> RT_Data 00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
> RT_Code 00000000AE38F000-00000000AE467FFF 00000000000000D9
> 800000000000000F
> RT_Code 00000000AE468000-00000000AE469FFF 0000000000000002
> 800000000002600F
> RT_Code 00000000AE46A000-00000000AE46EFFF 0000000000000005
> 800000000000000F
> RT_Code 00000000AE46F000-00000000AE470FFF 0000000000000002
> 800000000002600F
> RT_Code 00000000AE471000-00000000AE475FFF 0000000000000005
> 800000000000000F
> RT_Code 00000000AE476000-00000000AE479FFF 0000000000000004
> 800000000002600F
> RT_Code 00000000AE47A000-00000000AE47FFFF 0000000000000006
> 800000000000000F
> RT_Code 00000000AE480000-00000000AE481FFF 0000000000000002
> 800000000002600F
> RT_Code 00000000AE482000-00000000AE487FFF 0000000000000006
> 800000000000000F
> RT_Code 00000000AE488000-00000000AE489FFF 0000000000000002
> 800000000002600F
> RT_Code 00000000AE48A000-00000000AE48EFFF 0000000000000005
> 800000000000000F
> Reserved 00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
>
> 3) RT_Data 00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
> RT_Code 00000000AE38F000-00000000AE418FFF 000000000000008A
> 800000000000000F
> RT_Code 00000000AE419000-00000000AE48EFFF 0000000000000076
> 800000000002600F
> Reserved 00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
>
> 4) BS_Data 00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000002400F
> RT_Data 00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000002400F
> RT_Code 00000000AE38F000-00000000AE48EFFF 0000000000000100
> 800000000002400F
> Reserved 00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000002400F
>
>
> memmap output pattern in which Fedora 26 booted successfully
> 5) BS_Data 00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000000000F
> RT_Data 00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
> RT_Code 00000000AE38F000-00000000AE48EFFF 0000000000000100
> 800000000000000F
> Reserved 00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
>
> 6) BS_Data 00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000000000F
> RT_Data 00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
> RT_Code 00000000AE38F000-00000000AE418FFF 000000000000008A
> 800000000000000F
> RT_Code 00000000AE419000-00000000AE419FFF 0000000000000001
> 800000000000400F
> RT_Code 00000000AE41A000-00000000AE41BFFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE41C000-00000000AE420FFF 0000000000000005
> 800000000000400F
> RT_Code 00000000AE421000-00000000AE422FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE423000-00000000AE427FFF 0000000000000005
> 800000000000400F
> RT_Code 00000000AE428000-00000000AE42AFFF 0000000000000003
> 800000000002000F
> RT_Code 00000000AE42B000-00000000AE42FFFF 0000000000000005
> 800000000000400F
> RT_Code 00000000AE430000-00000000AE432FFF 0000000000000003
> 800000000002000F
> RT_Code 00000000AE433000-00000000AE439FFF 0000000000000007
> 800000000000400F
> RT_Code 00000000AE43A000-00000000AE43DFFF 0000000000000004
> 800000000002000F
> RT_Code 00000000AE43E000-00000000AE444FFF 0000000000000007
> 800000000000400F
> RT_Code 00000000AE445000-00000000AE448FFF 0000000000000004
> 800000000002000F
> RT_Code 00000000AE449000-00000000AE44EFFF 0000000000000006
> 800000000000400F
> RT_Code 00000000AE44F000-00000000AE450FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE451000-00000000AE455FFF 0000000000000005
> 800000000000400F
> RT_Code 00000000AE456000-00000000AE458FFF 0000000000000003
> 800000000002000F
> RT_Code 00000000AE459000-00000000AE45FFFF 0000000000000007
> 800000000000400F
> RT_Code 00000000AE460000-00000000AE461FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE462000-00000000AE467FFF 0000000000000006
> 800000000000400F
> RT_Code 00000000AE468000-00000000AE469FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE46A000-00000000AE46EFFF 0000000000000005
> 800000000000400F
> RT_Code 00000000AE46F000-00000000AE470FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE471000-00000000AE475FFF 0000000000000005
> 800000000000400F
> RT_Code 00000000AE476000-00000000AE479FFF 0000000000000004
> 800000000002000F
> RT_Code 00000000AE47A000-00000000AE47FFFF 0000000000000006
> 800000000000400F
> RT_Code 00000000AE480000-00000000AE481FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE482000-00000000AE487FFF 0000000000000006
> 800000000000400F
> RT_Code 00000000AE488000-00000000AE489FFF 0000000000000002
> 800000000002000F
> RT_Code 00000000AE48A000-00000000AE48EFFF 0000000000000005
> 800000000000400F
> Reserved 00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
>
> You may notice that 4) will fail but 6) will succeed. Taking into account the fact
> that failure always happened in the runtime service api (set_variable), I think it
> must be that the kernel will mark the memory block to be RO/XP/RP memory
> according to its capabilities but not its current attributes.
>
> This can explain why 4) will fail but 6) will succeed. Although memmap shows all
> runtime driver image memory as RT_Code, but they're not all code segment.
> Instead some of them are actually data segment.
>
> It's normal to mark code segment to be RO and data segment to be XP. But mark
> data segment to be RO will cause runtime services failure. 4) shows all RT_Code
> to be XXX24XXX, which means Fedora kernel will mark all code segment and data
> segment to be RO and XP mistakenly, based on my previous hypothesis. This can
> also explain why 1), 2), 3) will fail the boot because XXX26XXX will let Fedora
> kernel to mark RT_Code memory block to be not-present.
>
> I haven't got time to look into the Linux kernel source so I can't confirm above
> analysis yet.
> I think you're more familiar with kernel source than us. Maybe you could help to
> take a look.
>
> Thanks,
> Jian
>
> > -----Original Message-----
> > From: Laszlo Ersek [mailto:lersek@redhat.com]
> > Sent: Friday, November 10, 2017 8:24 PM
> > To: Wang, Jian J <jian.j.wang@intel.com>
> > Cc: edk2-devel@lists.01.org; Dong, Eric <eric.dong@intel.com>; Yao,
> > Jiewen <jiewen.yao@intel.com>; Ard Biesheuvel
> > <ard.biesheuvel@linaro.org>; Matt Fleming <matt@codeblueprint.co.uk>
> > Subject: Re: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of
> > RT_CODE in memory map
> >
> > Hi Jian,
> >
> > I'm CC'ing Ard and Matt, and commenting at the bottom.
> >
> > On 11/10/17 02:02, Jian J Wang wrote:
> > >> v5:
> > >> Coding style clean-up
> > >
> > >> v4:
> > >> a. Remove DoUpdate and check attributes mismatch all the time to avoid
> > >> a logic hole
> > >> b. Add warning message if failed to update capability c. Add local
> > >> variable to hold new attributes to make code cleaner
> > >
> > >> v3:
> > >> a. Add comment to explain more on updating memory capabilities b.
> > >> Fix logic hole in updating attributes c. Instead of checking
> > >> illegal memory space address and size, use return
> > >> status of gDS->SetMemorySpaceCapabilities() to skip memory block which
> > >> cannot be updated with new capabilities.
> > >
> > >> v2
> > >> a. Fix an issue which will cause setting capability failure if size is smaller
> > >> than a page.
> > >
> > > More than one entry of RT_CODE memory might cause boot problem for
> > some
> > > old OSs. This patch will fix this issue to keep OS compatibility as
> > > much as possible.
> > >
> > > More detailed information, please refer to
> > > https://bugzilla.tianocore.org/show_bug.cgi?id=753
> > >
> > > Cc: Eric Dong <eric.dong@intel.com>
> > > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > > Cc: Laszlo Ersek <lersek@redhat.com>
> > > Contributed-under: TianoCore Contribution Agreement 1.1
> > > Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> > > ---
> > > UefiCpuPkg/CpuDxe/CpuPageTable.c | 69
> > +++++++++++++++++++++++++++++-----------
> > > 1 file changed, 50 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > > index d312eb66f8..61537838b7 100644
> > > --- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > > +++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > > @@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
> > > UINT64 BaseAddress;
> > > UINT64 PageStartAddress;
> > > UINT64 Attributes;
> > > - UINT64 Capabilities;
> > > - BOOLEAN DoUpdate;
> > > + UINT64 NewAttributes;
> > > UINTN Index;
> > >
> > > //
> > > @@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
> > >
> > > GetCurrentPagingContext (&PagingContext);
> > >
> > > - DoUpdate = FALSE;
> > > - Capabilities = 0;
> > > Attributes = 0;
> > > + NewAttributes = 0;
> > > BaseAddress = 0;
> > > PageLength = 0;
> > >
> > > @@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
> > > continue;
> > > }
> > >
> > > + //
> > > + // Sync the actual paging related capabilities back to GCD service first.
> > > + // As a side effect (good one), this can also help to avoid unnecessary
> > > + // memory map entries due to the different capabilities of the same type
> > > + // memory, such as multiple RT_CODE and RT_DATA entries in
> > > + memory
> > map,
> > > + // which could cause boot failure of some old Linux distro (before v4.3).
> > > + //
> > > + Status = gDS->SetMemorySpaceCapabilities (
> > > + MemorySpaceMap[Index].BaseAddress,
> > > + MemorySpaceMap[Index].Length,
> > > + MemorySpaceMap[Index].Capabilities |
> > > + EFI_MEMORY_PAGETYPE_MASK
> > > + );
> > > + if (EFI_ERROR (Status)) {
> > > + //
> > > + // If we cannot udpate the capabilities, we cannot update its
> > > + // attributes either. So just simply skip current block of memory.
> > > + //
> > > + DEBUG ((
> > > + DEBUG_WARN,
> > > + "Failed to update capability: [%lu] %016lx - %016lx (%016lx
> > > + -
> > > %016lx)\r\n",
> > > + (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > > + MemorySpaceMap[Index].Capabilities,
> > > + MemorySpaceMap[Index].Capabilities |
> EFI_MEMORY_PAGETYPE_MASK
> > > + ));
> > > + continue;
> > > + }
> > > +
> > > if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength))
> {
> > > //
> > > // Current memory space starts at a new page. Resetting
> > > PageLength will @@ -826,7 +852,9 @@
> RefreshGcdMemoryAttributesFromPaging (
> > > PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
> > > }
> > >
> > > - // Sync real page attributes to GCD
> > > + //
> > > + // Sync actual page attributes to GCD
> > > + //
> > > BaseAddress = MemorySpaceMap[Index].BaseAddress;
> > > MemorySpaceLength = MemorySpaceMap[Index].Length;
> > > while (MemorySpaceLength > 0) { @@ -842,23 +870,26 @@
> > > RefreshGcdMemoryAttributesFromPaging (
> > > PageStartAddress = (*PageEntry) &
> > (UINT64)PageAttributeToMask(PageAttribute);
> > > PageLength = PageAttributeToLength (PageAttribute) -
> (BaseAddress -
> > PageStartAddress);
> > > Attributes = GetAttributesFromPageEntry (PageEntry);
> > > -
> > > - if (Attributes != (MemorySpaceMap[Index].Attributes &
> > EFI_MEMORY_PAGETYPE_MASK)) {
> > > - DoUpdate = TRUE;
> > > - Attributes |= (MemorySpaceMap[Index].Attributes &
> > ~EFI_MEMORY_PAGETYPE_MASK);
> > > - Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
> > > - } else {
> > > - DoUpdate = FALSE;
> > > - }
> > > }
> > >
> > > Length = MIN (PageLength, MemorySpaceLength);
> > > - if (DoUpdate) {
> > > - gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
> > > - gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
> > > - DEBUG ((DEBUG_INFO, "Update memory space attribute:
> [%02d] %016lx
> > - %016lx (%08lx -> %08lx)\r\n",
> > > - Index, BaseAddress, BaseAddress + Length - 1,
> > > - MemorySpaceMap[Index].Attributes, Attributes));
> > > + if (Attributes != (MemorySpaceMap[Index].Attributes &
> > > + EFI_MEMORY_PAGETYPE_MASK)) {
> > > + NewAttributes = (MemorySpaceMap[Index].Attributes &
> > > + ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
> > > + Status = gDS->SetMemorySpaceAttributes (
> > > + BaseAddress,
> > > + Length,
> > > + NewAttributes
> > > + );
> > > + ASSERT_EFI_ERROR (Status);
> > > + DEBUG ((
> > > + DEBUG_INFO,
> > > + "Updated memory space attribute: [%lu] %016lx - %016lx
> > > + (%016lx -
> > > %016lx)\r\n",
> > > + (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > > + MemorySpaceMap[Index].Attributes,
> > > + NewAttributes
> > > + ));
> > > }
> > >
> > > PageLength -= Length;
> > >
> >
> > So, I was ready to give my R-b for this patch, but then I also wanted
> > to test it. I applied the patch on current edk2 master (7e2a8dfe8a9a,
> > "ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI
> > core", 2017-10-20), and built OVMF like this:
> >
> > $ build \
> > -a IA32 \
> > -a X64 \
> > -p OvmfPkg/OvmfPkgIa32X64.dsc \
> > -t GCC48 \
> > -b NOOPT \
> > -D SMM_REQUIRE \
> > -D SECURE_BOOT_ENABLE \
> > -D E1000_ENABLE \
> > -D HTTP_BOOT_ENABLE
> >
> > For testing I used a recent-ish upstream QEMU development build
> > (ae49fbbcd8e4, "Merge remote-tracking branch
> > 'remotes/rth/tags/pull-tcg-20171025' into staging", 2017-10-25), with
> > the Q35 machine type (which is required by SMM anyway).
> >
> > The results vary across guest OSes:
> >
> > (1) Up-to-date Fedora 26 guest crashes during boot, with the following
> > call stack:
> >
> > BUG: unable to handle kernel paging request at fffffffefe893018 Call
> > Trace:
> > ? __change_page_attr_set_clr+0xaa6/0xd70
> > ? kernel_map_pages_in_pgd+0xbc/0xd0
> > ? efi_call+0x58/0x90
> > ? virt_efi_set_variable.part.7+0x66/0x120
> > ? virt_efi_set_variable+0x4f/0x60
> > ? efi_delete_dummy_variable+0x62/0x90
> > ? efi_enter_virtual_mode+0x4d4/0x4e8
> > ? efi_enter_virtual_mode+0x4d4/0x4e8
> > ? start_kernel+0x442/0x4e6
> > ? early_idt_handler_array+0x120/0x120
> > ? x86_64_start_reservations+0x24/0x26
> > ? x86_64_start_kernel+0x13e/0x161
> > ? secondary_startup_64+0x9f/0x9f
> >
> > (2) The following Windows OSes all boot successfully:
> >
> > - Windows 7
> > - Windows Server 2008 R2
> > - Windows 8.1
> > - Windows Server 2012 R2
> > - Windows 10
> >
> > (3) Windows Server 2016 crashes with a BSOD; reporting "ATTEMPTED
> > WRITE TO READONLY MEMORY".
> >
> > (Without the patch, all OSes boot OK.)
> >
> >
> > I'm attaching a ZIP file with the following contents (note that I'll
> > attach the same file to TianoCore BZ#753 as well, because the mailing
> > list archive(s) don't seem to preserve attachments):
> >
> > - "ovmf.pre.txt", "shell.memmap.pre.txt", "kernel.pre.txt": OVMF log,
> > MEMMAP command output in the UEFI shell, and Fedora 26 kernel boot log
> > (successful) *before* applying your patch. The kernel log is detailed
> > (the cmdline had "ignore_loglevel" and "efi=debug").
> >
> > - "ovmf.post.txt", "shell.memmap.post.txt", "kernel.post.txt": same
> > files as above, but saved *after* applying your patch. This is when
> > the
> > F26 kernel crashes.
> >
> > - "win2016.post.png": screenshot of the Windows Server 2016 boot
> > failure (after the patch was applied).
> >
> > Thanks,
> > Laszlo
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
next prev parent reply other threads:[~2017-11-16 13:53 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-10 1:02 [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map Jian J Wang
2017-11-10 12:23 ` Laszlo Ersek
2017-11-13 3:29 ` Wang, Jian J
2017-11-14 14:36 ` Wang, Jian J
2017-11-15 6:52 ` Zeng, Star
2017-11-15 7:36 ` Wang, Jian J [this message]
2017-11-15 9:27 ` Wang, Jian J
2017-11-15 15:48 ` Laszlo Ersek
2017-11-15 15:59 ` Ard Biesheuvel
2017-11-16 2:46 ` Zeng, Star
2017-11-16 3:03 ` Yao, Jiewen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=D827630B58408649ACB04F44C510003624CAE82A@SHSMSX103.ccr.corp.intel.com \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox