* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. [not found] ` <11181.1586825080096843656@groups.io> @ 2020-04-14 14:26 ` Laszlo Ersek [not found] ` <5211.1586899245384995995@groups.io> 0 siblings, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2020-04-14 14:26 UTC (permalink / raw) To: vlrzprgts; +Cc: discuss, edk2-devel-groups-io On 04/14/20 02:44, valerij zaporogeci wrote: > 1. what this pointer (OS Loader > ImageHandle)->(LOADED_IMAGE_PROTOCOL)->LoadOptions points to? According to the UEFI spec <https://uefi.org/specifications>, section "9.1 EFI Loaded Image Protocol": LoadOptionsSize The size in bytes of LoadOptions. LoadOptions A pointer to the image's binary load options. [...] Each loaded image has an image handle that supports EFI_LOADED_IMAGE_PROTOCOL. When an image is started, it is passed the image handle for itself. The image can use the handle to obtain its relevant image data stored in the EFI_LOADED_IMAGE_PROTOCOL structure, such as its load options. And from section "7.4 Image Services", near the LoadImage() boot service: Once the image is loaded, firmware creates and returns an EFI_HANDLE that identifies the image and supports EFI_LOADED_IMAGE_PROTOCOL and the EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL. The caller may fill in the image's "load options" data, or add additional protocol support to the handle. So an agent calls the LoadImage() boot service to load the image. Then it looks up the EFI_LOADED_IMAGE_PROTOCOL instance on the handle that was output by LoadImage(). The agent populates LoadOptionsSize and LoadOptions in said EFI_LOADED_IMAGE_PROTOCOL instance, and then calls StartImage(). The started image can then consume LoadOptionsSize and LoadOptions. I think you mistook these fields for describing an EFI_LOAD_OPTION structure (from section "3.1.3 Load Options"). EFI_LOAD_OPTION is a different thing -- the EFI_LOADED_IMAGE_PROTOCOL.LoadOptions field points to a binary blob whose interpretation is specific to the image being started. Its internals are not regulated by the spec. More below: > > if it points to the Load Option Descriptor, then on OVMF it doesn't, > since it points to only OptionalData[]. > if it was meant to point to OptionalData[], then: > > 1. how does an OS Loader get _its_ Load Option FilePathList[]? ALL > elements. > and Again, EFI_LOADED_IMAGE_PROTOCOL.LoadOptions does not point to an EFI_LOAD_OPTION structure. Still, your question #1 makes sense; indeed an OS boot loader (or another UEFI app) is "entitled" to see the UEFI device path from which it was loaded from. The passage I quoted above already mentions EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL. It is specified in detail in section "9.2 EFI Loaded Image Device Path Protocol". Excerpt: The Loaded Image Device Path Protocol must be installed onto the image handle of a PE/COFF image loaded through the EFI Boot Service LoadImage(). A copy of the device path specified by the DevicePath parameter to the EFI Boot Service LoadImage() is made before it is installed onto the image handle. It is legal to call LoadImage() for a buffer in memory with a NULL DevicePath parameter. In this case, the Loaded Image Device Path Protocol is installed with a NULL interface pointer. Summary: - Passing arguments to a UEFI image: locate EFI_LOADED_IMAGE_PROTOCOL on the image handle, and populate the LoadOptions / LoadOptionsSize members. The "arguments" are a binary blob, only defined by the receiving application. - Finding where a UEFI image was loaded from: locate EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL on the image handle. - Regarding "EFI_LOAD_OPTION.OptionalData": in case you boot a Boot#### option, it is "EFI_LOAD_OPTION.OptionalData" that will be passed to the image in "EFI_LOADED_IMAGE_PROTOCOL.LoadOptions". More below: > 2. should FilePathList[] be a multiinstance Device Path (ending with > type 7F, subtype 1, except the last), or every element should end with > {7F, FF}? I cannot answer this. The spec writes, about "FilePathList": A packed array of UEFI device paths. The first element of the array is a device path that describes the device and location of the Image for this load option. The FilePathList[0] is specific to the device type. Other device paths may optionally exist in the FilePathList, but their usage is OSV specific. Each element in the array is variable length, and ends at the device path end structure. Because the size of Description is arbitrary, this data structure is not guaranteed to be aligned on a natural boundary. This data structure may have to be copied to an aligned natural boundary before it is used. I'm not sure if the "non-first" elements in this array are supposed to be 2nd and later devpath instances in a single multi-instance device path, or if they are entirely separate (themselves single or multi-instance) device paths. Either way, I don't think those "elements" are passed to the loaded image in any direct way. You could perhaps get back at the EFI_LOAD_OPTION once the loaded image is running with the help of the "BootCurrent" standard UEFI variable. Thanks Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <5211.1586899245384995995@groups.io>]
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. [not found] ` <5211.1586899245384995995@groups.io> @ 2020-04-15 15:05 ` Laszlo Ersek 2020-04-16 4:38 ` Hou Qiming 2020-04-20 14:13 ` Gerd Hoffmann 0 siblings, 2 replies; 12+ messages in thread From: Laszlo Ersek @ 2020-04-15 15:05 UTC (permalink / raw) To: valerij zaporogeci, Hou Qiming Cc: discuss, Gerd Hoffmann, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list (CC Gerd, Qiming, Marcel, qemu-devel for ramfb:) On 04/14/20 23:20, valerij zaporogeci wrote: [snip] > There is a Boot Manager UI display problem, I don't know if this is > qemu problem, but with the ARM (both 32 and 64 bits, the qemu version > is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot > Manager has troubles with drawing - it's interfase looks entirely > broken, like this (I'll try to attach the screenshot). UEFI shell > doesn't have this problem. switching to "serial" (which is -serial vc) > doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI > gets smeared. But it takes input and presumable works properly, except > displaying things. qemu writes these messages in the command prompt: > ramfb_fw_cfg_write: 640x480 @ 0x4bd00000 > ramfb_fw_cfg_write: resolution locked, change rejected > ramfb_fw_cfg_write: 800x600 @ 0x4bd00000 > ramfb_fw_cfg_write: resolution locked, change rejected Gerd contributed the OVMF QemuRamfbDxe driver in edk2 commit 1d25ff51af5c ("OvmfPkg: add QemuRamfbDxe", 2018-06-14). Note the date: June 2018. The then-latest (released) QEMU version was v2.12.0, and v2.12.1 / v3.0.0 were in the making. At that time, the resolution change definitely worked -- note my "Tested-by" on the edk2 commit message. Running "git blame" on the QEMU source, I now find commit a9e0cb67b7f4 ("hw/display/ramfb: lock guest resolution after it's set", 2019-05-24). Again, note the date: May 2019 (and this commit was released with QEMU v4.1.0). So I would say that the symptom you see is a QEMU v4.1.0 regression. The QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver certainly needs the QemuFwCfgWriteBytes() call to work, for changing the resolution. Now, I'm not familiar with the reasons behind QEMU commit a9e0cb67b7f4. It says it intends to "prevent[] a crash when the guest writes garbage to the configuration space (e.g. when rebooting)". But I don't understand why locking the resolution was necessary for preventing "a crash": (1) Registering a device reset handler in QEMU seems sufficient, so that QEMU forget about the currently shared RAMFB area at platform reset. (2) The crash in question is apparently not a *QEMU* crash -- which might otherwise justify a heavy-handed approach. Instead, it is a *guest* crash. See the references below: (2a) http://mid.mail-archive.com/CABSdmrmU7FK90Bupq_ySowcc9Uk=8nQxNLHgzvDsNYdp_QLogA@mail.gmail.com https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02299.html (2b) https://github.com/intel/gvt-linux/issues/23#issuecomment-483651476 Therefore, I don't think that locking the resolution was justified! Importantly: - The QemuRamfbDxe driver allocates the framebuffer in *reserved* memory, therefore any well-behaving OS will *never* touch the framebuffer. - The QemuRamfbDxe driver allocates the framebuffer memory only once, namely for such a resolution that needs the largest amount of framebuffer memory. Therefore, framebuffer re-allocations in the guest driver -- and thereby guest RAM *re-mapping* in QEMU -- are *not* necessary, upon resolution change. The ramfb device reset handler in QEMU is justified (for unmapping / forgetting the previously shared RAMFB area). The resolution locking is *NOT* justified, and it breaks the OVMF driver. I suggest backing out the resolution locking from QEMU. Reference (2a) above indicates 'It could be a misguided attempt to "resize ramfb" by the guest Intel driver'. If that is the case, then please fix the Intel guest driver, without regressing the QEMU device model. I'm sad that the QEMU device model change was not regression-tested against the *upstream* OVMF driver (which, by then, had been upstream for almost a year). Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-15 15:05 ` Laszlo Ersek @ 2020-04-16 4:38 ` Hou Qiming 2020-04-16 14:12 ` Laszlo Ersek 2020-04-20 14:19 ` Gerd Hoffmann 2020-04-20 14:13 ` Gerd Hoffmann 1 sibling, 2 replies; 12+ messages in thread From: Hou Qiming @ 2020-04-16 4:38 UTC (permalink / raw) To: Laszlo Ersek Cc: valerij zaporogeci, discuss, Gerd Hoffmann, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list [-- Attachment #1: Type: text/plain, Size: 5377 bytes --] Very good point, I did neglect ramfb resolution changes... But there is one important thing: it *can* cause a QEMU crash, a potentially exploitable one, not always a guest crash. That's what motivated my heavy-handed approach since allowing resolution change would have necessitated a good deal of security checks. It has crashed my host *kernel* quite a few times. The point is, while the QemuRamfbDxe driver may behave properly, nothing prevents the guest from writing garbage or *malicious* values to the ramfb config space. Then the values are sent to the display component without any sanity check. For some GUI frontends, this means allocating an OpenGL texture with guest-supplied dimensions and uploading guest memory content to it, which means that guest memory content goes straight into a *kernel driver*, *completely unchecked*. Some integer overflow and a lenient GPU driver later, and the guest escapes straight to kernel. The proper way to enable ramfb resolution change again is adding sanity checks for ramfb resolution / pointer / etc. on the QEMU side. We have to make sure it doesn't exceed what the host GPU driver supports. Maybe clamp both width and height to between 1 and 2048? We also need to validate that OpenGL texture dimension update succeeds. Note that OpenGL is not obliged to validate anything and everything has to be checked on the QEMU side. Qiming On Wed, Apr 15, 2020 at 11:05 PM Laszlo Ersek <lersek@redhat.com> wrote: > (CC Gerd, Qiming, Marcel, qemu-devel for ramfb:) > > On 04/14/20 23:20, valerij zaporogeci wrote: > > [snip] > > > There is a Boot Manager UI display problem, I don't know if this is > > qemu problem, but with the ARM (both 32 and 64 bits, the qemu version > > is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot > > Manager has troubles with drawing - it's interfase looks entirely > > broken, like this (I'll try to attach the screenshot). UEFI shell > > doesn't have this problem. switching to "serial" (which is -serial vc) > > doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI > > gets smeared. But it takes input and presumable works properly, except > > displaying things. qemu writes these messages in the command prompt: > > ramfb_fw_cfg_write: 640x480 @ 0x4bd00000 > > ramfb_fw_cfg_write: resolution locked, change rejected > > ramfb_fw_cfg_write: 800x600 @ 0x4bd00000 > > ramfb_fw_cfg_write: resolution locked, change rejected > > Gerd contributed the OVMF QemuRamfbDxe driver in edk2 commit > 1d25ff51af5c ("OvmfPkg: add QemuRamfbDxe", 2018-06-14). Note the date: > June 2018. > > The then-latest (released) QEMU version was v2.12.0, and v2.12.1 / > v3.0.0 were in the making. > > At that time, the resolution change definitely worked -- note my > "Tested-by" on the edk2 commit message. > > > Running "git blame" on the QEMU source, I now find commit a9e0cb67b7f4 > ("hw/display/ramfb: lock guest resolution after it's set", 2019-05-24). > > Again, note the date: May 2019 (and this commit was released with QEMU > v4.1.0). > > So I would say that the symptom you see is a QEMU v4.1.0 regression. The > QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver > certainly needs the QemuFwCfgWriteBytes() call to work, for changing the > resolution. > > > Now, I'm not familiar with the reasons behind QEMU commit a9e0cb67b7f4. > It says it intends to "prevent[] a crash when the guest writes garbage > to the configuration space (e.g. when rebooting)". > > But I don't understand why locking the resolution was necessary for > preventing "a crash": > > (1) Registering a device reset handler in QEMU seems sufficient, so that > QEMU forget about the currently shared RAMFB area at platform reset. > > (2) The crash in question is apparently not a *QEMU* crash -- which > might otherwise justify a heavy-handed approach. Instead, it is a > *guest* crash. See the references below: > > (2a) > http://mid.mail-archive.com/CABSdmrmU7FK90Bupq_ySowcc9Uk=8nQxNLHgzvDsNYdp_QLogA@mail.gmail.com > https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02299.html > > (2b) https://github.com/intel/gvt-linux/issues/23#issuecomment-483651476 > > Therefore, I don't think that locking the resolution was justified! > > Importantly: > > - The QemuRamfbDxe driver allocates the framebuffer in *reserved* > memory, therefore any well-behaving OS will *never* touch the > framebuffer. > > - The QemuRamfbDxe driver allocates the framebuffer memory only once, > namely for such a resolution that needs the largest amount of > framebuffer memory. Therefore, framebuffer re-allocations in the guest > driver -- and thereby guest RAM *re-mapping* in QEMU -- are *not* > necessary, upon resolution change. > > The ramfb device reset handler in QEMU is justified (for unmapping / > forgetting the previously shared RAMFB area). > > The resolution locking is *NOT* justified, and it breaks the OVMF > driver. I suggest backing out the resolution locking from QEMU. > > Reference (2a) above indicates 'It could be a misguided attempt to > "resize ramfb" by the guest Intel driver'. If that is the case, then > please fix the Intel guest driver, without regressing the QEMU device > model. > > I'm sad that the QEMU device model change was not regression-tested > against the *upstream* OVMF driver (which, by then, had been upstream > for almost a year). > > Laszlo > > [-- Attachment #2: Type: text/html, Size: 6633 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-16 4:38 ` Hou Qiming @ 2020-04-16 14:12 ` Laszlo Ersek 2020-04-17 3:22 ` Hou Qiming 2020-04-20 14:19 ` Gerd Hoffmann 1 sibling, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2020-04-16 14:12 UTC (permalink / raw) To: Hou Qiming Cc: valerij zaporogeci, discuss, Gerd Hoffmann, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list On 04/16/20 06:38, Hou Qiming wrote: > Very good point, I did neglect ramfb resolution changes... But there is one > important thing: it *can* cause a QEMU crash, a potentially exploitable > one, not always a guest crash. That's what motivated my heavy-handed > approach since allowing resolution change would have necessitated a good > deal of security checks. It has crashed my host *kernel* quite a few times. > > The point is, while the QemuRamfbDxe driver may behave properly, nothing > prevents the guest from writing garbage or *malicious* values to the ramfb > config space. Then the values are sent to the display component without any > sanity check. For some GUI frontends, this means allocating an OpenGL > texture with guest-supplied dimensions and uploading guest memory content > to it, which means that guest memory content goes straight into a *kernel > driver*, *completely unchecked*. Some integer overflow and a lenient GPU > driver later, and the guest escapes straight to kernel. > > The proper way to enable ramfb resolution change again is adding sanity > checks for ramfb resolution / pointer / etc. on the QEMU side. We have to > make sure it doesn't exceed what the host GPU driver supports. Maybe clamp > both width and height to between 1 and 2048? We also need to validate that > OpenGL texture dimension update succeeds. Note that OpenGL is not obliged > to validate anything and everything has to be checked on the QEMU side. I agree that QEMU should sanity check the resolution requested by the guest. I also agree that "arbitrary" limits are acceptable, for preventing integer overflows and -- hopefully -- memory allocation failures too. But I don't see the host kernel / OpenGL / physical GPU angle, at least not directly. That angle seems to be specific to your particular use case (particular choice of display backend). For example, if you nest QEMU/TCG in QEMU/TCG, with no KVM and no device assignment in the picture anywhere, and OVMF drives ramfb in L2, and the display *backend* (such as GTK or SDL GUI window) for the QEMU process running in L1 sits on top of a virtual device (such as bochs-display) provided by QEMU running in L0, then the ramfb stuff (including the resolution changes and the range checks) should work just the same, between L2 and L1. I kinda feel like ramfb has been hijacked for providing a boot time display crutch for kvmgt. (I might not be using the correct terminology here; sorry about that). That's *not* what ramfb was originally intended for, as far as I recall. Compare: - 59926de9987c ("Merge remote-tracking branch 'remotes/kraxel/tags/vga-20180618-pull-request' into staging", 2018-06-19) - dddb37495b84 ("Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20181015.0' into staging", 2018-10-15) IIRC, Gerd originally invented ramfb for giving AARCH64 Windows the linear framebuffer that the latter so badly wants, in particular so that the framebuffer exist in guest RAM (not in guest MMIO), in order to avoid the annoying S1/S2 caching behavior of AARCH64/KVM when the guest maps an area as MMIO that is mapped as RAM on the host [1]. See: - https://bugzilla.tianocore.org/show_bug.cgi?id=785#c4 - https://bugzilla.tianocore.org/show_bug.cgi?id=785#c7 - https://bugzilla.tianocore.org/show_bug.cgi?id=785#c8 and the further references given in those bugzilla comments. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1679680#c0 Component reuse is obviously *hugely* important, and it would be silly for me to argue against reusing ramfb wherever it applies. Just please don't break the original use case. Should I file a bug report in LaunchPad, or is this thread enough for tracking the QEMU regression? Thanks Laszlo > > Qiming > > > On Wed, Apr 15, 2020 at 11:05 PM Laszlo Ersek <lersek@redhat.com> wrote: > >> (CC Gerd, Qiming, Marcel, qemu-devel for ramfb:) >> >> On 04/14/20 23:20, valerij zaporogeci wrote: >> >> [snip] >> >>> There is a Boot Manager UI display problem, I don't know if this is >>> qemu problem, but with the ARM (both 32 and 64 bits, the qemu version >>> is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot >>> Manager has troubles with drawing - it's interfase looks entirely >>> broken, like this (I'll try to attach the screenshot). UEFI shell >>> doesn't have this problem. switching to "serial" (which is -serial vc) >>> doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI >>> gets smeared. But it takes input and presumable works properly, except >>> displaying things. qemu writes these messages in the command prompt: >>> ramfb_fw_cfg_write: 640x480 @ 0x4bd00000 >>> ramfb_fw_cfg_write: resolution locked, change rejected >>> ramfb_fw_cfg_write: 800x600 @ 0x4bd00000 >>> ramfb_fw_cfg_write: resolution locked, change rejected >> >> Gerd contributed the OVMF QemuRamfbDxe driver in edk2 commit >> 1d25ff51af5c ("OvmfPkg: add QemuRamfbDxe", 2018-06-14). Note the date: >> June 2018. >> >> The then-latest (released) QEMU version was v2.12.0, and v2.12.1 / >> v3.0.0 were in the making. >> >> At that time, the resolution change definitely worked -- note my >> "Tested-by" on the edk2 commit message. >> >> >> Running "git blame" on the QEMU source, I now find commit a9e0cb67b7f4 >> ("hw/display/ramfb: lock guest resolution after it's set", 2019-05-24). >> >> Again, note the date: May 2019 (and this commit was released with QEMU >> v4.1.0). >> >> So I would say that the symptom you see is a QEMU v4.1.0 regression. The >> QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver >> certainly needs the QemuFwCfgWriteBytes() call to work, for changing the >> resolution. >> >> >> Now, I'm not familiar with the reasons behind QEMU commit a9e0cb67b7f4. >> It says it intends to "prevent[] a crash when the guest writes garbage >> to the configuration space (e.g. when rebooting)". >> >> But I don't understand why locking the resolution was necessary for >> preventing "a crash": >> >> (1) Registering a device reset handler in QEMU seems sufficient, so that >> QEMU forget about the currently shared RAMFB area at platform reset. >> >> (2) The crash in question is apparently not a *QEMU* crash -- which >> might otherwise justify a heavy-handed approach. Instead, it is a >> *guest* crash. See the references below: >> >> (2a) >> http://mid.mail-archive.com/CABSdmrmU7FK90Bupq_ySowcc9Uk=8nQxNLHgzvDsNYdp_QLogA@mail.gmail.com >> https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02299.html >> >> (2b) https://github.com/intel/gvt-linux/issues/23#issuecomment-483651476 >> >> Therefore, I don't think that locking the resolution was justified! >> >> Importantly: >> >> - The QemuRamfbDxe driver allocates the framebuffer in *reserved* >> memory, therefore any well-behaving OS will *never* touch the >> framebuffer. >> >> - The QemuRamfbDxe driver allocates the framebuffer memory only once, >> namely for such a resolution that needs the largest amount of >> framebuffer memory. Therefore, framebuffer re-allocations in the guest >> driver -- and thereby guest RAM *re-mapping* in QEMU -- are *not* >> necessary, upon resolution change. >> >> The ramfb device reset handler in QEMU is justified (for unmapping / >> forgetting the previously shared RAMFB area). >> >> The resolution locking is *NOT* justified, and it breaks the OVMF >> driver. I suggest backing out the resolution locking from QEMU. >> >> Reference (2a) above indicates 'It could be a misguided attempt to >> "resize ramfb" by the guest Intel driver'. If that is the case, then >> please fix the Intel guest driver, without regressing the QEMU device >> model. >> >> I'm sad that the QEMU device model change was not regression-tested >> against the *upstream* OVMF driver (which, by then, had been upstream >> for almost a year). >> >> Laszlo >> >> > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-16 14:12 ` Laszlo Ersek @ 2020-04-17 3:22 ` Hou Qiming 2020-04-20 9:32 ` Laszlo Ersek 0 siblings, 1 reply; 12+ messages in thread From: Hou Qiming @ 2020-04-17 3:22 UTC (permalink / raw) To: Laszlo Ersek Cc: valerij zaporogeci, discuss, Gerd Hoffmann, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list [-- Attachment #1: Type: text/plain, Size: 9541 bytes --] I'm glad we can reach a consensus that ramfb needs sanity checks. And well, I'm probably at fault with the hijacking. Your QEMU/TCG in QEMU/TCG example also made me realize a deeper problem, though: your setting still can't escape the host display / physical GPU issue. The middle display layers be bochs or whatever, but as long as the framebuffer content and resolution values are propagated, and the end result is displayed at all on the host, the host GPU attack surface remains exposed to the L2 guest, and checks are needed. Everything shown on the screen involves the display driver - GPU stack, GTK or SDL or tty, you can't avoid that. ramfb-kvmgt just happened to be the shortest pipeline where every stage neglected the checks, which exposed this problem. Blaming this on ramfb is unfair since in your scenario the checks are better done in the display subsystems. TL;DR You made me realize right now, it's a very real risk that an AARCH64 Windows guest could exploit a x64 host's display driver by specifying a crafted framebuffer with overflowing resolution. I don't want to break it, but I'd prefer a broken state over an insecure state. I'm not quite sure what this thread is. But I think with the scope this discussion is going, maybe it's more of a bug than a regression. On Thu, Apr 16, 2020 at 10:12 PM Laszlo Ersek <lersek@redhat.com> wrote: > On 04/16/20 06:38, Hou Qiming wrote: > > Very good point, I did neglect ramfb resolution changes... But there is > one > > important thing: it *can* cause a QEMU crash, a potentially exploitable > > one, not always a guest crash. That's what motivated my heavy-handed > > approach since allowing resolution change would have necessitated a good > > deal of security checks. It has crashed my host *kernel* quite a few > times. > > > > The point is, while the QemuRamfbDxe driver may behave properly, nothing > > prevents the guest from writing garbage or *malicious* values to the > ramfb > > config space. Then the values are sent to the display component without > any > > sanity check. For some GUI frontends, this means allocating an OpenGL > > texture with guest-supplied dimensions and uploading guest memory content > > to it, which means that guest memory content goes straight into a *kernel > > driver*, *completely unchecked*. Some integer overflow and a lenient GPU > > driver later, and the guest escapes straight to kernel. > > > > The proper way to enable ramfb resolution change again is adding sanity > > checks for ramfb resolution / pointer / etc. on the QEMU side. We have to > > make sure it doesn't exceed what the host GPU driver supports. Maybe > clamp > > both width and height to between 1 and 2048? We also need to validate > that > > OpenGL texture dimension update succeeds. Note that OpenGL is not obliged > > to validate anything and everything has to be checked on the QEMU side. > > I agree that QEMU should sanity check the resolution requested by the > guest. I also agree that "arbitrary" limits are acceptable, for > preventing integer overflows and -- hopefully -- memory allocation > failures too. > > But I don't see the host kernel / OpenGL / physical GPU angle, at least > not directly. That angle seems to be specific to your particular use > case (particular choice of display backend). > > For example, if you nest QEMU/TCG in QEMU/TCG, with no KVM and no device > assignment in the picture anywhere, and OVMF drives ramfb in L2, and the > display *backend* (such as GTK or SDL GUI window) for the QEMU process > running in L1 sits on top of a virtual device (such as bochs-display) > provided by QEMU running in L0, then the ramfb stuff (including the > resolution changes and the range checks) should work just the same, > between L2 and L1. > > I kinda feel like ramfb has been hijacked for providing a boot time > display crutch for kvmgt. (I might not be using the correct terminology > here; sorry about that). That's *not* what ramfb was originally intended > for, as far as I recall. Compare: > > - 59926de9987c ("Merge remote-tracking branch > 'remotes/kraxel/tags/vga-20180618-pull-request' into staging", 2018-06-19) > > - dddb37495b84 ("Merge remote-tracking branch > 'remotes/awilliam/tags/vfio-updates-20181015.0' into staging", 2018-10-15) > > IIRC, Gerd originally invented ramfb for giving AARCH64 Windows the > linear framebuffer that the latter so badly wants, in particular so that > the framebuffer exist in guest RAM (not in guest MMIO), in order to > avoid the annoying S1/S2 caching behavior of AARCH64/KVM when the guest > maps an area as MMIO that is mapped as RAM on the host [1]. See: > > - https://bugzilla.tianocore.org/show_bug.cgi?id=785#c4 > - https://bugzilla.tianocore.org/show_bug.cgi?id=785#c7 > - https://bugzilla.tianocore.org/show_bug.cgi?id=785#c8 > > and the further references given in those bugzilla comments. > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1679680#c0 > > Component reuse is obviously *hugely* important, and it would be silly > for me to argue against reusing ramfb wherever it applies. Just please > don't break the original use case. > > Should I file a bug report in LaunchPad, or is this thread enough for > tracking the QEMU regression? > > Thanks > Laszlo > > > > > Qiming > > > > > > On Wed, Apr 15, 2020 at 11:05 PM Laszlo Ersek <lersek@redhat.com> wrote: > > > >> (CC Gerd, Qiming, Marcel, qemu-devel for ramfb:) > >> > >> On 04/14/20 23:20, valerij zaporogeci wrote: > >> > >> [snip] > >> > >>> There is a Boot Manager UI display problem, I don't know if this is > >>> qemu problem, but with the ARM (both 32 and 64 bits, the qemu version > >>> is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot > >>> Manager has troubles with drawing - it's interfase looks entirely > >>> broken, like this (I'll try to attach the screenshot). UEFI shell > >>> doesn't have this problem. switching to "serial" (which is -serial vc) > >>> doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI > >>> gets smeared. But it takes input and presumable works properly, except > >>> displaying things. qemu writes these messages in the command prompt: > >>> ramfb_fw_cfg_write: 640x480 @ 0x4bd00000 > >>> ramfb_fw_cfg_write: resolution locked, change rejected > >>> ramfb_fw_cfg_write: 800x600 @ 0x4bd00000 > >>> ramfb_fw_cfg_write: resolution locked, change rejected > >> > >> Gerd contributed the OVMF QemuRamfbDxe driver in edk2 commit > >> 1d25ff51af5c ("OvmfPkg: add QemuRamfbDxe", 2018-06-14). Note the date: > >> June 2018. > >> > >> The then-latest (released) QEMU version was v2.12.0, and v2.12.1 / > >> v3.0.0 were in the making. > >> > >> At that time, the resolution change definitely worked -- note my > >> "Tested-by" on the edk2 commit message. > >> > >> > >> Running "git blame" on the QEMU source, I now find commit a9e0cb67b7f4 > >> ("hw/display/ramfb: lock guest resolution after it's set", 2019-05-24). > >> > >> Again, note the date: May 2019 (and this commit was released with QEMU > >> v4.1.0). > >> > >> So I would say that the symptom you see is a QEMU v4.1.0 regression. The > >> QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver > >> certainly needs the QemuFwCfgWriteBytes() call to work, for changing the > >> resolution. > >> > >> > >> Now, I'm not familiar with the reasons behind QEMU commit a9e0cb67b7f4. > >> It says it intends to "prevent[] a crash when the guest writes garbage > >> to the configuration space (e.g. when rebooting)". > >> > >> But I don't understand why locking the resolution was necessary for > >> preventing "a crash": > >> > >> (1) Registering a device reset handler in QEMU seems sufficient, so that > >> QEMU forget about the currently shared RAMFB area at platform reset. > >> > >> (2) The crash in question is apparently not a *QEMU* crash -- which > >> might otherwise justify a heavy-handed approach. Instead, it is a > >> *guest* crash. See the references below: > >> > >> (2a) > >> > http://mid.mail-archive.com/CABSdmrmU7FK90Bupq_ySowcc9Uk=8nQxNLHgzvDsNYdp_QLogA@mail.gmail.com > >> > https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02299.html > >> > >> (2b) > https://github.com/intel/gvt-linux/issues/23#issuecomment-483651476 > >> > >> Therefore, I don't think that locking the resolution was justified! > >> > >> Importantly: > >> > >> - The QemuRamfbDxe driver allocates the framebuffer in *reserved* > >> memory, therefore any well-behaving OS will *never* touch the > >> framebuffer. > >> > >> - The QemuRamfbDxe driver allocates the framebuffer memory only once, > >> namely for such a resolution that needs the largest amount of > >> framebuffer memory. Therefore, framebuffer re-allocations in the guest > >> driver -- and thereby guest RAM *re-mapping* in QEMU -- are *not* > >> necessary, upon resolution change. > >> > >> The ramfb device reset handler in QEMU is justified (for unmapping / > >> forgetting the previously shared RAMFB area). > >> > >> The resolution locking is *NOT* justified, and it breaks the OVMF > >> driver. I suggest backing out the resolution locking from QEMU. > >> > >> Reference (2a) above indicates 'It could be a misguided attempt to > >> "resize ramfb" by the guest Intel driver'. If that is the case, then > >> please fix the Intel guest driver, without regressing the QEMU device > >> model. > >> > >> I'm sad that the QEMU device model change was not regression-tested > >> against the *upstream* OVMF driver (which, by then, had been upstream > >> for almost a year). > >> > >> Laszlo > >> > >> > > > > [-- Attachment #2: Type: text/html, Size: 12265 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-17 3:22 ` Hou Qiming @ 2020-04-20 9:32 ` Laszlo Ersek 0 siblings, 0 replies; 12+ messages in thread From: Laszlo Ersek @ 2020-04-20 9:32 UTC (permalink / raw) To: Hou Qiming Cc: valerij zaporogeci, discuss, Gerd Hoffmann, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list On 04/17/20 05:22, Hou Qiming wrote: > I'm glad we can reach a consensus that ramfb needs sanity checks. And well, > I'm probably at fault with the hijacking. > > Your QEMU/TCG in QEMU/TCG example also made me realize a deeper problem, > though: your setting still can't escape the host display / physical GPU > issue. The middle display layers be bochs or whatever, but as long as the > framebuffer content and resolution values are propagated, and the end > result is displayed at all on the host, the host GPU attack surface remains > exposed to the L2 guest, and checks are needed. Everything shown on the > screen involves the display driver - GPU stack, GTK or SDL or tty, you > can't avoid that. ramfb-kvmgt just happened to be the shortest pipeline > where every stage neglected the checks, which exposed this problem. Good point. > Blaming > this on ramfb is unfair since in your scenario the checks are better done > in the display subsystems. > > TL;DR You made me realize right now, it's a very real risk that an AARCH64 > Windows guest could exploit a x64 host's display driver by specifying a > crafted framebuffer with overflowing resolution. I don't want to break it, > but I'd prefer a broken state over an insecure state. > > I'm not quite sure what this thread is. But I think with the scope this > discussion is going, maybe it's more of a bug than a regression. All display devices (frontends) emulated by QEMU have to set bounds for the permissible resolutions, for the guest. And those limits must never break the capabilities of the display backends. So this is not a new problem. How is it handled with other frontends? Like bochs-display, for example -- I assume bochs-display too is purely virtual, i.e. the resolution is fully controller (between bounds) by the guest. How is the guest currently prevented from setting a bochs-display resolution that "breaks SDL" (whatever that means)? I'm inclined to agree that we're just seeing two sides of the same bug -- the first state was too lax, and the current state is too strict. Thanks Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-16 4:38 ` Hou Qiming 2020-04-16 14:12 ` Laszlo Ersek @ 2020-04-20 14:19 ` Gerd Hoffmann 1 sibling, 0 replies; 12+ messages in thread From: Gerd Hoffmann @ 2020-04-20 14:19 UTC (permalink / raw) To: Hou Qiming Cc: Laszlo Ersek, valerij zaporogeci, discuss, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list Hi, > The proper way to enable ramfb resolution change again is adding sanity > checks for ramfb resolution / pointer / etc. on the QEMU side. Pointer *is* checked. ramfb creates a mapping, and if that fails due to the pointer not being valid it bails out. Sanity-checking the resolution is the job of the UI code. cheers, Gerd ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-15 15:05 ` Laszlo Ersek 2020-04-16 4:38 ` Hou Qiming @ 2020-04-20 14:13 ` Gerd Hoffmann 2020-04-21 13:02 ` Laszlo Ersek 1 sibling, 1 reply; 12+ messages in thread From: Gerd Hoffmann @ 2020-04-20 14:13 UTC (permalink / raw) To: Laszlo Ersek Cc: valerij zaporogeci, Hou Qiming, discuss, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list Hi, > So I would say that the symptom you see is a QEMU v4.1.0 regression. The > QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver > certainly needs the QemuFwCfgWriteBytes() call to work, for changing the > resolution. Oh? QemuRamfbGraphicsOutputSetMode() can be called multiple times? How does that happen? > (1) Registering a device reset handler in QEMU seems sufficient, so that > QEMU forget about the currently shared RAMFB area at platform reset. That happens. After system reset you can write configuration again (once). The guest os should not play with ramfb. It is supposed to be setup by the firmware (ovmf driver or vgabios rom) as boot display, then never be re-configured again ... cheers, Gerd ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-20 14:13 ` Gerd Hoffmann @ 2020-04-21 13:02 ` Laszlo Ersek 2020-04-22 7:42 ` Hou Qiming 0 siblings, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2020-04-21 13:02 UTC (permalink / raw) To: Gerd Hoffmann Cc: valerij zaporogeci, Hou Qiming, discuss, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list On 04/20/20 16:13, Gerd Hoffmann wrote: > Hi, > >> So I would say that the symptom you see is a QEMU v4.1.0 regression. >> The QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb >> driver certainly needs the QemuFwCfgWriteBytes() call to work, for >> changing the resolution. > > Oh? QemuRamfbGraphicsOutputSetMode() can be called multiple times? > How does that happen? QemuRamfbGraphicsOutputSetMode() is the "SetMode" member function of the EFI_GRAPHICS_OUTPUT_PROTOCOL instance that QemuRamfbDxe produces. This is a standard protocol; UEFI drivers and applications are free to locate it and to use it. (1) When you launch OVMF, you get the splash screen in a particular resolution. This resolution: - is configured by OvmfPkg/PlatformDxe, - is inherited by an OS boot loader, - is reconfigurable with OvmfPkg/PlatformDxe, for the next boot, via the Setup TUI, - defaults to 800x600 (taking effect when no particular choice is configured). (2) UiApp -- the Setup TUI itself -- uses its own resolution. Under OVMF, this resolution is fixed 640x480. When UiApp is entered, ultimately a call is made to QemuRamfbGraphicsOutputSetMode() -- i.e., a GOP.SetMode() member function -- for setting this 640x480 resolution. Using the following command: qemu-system-x86_64 \ -nodefaults \ -boot menu=on,splash-time=5000 \ -enable-kvm \ -device ramfb \ -drive if=pflash,readonly,format=raw,file=$PREFIX/share/qemu/edk2-x86_64-code.fd \ -drive if=pflash,snapshot,format=raw,file=$PREFIX/share/qemu/edk2-i386-vars.fd \ -debugcon file:ovmf.log \ -global isa-debugcon.iobase=0x402 when you first see the progress bar, the graphical resolution (1) is 800x600. Accordingly, QEMU prints to stderr: > ramfb_fw_cfg_write: 800x600 @ 0x6702000 Once you hit ESC to interrupt the progress bar and to enter the Setup TUI, UiApp switches to resolution (2), 640x480. QEMU prints: > ramfb_fw_cfg_write: 640x480 @ 0x6702000 > ramfb_fw_cfg_write: resolution locked, change rejected And you get garbage in the Setup window. Thanks, Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-21 13:02 ` Laszlo Ersek @ 2020-04-22 7:42 ` Hou Qiming 2020-04-22 16:05 ` Laszlo Ersek 0 siblings, 1 reply; 12+ messages in thread From: Hou Qiming @ 2020-04-22 7:42 UTC (permalink / raw) To: Laszlo Ersek Cc: Gerd Hoffmann, valerij zaporogeci, discuss, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list [-- Attachment #1: Type: text/plain, Size: 2660 bytes --] A little off topic thing: isn't the default resolution supposed to be 1024x768? This is the Microsoft regulation which all my physical devices seem to follow: https://docs.microsoft.com/en-us/windows-hardware/test/hlk/testref/6afc8979-df62-4d86-8f6a-99f05bbdc7f3 And when the user provides an EDID one should parse it and set the default resolution to match it. But that's a less important feature. On Tue, Apr 21, 2020 at 9:03 PM Laszlo Ersek <lersek@redhat.com> wrote: > On 04/20/20 16:13, Gerd Hoffmann wrote: > > Hi, > > > >> So I would say that the symptom you see is a QEMU v4.1.0 regression. > >> The QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb > >> driver certainly needs the QemuFwCfgWriteBytes() call to work, for > >> changing the resolution. > > > > Oh? QemuRamfbGraphicsOutputSetMode() can be called multiple times? > > How does that happen? > > QemuRamfbGraphicsOutputSetMode() is the "SetMode" member function of the > EFI_GRAPHICS_OUTPUT_PROTOCOL instance that QemuRamfbDxe produces. > > This is a standard protocol; UEFI drivers and applications are free to > locate it and to use it. > > (1) When you launch OVMF, you get the splash screen in a particular > resolution. This resolution: > - is configured by OvmfPkg/PlatformDxe, > - is inherited by an OS boot loader, > - is reconfigurable with OvmfPkg/PlatformDxe, for the next boot, via the > Setup TUI, > - defaults to 800x600 (taking effect when no particular choice is > configured). > > (2) UiApp -- the Setup TUI itself -- uses its own resolution. Under > OVMF, this resolution is fixed 640x480. When UiApp is entered, > ultimately a call is made to QemuRamfbGraphicsOutputSetMode() -- i.e., a > GOP.SetMode() member function -- for setting this 640x480 resolution. > > Using the following command: > > qemu-system-x86_64 \ > -nodefaults \ > -boot menu=on,splash-time=5000 \ > -enable-kvm \ > -device ramfb \ > -drive > if=pflash,readonly,format=raw,file=$PREFIX/share/qemu/edk2-x86_64-code.fd \ > -drive > if=pflash,snapshot,format=raw,file=$PREFIX/share/qemu/edk2-i386-vars.fd \ > -debugcon file:ovmf.log \ > -global isa-debugcon.iobase=0x402 > > when you first see the progress bar, the graphical resolution (1) is > 800x600. Accordingly, QEMU prints to stderr: > > > ramfb_fw_cfg_write: 800x600 @ 0x6702000 > > Once you hit ESC to interrupt the progress bar and to enter the Setup > TUI, UiApp switches to resolution (2), 640x480. QEMU prints: > > > ramfb_fw_cfg_write: 640x480 @ 0x6702000 > > ramfb_fw_cfg_write: resolution locked, change rejected > > And you get garbage in the Setup window. > > Thanks, > Laszlo > > [-- Attachment #2: Type: text/html, Size: 3408 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-22 7:42 ` Hou Qiming @ 2020-04-22 16:05 ` Laszlo Ersek 2020-04-23 3:15 ` Hou Qiming 0 siblings, 1 reply; 12+ messages in thread From: Laszlo Ersek @ 2020-04-22 16:05 UTC (permalink / raw) To: Hou Qiming Cc: Gerd Hoffmann, valerij zaporogeci, discuss, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list On 04/22/20 09:42, Hou Qiming wrote: > A little off topic thing: isn't the default resolution supposed to be > 1024x768? No. > This is the Microsoft regulation which all my physical devices > seem to follow: > > https://docs.microsoft.com/en-us/windows-hardware/test/hlk/testref/6afc8979-df62-4d86-8f6a-99f05bbdc7f3 Key term being "Microsoft regulation". The UEFI spec requires discrete ("plug-in") graphics devices to support at least either 800x600x32 or 640x480x32. And the edk2 (not just OVMF) default for the console resolution is 800x600. (See PcdVideoHorizontalResolution and PcdVideoVerticalResolution in "MdeModulePkg/MdeModulePkg.dec".) > And when the user provides an EDID one should parse it and set the default > resolution to match it. But that's a less important feature. It's more complex than you might think, and (to me personally) it seems to require more time than its importance justifies. https://bugzilla.redhat.com/show_bug.cgi?id=1749250 Laszlo ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [edk2-discuss] Load Option passing. Either bugs or my confusion. 2020-04-22 16:05 ` Laszlo Ersek @ 2020-04-23 3:15 ` Hou Qiming 0 siblings, 0 replies; 12+ messages in thread From: Hou Qiming @ 2020-04-23 3:15 UTC (permalink / raw) To: Laszlo Ersek Cc: Gerd Hoffmann, valerij zaporogeci, discuss, Marcel Apfelbaum (GMail address), edk2-devel-groups-io, qemu devel list [-- Attachment #1: Type: text/plain, Size: 1196 bytes --] > > > > And when the user provides an EDID one should parse it and set the > default > > resolution to match it. But that's a less important feature. > > It's more complex than you might think, and (to me personally) it seems > to require more time than its importance justifies. > > https://bugzilla.redhat.com/show_bug.cgi?id=1749250 > > Read the thread. Actually, I wrote some EDID parsing code a while ago, but that's before QEMU supporting EDID so I had to do it outside QEMU and pass my parsing result to ramfb as the now-removed starting_width / starting_height. In the context QEMU, the EDID actually reflects the user preference since the whole structure is usually made up from the user-specified resolution. And I think most guest OSes initialize first-time-seen monitors to their EDID resolution, which should have motivated QEMU to provide an EDID for a virtual monitor. But at this point it's kind of awkward to do the EDID / resolution handling (that I need) in the ramfb driver as the kvmgt EDID has to be read out from the i915 MMIO just like a physical GPU. Guess now my use case is better covered with a fully functional i915 framebuffer driver for OVMF. If I had the time... [-- Attachment #2: Type: text/html, Size: 1602 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2020-04-23 3:16 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <18709.1586743215734342129@groups.io> [not found] ` <11181.1586825080096843656@groups.io> 2020-04-14 14:26 ` [edk2-discuss] Load Option passing. Either bugs or my confusion Laszlo Ersek [not found] ` <5211.1586899245384995995@groups.io> 2020-04-15 15:05 ` Laszlo Ersek 2020-04-16 4:38 ` Hou Qiming 2020-04-16 14:12 ` Laszlo Ersek 2020-04-17 3:22 ` Hou Qiming 2020-04-20 9:32 ` Laszlo Ersek 2020-04-20 14:19 ` Gerd Hoffmann 2020-04-20 14:13 ` Gerd Hoffmann 2020-04-21 13:02 ` Laszlo Ersek 2020-04-22 7:42 ` Hou Qiming 2020-04-22 16:05 ` Laszlo Ersek 2020-04-23 3:15 ` Hou Qiming
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox