On 12/8/23 13:04, Gerd Hoffmann wrote: > On Thu, Dec 07, 2023 at 05:16:10PM +0100, Ard Biesheuvel wrote: >> Hi Gerd, >> >> On Thu, 7 Dec 2023 at 10:44, Gerd Hoffmann wrote: >>> >>> Extend the ValidateFvHeader function, additionally to the header checks >>> walk over the list of variables and sanity check them. >>> >>> In case we find inconsistencies indicating variable store corruption >>> return EFI_NOT_FOUND so the variable store will be re-initialized. >>> >>> Signed-off-by: Gerd Hoffmann >> >> This seems like a good idea to me, and the change looks generally >> fine, except for one nit below: > >>> + case VAR_HEADER_VALID_ONLY: >>> + VarState = "header-ok"; >>> + VarName = L""; >>> + break; >>> + case VAR_ADDED: >>> + VarState = "ok"; >>> + break; >>> + case VAR_ADDED &VAR_IN_DELETED_TRANSITION: >> >> This looks odd, so please add a comment explaining why these constants >> are constructed this way. > > The VAR_HEADER_VALID_ONLY and VAR_ADDED states are applied using an > assignment, i.e. "State = VAR_...". > > The other two are used to clear bits, i.e. "State &= VAR_...". > > So the usual livecycle for the State field is: > > (1) 0x7f (VAR_HEADER_VALID_ONLY) > (2) 0x3f (VAR_ADDED) > (3) 0x3e (VAR_ADDED & VAR_IN_DELETED_TRANSITION) > (4) 0x3c (VAR_ADDED & VAR_IN_DELETED_TRANSITION & VAR_DELETED) > > Seems it can also happen that variable entries jump directly from (2) to > (4), resulting in 0x3d (VAR_ADDED & VAR_DELETED) being an possible > alternative value for (4). > > I'm not fully sure what happens to VAR_HEADER_VALID_ONLY entries, i.e. > whenever they go through the (3) + (4) deletion cycles too or whenever > they are garbage-collected in some other way. I suspect the latter, > given that the reclaim logic in the variable driver looks at the > variable names, but VAR_HEADER_VALID_ONLY entries don't have a valid > variable name. > > > BTW: Is there any good documentation on the design of the variable > driver and the fault tolerant flash writes? Is the variable flash > supposed to be in a consistent state at any given time, even if > interrupted in the middle of some update operation? I have two references: (1) A Tour Beyond BIOS: Implementing UEFI Authenticated Variables in SMM with EDKII Part III - Variable https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-white-papers https://github.com/tianocore-docs/Docs/raw/master/White_Papers/A_Tour_Beyond_BIOS_Implementing_UEFI_Authenticated_Variables_in_SMM_with_EDKII_V2.pdf It discusses Fault Tolerant Write too. And yes, in theory, FTW is supposed to protect against power loss at any particular stage. (2) Some messages from 2013-2014, when the list was still on sourceforge.net. I think the public archives from that time are all lost, but I'm attaching the one relevant message I could find locally. > > The qemu flash emulation is a bit sloppy, specifically block writes can > be half-completed when a reset happens in the middle of the operation. > Which maybe is the root cause for varstore corruption. > > On the other hand the edk2 flash code does not look like it carefully > chooses blocks when updating flash, so I'm not sure it actually depends > on atomic block updates to ensure consistency. Can't comment on these, alas. Laszlo -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#112341): https://edk2.groups.io/g/devel/message/112341 Mute This Topic: https://groups.io/mt/103031342/7686176 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/leave/12367111/7686176/1913456212/xyzzy [rebecca@openfw.io] -=-=-=-=-=-=-=-=-=-=-=-