From: Laszlo Ersek
To: Samah Mansour, edk2-devel@lists.01.org
Date: Thu, 20 Sep 2018 15:48:46 +0200
Subject: Re: SPI Flash Corruption

On 09/19/18 16:26, Samah Mansour wrote:
> Hello,
>
>
> Our product uses a Baytrail with Minnowboard Max bios firmware ( version
> 0.93).
> Every now and then we see SPI flash corruption due to power cuts
> while the unit is booting which causes the unit not to boot anymore. After
> investigation we noticed that the VPD area is all FFs (address
> 44000->47DFF0).
>
>
> We have noticed that the Bios while booting writes to the flash from
> several places in the code, which is if interrupted most probably is
> causing the corruption.
>
>
> Why is the bios writing all these configurations to flash while booting, is
> it to optimize boot time? is it ok if we disable the bios writing to flash
> completely to protect ourselves from corruption?

The firmware is at liberty to write various non-volatile UEFI variables
during boot. Some of those variables are standardized, some others may
be specific to UEFI drivers (with correspondingly private namespace
GUIDs for the variables).

Power loss during flash write (and resultant flash corruption) is
expected. My understanding is that the Fault Tolerant Write protocol /
driver, sitting between the FVB (firmware volume block, i.e. flash)
protocol / driver, and the variable write protocol / driver, implements
a kind of journaling. It is described in the Intel whitepaper

  A Tour Beyond BIOS
  Implementing UEFI Authenticated Variables in SMM with EDKII
  September 2015

My expectation has been that the platform should recover from
interrupted writes. That is, for a single given UEFI variable, you
should either see "before" or "after" status, never "middle". (The
whitepaper says that "Individual variable atomicity" is maintained even
through a failed "reclaim", with the help of FTW.)

If multiple variables should be in sync with each other, that's a
different question. If the variables are not in sync, I think "failure
to boot" may be a reasonable outcome. But, "failure to boot" means a lot
of things, and I hope one should be at least dropped to the setup
utility or the shell. Are you seeing an actual crash?

Laszlo
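
To make the "before or after, never middle" expectation concrete, here is
a toy Python model of a journaled write. It is only a sketch of the
general idea, not the EDK II FTW driver: the names ToyFlash, PowerCut and
outcomes_after_power_cuts are hypothetical, the "journal" is a single
flag plus a spare copy, and real flash details (erase blocks, bits that
can only be cleared) are ignored.

```python
class PowerCut(Exception):
    """Raised by the toy flash to simulate losing power mid-operation."""


class ToyFlash:
    """Hypothetical flash region holding one variable, plus a spare copy."""

    def __init__(self, value: bytes):
        self.main = bytearray(value)         # live variable data
        self.spare = bytearray(len(value))   # staging area for the new value
        self.journal_valid = False           # True: spare holds a complete value
        self.budget = None                   # primitive steps left before power cut

    def _step(self):
        # Every byte write or flag update costs one step of "power budget".
        if self.budget is not None:
            if self.budget == 0:
                raise PowerCut()
            self.budget -= 1

    def _copy(self, dst: bytearray, src: bytes):
        for i, b in enumerate(src):
            self._step()
            dst[i] = b

    def write(self, new: bytes):
        assert len(new) == len(self.main)
        self._copy(self.spare, new)          # 1. stage the new value
        self._step()
        self.journal_valid = True            # 2. commit point
        self._copy(self.main, self.spare)    # 3. apply to the live area
        self._step()
        self.journal_valid = False           # 4. retire the journal record

    def recover(self):
        """Run at next boot: finish a committed write, ignore an uncommitted one."""
        self.budget = None                   # power is back
        if self.journal_valid:
            self._copy(self.main, self.spare)
            self.journal_valid = False


def outcomes_after_power_cuts(old: bytes, new: bytes) -> set:
    """Cut power after every possible number of steps; collect recovered values."""
    results = set()
    total_steps = 2 * len(old) + 2
    for cut_at in range(total_steps + 1):
        flash = ToyFlash(old)
        flash.budget = cut_at
        try:
            flash.write(new)
        except PowerCut:
            pass
        flash.recover()
        results.add(bytes(flash.main))
    return results
```

In this model, the live area can be half-written right after a power cut,
but once recover() has run, the only values that come out are the old one
and the new one. A cut before the commit point leaves the old value
untouched, and a cut after it is replayed from the spare copy.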