public inbox for devel@edk2.groups.io
* SPI Flash Corruption
@ 2018-09-19 14:26 Samah Mansour
  2018-09-20 13:48 ` Laszlo Ersek
  0 siblings, 1 reply; 7+ messages in thread
From: Samah Mansour @ 2018-09-19 14:26 UTC (permalink / raw)
  To: edk2-devel

Hello,


Our product uses a Baytrail with the MinnowBoard Max BIOS firmware (version
0.93). Every now and then we see SPI flash corruption caused by power cuts
while the unit is booting, after which the unit no longer boots. After
investigation we noticed that the VPD area is all FFs (address
44000->47DFF0).


We have noticed that the BIOS writes to the flash from several places in
the code while booting. If one of these writes is interrupted, that is
most probably what causes the corruption.


Why is the BIOS writing all these configurations to flash while booting? Is
it to optimize boot time? Is it OK if we disable BIOS writes to flash
completely to protect ourselves from corruption?


Samah



* Re: SPI Flash Corruption
  2018-09-19 14:26 SPI Flash Corruption Samah Mansour
@ 2018-09-20 13:48 ` Laszlo Ersek
  2018-09-20 15:47   ` Samah Mansour
  0 siblings, 1 reply; 7+ messages in thread
From: Laszlo Ersek @ 2018-09-20 13:48 UTC (permalink / raw)
  To: Samah Mansour, edk2-devel

The firmware is at liberty to write various non-volatile UEFI variables
during boot. Some of those variables are standardized, some others may
be specific to UEFI drivers (with correspondingly private namespace
GUIDs for the variables).

Power loss during flash write (and resultant flash corruption) is
expected. My understanding is that the Fault Tolerant Write protocol /
driver, sitting between the FVB (firmware volume block, i.e. flash)
protocol / driver, and the variable write protocol / driver, implements
a kind of journaling. It is described in the Intel whitepaper

  A Tour Beyond BIOS
  Implementing UEFI Authenticated Variables in SMM with EDKII
  September 2015

My expectation has been that the platform should recover from
interrupted writes. That is, for a single given UEFI variable, you
should either see "before" or "after" status, never "middle". (The
whitepaper says that "Individual variable atomicity" is maintained even
through a failed "reclaim", with the help of FTW.)

If multiple variables should be in sync with each other, that's a
different question. If the variables are not in sync, I think "failure
to boot" may be a reasonable outcome. But, "failure to boot" means a lot
of things, and I hope one should be at least dropped to the setup
utility or the shell. Are you seeing an actual crash?
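
To illustrate the "before or after, never middle" guarantee described
above, here is a toy Python model of a journaled write. It is not the EDK II
FTW driver: SimFlash, ftw_write, recover and the "spare_valid" journal state
are names invented for this sketch, and each simulated flash operation is
assumed to either happen completely or not at all.

```python
OLD = b"old-value"
NEW = b"new-value"
TORN = b"<undefined>"  # contents of a block whose erase/program was interrupted


class PowerCut(Exception):
    """Raised by the model when simulated power is lost."""


class SimFlash:
    def __init__(self, data):
        self.target = data      # block holding the live variable data
        self.spare = TORN       # spare block used for staging
        self.journal = "idle"   # "idle" or "spare_valid"
        self.ops_left = None    # None means power never fails

    def do(self, field, value):
        """Perform one flash operation, or lose power just before it."""
        if self.ops_left is not None:
            if self.ops_left == 0:
                raise PowerCut
            self.ops_left -= 1
        setattr(self, field, value)


def copy_spare_to_target(flash):
    flash.do("target", TORN)         # erase/program in progress
    flash.do("target", flash.spare)  # program completes
    flash.do("journal", "idle")      # retire the journal record


def ftw_write(flash, new_data):
    flash.do("spare", TORN)              # staging in progress
    flash.do("spare", new_data)          # staged copy is complete
    flash.do("journal", "spare_valid")   # commit point
    copy_spare_to_target(flash)


def recover(flash):
    """Run at boot: finish an interrupted update if the journal asks for it."""
    flash.ops_left = None
    if flash.journal == "spare_valid":
        copy_spare_to_target(flash)


def outcome_after_cut(cut_after_ops):
    """Cut power after the given number of operations, reboot, return target."""
    flash = SimFlash(OLD)
    flash.ops_left = cut_after_ops
    try:
        ftw_write(flash, NEW)
    except PowerCut:
        pass
    recover(flash)
    return flash.target


for n in range(7):
    print(n, outcome_after_cut(n))
```

In this model a cut before the commit point leaves the old value in place,
and a cut after it is repaired from the spare block at the next boot. The
model covers only the data block; it says nothing about other structures
in the same flash region.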

Laszlo



* Re: SPI Flash Corruption
  2018-09-20 13:48 ` Laszlo Ersek
@ 2018-09-20 15:47   ` Samah Mansour
  2018-09-20 23:43     ` Yao, Jiewen
  0 siblings, 1 reply; 7+ messages in thread
From: Samah Mansour @ 2018-09-20 15:47 UTC (permalink / raw)
  To: lersek; +Cc: edk2-devel

Hi Laszlo,
Thanks for your reply.
Actually, what I see is that the VPD (Vital Product Data area, between
addresses 44000->47DFF0) is completely wiped, which causes the failure to
boot! Without the VPD, the unit cannot boot.
I will take a look at the whitepaper.
It would be helpful to know the impact of disabling the firmware's ability
to write those non-volatile variables to flash.

Samah



* Re: SPI Flash Corruption
  2018-09-20 15:47   ` Samah Mansour
@ 2018-09-20 23:43     ` Yao, Jiewen
  2018-09-21  9:26       ` Wei, David
  0 siblings, 1 reply; 7+ messages in thread
From: Yao, Jiewen @ 2018-09-20 23:43 UTC (permalink / raw)
  To: Samah Mansour; +Cc: lersek@redhat.com, edk2-devel@lists.01.org

Thank you, Samah.
Would you please file a tracker in the EDK II Bugzilla?

The term VPD might lead to confusion here.
Ideally, the VPD region is independent of the UEFI variable region. It is a special region that holds PCDs of the VPD type.
I just looked at the code. The open source MinnowBoard Max firmware puts the variable region in the VPD region, which is why variable atomicity came up in this discussion. However, variable atomicity cannot guarantee the integrity of the FV header. An additional check needs to be done in the platform FVB driver.
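
As an illustration of what an FV header integrity check involves, here is a minimal Python sketch that validates the fixed part of an EFI_FIRMWARE_VOLUME_HEADER in a dumped region: the "_FVH" signature, a plausible HeaderLength, and the 16-bit header checksum (all 16-bit words of the header must sum to zero). This is an offline check on a dump, not the platform FVB driver's code. The names fv_header_is_sane and build_demo_header are invented for this sketch, and the demo header uses an all-zero placeholder GUID.

```python
import struct

FVH_SIGNATURE = b"_FVH"
# ZeroVector[16], FileSystemGuid[16], FvLength, Signature, Attributes,
# HeaderLength, Checksum, ExtHeaderOffset, Reserved, Revision (56 bytes).
FIXED_HEADER = struct.Struct("<16s16sQ4sIHHHBB")


def fv_header_is_sane(region):
    """Return True if region starts with a plausible firmware volume header."""
    if len(region) < FIXED_HEADER.size:
        return False
    fields = FIXED_HEADER.unpack_from(region)
    fv_length, signature, header_length = fields[2], fields[3], fields[5]
    if signature != FVH_SIGNATURE:
        return False
    if header_length < FIXED_HEADER.size or header_length % 2:
        return False
    if header_length > len(region) or fv_length > len(region):
        return False
    # The 16-bit sum over the whole header, checksum field included, is zero.
    words = struct.unpack_from("<%dH" % (header_length // 2), region)
    return (sum(words) & 0xFFFF) == 0


def build_demo_header(fv_length, block_size):
    """Build a synthetic header with one block-map entry (demo data only)."""
    header_length = FIXED_HEADER.size + 16  # one entry plus the terminator
    body = FIXED_HEADER.pack(bytes(16), bytes(16), fv_length, FVH_SIGNATURE,
                             0, header_length, 0, 0, 0, 2)
    body += struct.pack("<IIII", fv_length // block_size, block_size, 0, 0)
    words = struct.unpack("<%dH" % (header_length // 2), body)
    checksum = (-sum(words)) & 0xFFFF
    return body[:50] + struct.pack("<H", checksum) + body[52:]


good = build_demo_header(0x1000, 0x1000) + b"\xFF" * (0x1000 - 72)
wiped = b"\xFF" * 0x1000
print(fv_header_is_sane(good), fv_header_is_sane(wiped))
```

A region that reads back as all FFs fails at the signature test, so a check of this kind can tell a wiped header apart from a damaged variable inside an otherwise valid volume.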

If you can add detailed reproduction steps to the Bugzilla entry, it will help us understand the problem.

Thank you!
Yao, Jiewen



* Re: SPI Flash Corruption
  2018-09-20 23:43     ` Yao, Jiewen
@ 2018-09-21  9:26       ` Wei, David
  2018-09-21 16:52         ` Andrew Fish
  0 siblings, 1 reply; 7+ messages in thread
From: Wei, David @ 2018-09-21  9:26 UTC (permalink / raw)
  To: Yao, Jiewen, Samah Mansour
  Cc: edk2-devel@lists.01.org, lersek@redhat.com, Gao, Liming,
	Kinney, Michael D, Wu, Mike, Zimmer, Vincent

More comments:

The NV variable region starts at 0x40000. Does any data remain within the 0x40000 - 0x44000 region? Could you dump the flash image and share it with us, and also file a bug at https://bugzilla.tianocore.org as Jiewen mentioned?

It occurs to me that on some old versions of the MinnowBoard Max BIOS, the NV variable reclaim process takes a long time, so an impatient user may think the system is stuck and cut the power. This breaks the NV variable region. Also, in old versions of the MinnowBoard Max BIOS, the FTW driver is not included in the PEI stage, so the system may not recover if the PEI stage depends on NV variables.

Newer versions of the MinnowBoard Max BIOS reconfigure the SPI flash clock to make the NV variable reclaim process faster, and also add FTW to the PEI stage. I will check which version of the MinnowBoard Max BIOS added this fix.
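
For the question about whether data remains in 0x40000 - 0x44000, a dumped image can be checked with a few lines of Python. This is a rough sketch: region_is_erased and erased_runs are names made up here, and the image below is synthetic, with an arbitrary 0x48000 end chosen only for the demo.

```python
def region_is_erased(image, start, end):
    """Return True if every byte in image[start:end] is 0xFF."""
    chunk = image[start:end]
    return len(chunk) > 0 and chunk.count(0xFF) == len(chunk)


def erased_runs(image, start, end, min_len=0x100):
    """List (offset, length) for each run of at least min_len 0xFF bytes."""
    runs = []
    run_start = None
    stop = min(end, len(image))
    for offset in range(start, stop):
        if image[offset] == 0xFF:
            if run_start is None:
                run_start = offset
        else:
            if run_start is not None and offset - run_start >= min_len:
                runs.append((run_start, offset - run_start))
            run_start = None
    if run_start is not None and stop - run_start >= min_len:
        runs.append((run_start, stop - run_start))
    return runs


# Synthetic stand-in for a real dump: data up to 0x44000, erased after that.
image = bytearray(b"\x5A" * 0x48000)
image[0x44000:0x48000] = b"\xFF" * 0x4000

print(region_is_erased(image, 0x40000, 0x44000))
print(region_is_erased(image, 0x44000, 0x48000))
print([(hex(o), hex(n)) for o, n in erased_runs(image, 0x40000, 0x48000)])
```

With a real dump, the synthetic image would be replaced by the bytes read from the dump file, and the start and end offsets by the region boundaries of the actual flash map.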

Thanks,
David  Wei

Intel SSG/STO/UEFI BIOS                                 



* Re: SPI Flash Corruption
  2018-09-21  9:26       ` Wei, David
@ 2018-09-21 16:52         ` Andrew Fish
  2018-09-21 17:18           ` Samah Mansour
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Fish @ 2018-09-21 16:52 UTC (permalink / raw)
  To: edk2-devel@lists.01.org
  Cc: Yao, Jiewen, Samah Mansour, Gao, Liming, Mike Kinney,
	Vincent Zimmer, lersek@redhat.com, Wei, David

From a design point of view, VPD == Vital Product Data. The idea behind VPD is to provide a place to store platform-unique information that is generally programmed in the factory: things like the serial number, system UUID, MAC address, etc. VPD is usually programmed in the factory and never updated, so it is a good idea to put the VPD data in its own FLASH block and always keep that block locked. It is not uncommon for a FLASH update utility to skip that block when the FD is updated.
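
The update behaviour described above can be sketched in a few lines of Python. This is only a model of the idea, not any real flash update utility: apply_update, BLOCK_SIZE and the block numbering are assumptions made for the example.

```python
BLOCK_SIZE = 0x1000  # assumed erase-block size for this sketch


def apply_update(current, new_image, locked_blocks):
    """Return the flash contents after an update that skips locked blocks."""
    if len(current) != len(new_image) or len(current) % BLOCK_SIZE:
        raise ValueError("images must be equal-sized and block-aligned")
    result = bytearray(current)
    for block in range(len(current) // BLOCK_SIZE):
        if block in locked_blocks:
            continue  # the VPD block keeps its factory-programmed contents
        lo = block * BLOCK_SIZE
        result[lo:lo + BLOCK_SIZE] = new_image[lo:lo + BLOCK_SIZE]
    return bytes(result)


# Three blocks: code, VPD (block 1, locked), code.
current = b"A" * BLOCK_SIZE + b"V" * BLOCK_SIZE + b"B" * BLOCK_SIZE
new_image = b"X" * (3 * BLOCK_SIZE)
updated = apply_update(current, new_image, {1})
print(updated[BLOCK_SIZE:BLOCK_SIZE + 4], updated[:4])
```

Because the VPD block is never rewritten in this scheme, a power cut during an update or during a variable write cannot leave it half-programmed.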

Thanks,

Andrew Fish


* Re: SPI Flash Corruption
  2018-09-21 16:52         ` Andrew Fish
@ 2018-09-21 17:18           ` Samah Mansour
  0 siblings, 0 replies; 7+ messages in thread
From: Samah Mansour @ 2018-09-21 17:18 UTC (permalink / raw)
  To: afish
  Cc: edk2-devel, jiewen.yao, liming.gao, michael.d.kinney,
	vincent.zimmer, lersek, david.wei

Thanks, guys, for your answers.

I will open a bug in Bugzilla and attach the binary.

*Is there any data remains within 0x40000 - 0x44000 region?*
Yes, there is data between 0x40000 and 0x44000. It is only after 0x44000
that you see all FFs in the affected units.

The BIOS we are using is based on MinnowBoard Max 0.93 (it's pretty
old). It would be very helpful for me to know when the FTW was added, so
that I can test with a BIOS built after that change.

Samah



