From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Dhaval Sharma"
Date: Tue, 31 Oct 2023 15:25:33 +0530
Subject: Re: [edk2-devel] [PATCH v7 3/5] MdePkg: Implement RISC-V Cache Management Operations
To: Pedro Falcato
Cc: Laszlo Ersek, devel@edk2.groups.io, Michael D Kinney, Liming Gao,
 Zhiguang Liu, Sunil V L, Daniel Schaefer, Jingyu Li
References: <20231029144613.150580-1-dhaval@rivosinc.com>
 <20231029144613.150580-4-dhaval@rivosinc.com>
 <2db1b89a-6c7f-ea3f-becb-1e942b41a3e8@redhat.com>
Reply-To: devel@edk2.groups.io,dhaval@rivosinc.com
Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io

I am posting an update on behalf of Jingyu, who had trouble posting to the
list. CC'ing him here.

In summary, this is what we have verified so far:
  1. I have verified that the instruction opcodes are correct. I have also
     verified on QEMU that, functionally, the expected instructions are
     being called. Negative test cases confirmed that any other opcodes
     cause exceptions, as expected.
  2. Jingyu was able to verify the CpuFlushCpuDataCache function with this
     framework on SG2042 (he had to use custom opcodes based on his SoC
     implementation). He is debugging one issue related to the other cache
     instructions and will get back with more data. P.S. SG2042 does not
     implement the exact same CMO opcodes, only equivalent ones, so this
     experiment is an additional data point that helps verify the
     framework, not CMO itself.
  3. In general, the framework flows appear to be sound, and as long as the
     instructions do their job as claimed in the spec, the risk is lower.

That is what we have so far. If it makes sense to everyone, we could go
ahead and merge with this *feature disabled by default* once Jingyu
clarifies the failures on the SG2042 platform. Otherwise, we can wait until
newer silicon is available on which these exact instructions can be tested,
and upstream then.

[From Jingyu]
I verified this CMO framework on an actual HW platform.

SW:
edk2: https://github.com/rivosinc/edk2/tree/dev-rv-cmo-v7 branch: dev-rv-cmo-v7
edk2-platforms: https://github.com/sophgo/edk2-platforms branch: sg2042-dev

HW:
Milk-V Pioneer Box, a developer motherboard based on the SG2042 with a
64-core T-HEAD C920.

Attention:
The T-HEAD C920 implements its own CMO extension, which differs from the
standard CMO extension.

Test steps:
1. Modified the opcodes in RiscVasm.inc to accommodate the C920 CMO feature.
diff --git a/MdePkg/Include/RiscV64/RiscVasm.inc b/MdePkg/Include/RiscV64/RiscVasm.inc
index 29de735885..5df85fdb31 100644
--- a/MdePkg/Include/RiscV64/RiscVasm.inc
+++ b/MdePkg/Include/RiscV64/RiscVasm.inc
@@ -7,13 +7,13 @@
  */
 
 .macro RISCVCMOFLUSH
-    .word 0x25200f
+    .long 0x0275000b
 .endm
 
 .macro RISCVCMOINVALIDATE
-    .word 0x05200f
+    .long 0x0265000b
 .endm
 
 .macro RISCVCMOCLEAN
-    .word 0x15200f
+    .long 0x0275000b
 .endm
2. We enabled CMO only for PCIe devices that access memory through DMA,
focusing on the implementation of CpuFlushCpuDataCache in the
EFI_CPU_ARCH_PROTOCOL. Apart from PCIe, in other words apart from
cpu->FlushDataCache, we do not use CMO. PCIe inbound traffic only involves
datacache.clean and datacache.invalidate.
diff --git a/UefiCpuPkg/CpuDxeRiscV64/CpuDxe.c b/UefiCpuPkg/CpuDxeRiscV64/CpuDxe.c
index 2af3b62234..cf50bc5f92 100644
--- a/UefiCpuPkg/CpuDxeRiscV64/CpuDxe.c
+++ b/UefiCpuPkg/CpuDxeRiscV64/CpuDxe.c
@@ -9,6 +9,8 @@
 **/
 
 #include "CpuDxe.h"
+#include <Library/CacheMaintenanceLib.h>
+#include <Library/PcdLib.h>
 
 //
 // Global Variables
@@ -59,7 +61,7 @@ EFI_CPU_ARCH_PROTOCOL  gCpu = {
   CpuGetTimerValue,
   CpuSetMemoryAttributes,
   1,                          // NumberOfTimers
-  4                           // DmaBufferAlignment
+  64                          // DmaBufferAlignment
 };
 
 //
@@ -90,6 +92,21 @@ CpuFlushCpuDataCache (
   IN EFI_CPU_FLUSH_TYPE     FlushType
   )
 {
+  PatchPcdSet64 (PcdRiscVFeatureOverride, 0x1);
+  switch (FlushType) {
+    case EfiCpuFlushTypeWriteBack:
+      WriteBackDataCacheRange ((VOID *)(UINTN)Start, (UINTN)Length);
+      break;
+    case EfiCpuFlushTypeInvalidate:
+      InvalidateInstructionCacheRange ((VOID *)(UINTN)Start, (UINTN)Length);
+      break;
+    case EfiCpuFlushTypeWriteBackInvalidate:
+      WriteBackInvalidateDataCacheRange ((VOID *)(UINTN)Start, (UINTN)Length);
+      break;
+    default:
+      return EFI_INVALID_PARAMETER;
+  }
+
   return EFI_SUCCESS;
 }

diff --git a/Platform/Sophgo/SG2042_EVB_Board/SG2042.dsc b/Platform/Sophgo/SG2042_EVB_Board/SG2042.dsc
index 51ff89678c..e2e44ad619 100644
--- a/Platform/Sophgo/SG2042_EVB_Board/SG2042.dsc
+++ b/Platform/Sophgo/SG2042_EVB_Board/SG2042.dsc
@@ -389,6 +389,7 @@
 
 [PcdsPatchableInModule]
   gSophgoSG2042PlatformPkgTokenSpaceGuid.PcdSG2042PhyAddrToVirAddr|0
+  gEfiMdePkgTokenSpaceGuid.PcdRiscVFeatureOverride|0
 
 ################################################################################
 #
@@ -500,7 +501,7 @@
   # RISC-V Core module
   #
   UefiCpuPkg/CpuTimerDxeRiscV64/CpuTimerDxeRiscV64.inf
-  Silicon/Sophgo/SG2042Pkg/Override/UefiCpuPkg/CpuDxeRiscV64/CpuDxeRiscV64.inf
+  UefiCpuPkg/CpuDxeRiscV64/CpuDxeRiscV64.inf
   MdeModulePkg/Universal/ResetSystemRuntimeDxe/ResetSystemRuntimeDxe.inf
 
   MdeModulePkg/Universal/FaultTolerantWriteDxe/FaultTolerantWriteDxe.inf

diff --git a/Platform/Sophgo/SG2042_EVB_Board/SG2042.fdf b/Platform/Sophgo/SG2042_EVB_Board/SG2042.fdf
index 844fc3eac0..9cbb1d3f65 100644
--- a/Platform/Sophgo/SG2042_EVB_Board/SG2042.fdf
+++ b/Platform/Sophgo/SG2042_EVB_Board/SG2042.fdf
@@ -77,7 +77,7 @@ INF  Silicon/Sophgo/SG2042Pkg/Drivers/SdHostDxe/SdHostDxe.inf
 
 # RISC-V Core Drivers
 INF  UefiCpuPkg/CpuTimerDxeRiscV64/CpuTimerDxeRiscV64.inf
-INF  Silicon/Sophgo/SG2042Pkg/Override/UefiCpuPkg/CpuDxeRiscV64/CpuDxeRiscV64.inf
+INF  UefiCpuPkg/CpuDxeRiscV64/CpuDxeRiscV64.inf
 
 INF  MdeModulePkg/Universal/FaultTolerantWriteDxe/FaultTolerantWriteDxe.inf
 INF  MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf

3. The PCIe devices now work correctly on the Pioneer Box, and the CMO
instructions execute as expected.
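
As a cross-check on step 1, the three standard opcodes that the diff replaces (0x25200f, 0x05200f, 0x15200f) can be reproduced from the cbo.* instruction layout. This is a minimal sketch, assuming the I-type field layout of the standard CMO instructions and assuming the macros operate on register a0 (x10), which is what those three values decode to. The names encode_cbo and standard_opcodes_match are hypothetical and are not part of the patch. It does not cover the T-HEAD encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed fields of the standard cbo.* encodings (I-type layout). */
#define OPCODE_MISC_MEM 0x0Fu  /* bits 6:0 */
#define FUNCT3_CBO      0x2u   /* bits 14:12 */

/* The operation is selected by the 12-bit immediate in bits 31:20. */
enum cbo_op { CBO_INVAL = 0, CBO_CLEAN = 1, CBO_FLUSH = 2 };

/* Encode cbo.<op> with base address register x<rs1>; rd (bits 11:7) is 0. */
static uint32_t encode_cbo(enum cbo_op op, unsigned rs1)
{
    assert(rs1 < 32u);
    return ((uint32_t)op << 20) | ((uint32_t)rs1 << 15) |
           ((uint32_t)FUNCT3_CBO << 12) | (uint32_t)OPCODE_MISC_MEM;
}

/* Returns 1 if the opcodes removed by the step 1 diff match the encodings
 * for register a0 (x10, an assumption inferred from the values), else 0. */
static int standard_opcodes_match(void)
{
    const unsigned a0 = 10u;
    return encode_cbo(CBO_FLUSH, a0) == 0x25200fu &&
           encode_cbo(CBO_INVAL, a0) == 0x05200fu &&
           encode_cbo(CBO_CLEAN, a0) == 0x15200fu;
}
```

Emitting the instructions as raw words like this is what keeps the macros independent of assembler support for the CMO mnemonics.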

Reviewed-by: Jingyu Li <jingyu.li01@sophgo.com>
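
Because CMO is block based, a range operation such as the ones called in step 2 has to be applied once per cache block that overlaps the range. The sketch below shows only that range walk. It assumes a 64-byte block size (inferred from the DmaBufferAlignment change to 64, not stated in the thread), and the names for_each_cache_block, block_op_fn and count_block are hypothetical. In firmware, the callback would issue the actual cache instruction for the block address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache block size in bytes (matches DmaBufferAlignment = 64). */
#define CACHE_BLOCK_SIZE ((uintptr_t)64)

/* Hypothetical per-block operation, e.g. one clean/invalidate/flush. */
typedef void (*block_op_fn)(uintptr_t block_addr, void *ctx);

/* Call op once for every cache block overlapping [start, start + length).
 * Returns the number of blocks visited; op may be NULL to only count. */
static size_t for_each_cache_block(uintptr_t start, size_t length,
                                   block_op_fn op, void *ctx)
{
    if (length == 0) {
        return 0;
    }
    /* The last byte of the range must be addressable without wrapping. */
    assert(length - 1 <= UINTPTR_MAX - start);

    /* Round the first and last byte addresses down to block boundaries. */
    uintptr_t first = start & ~(CACHE_BLOCK_SIZE - 1);
    uintptr_t last  = (start + (length - 1)) & ~(CACHE_BLOCK_SIZE - 1);
    size_t blocks   = (size_t)((last - first) / CACHE_BLOCK_SIZE) + 1;

    for (size_t i = 0; i < blocks; i++) {
        if (op != NULL) {
            op(first + (uintptr_t)i * CACHE_BLOCK_SIZE, ctx);
        }
    }
    return blocks;
}

/* Example callback: counts how many blocks were touched. */
static void count_block(uintptr_t block_addr, void *ctx)
{
    (void)block_addr;
    (*(size_t *)ctx)++;
}
```

A 64-byte buffer that starts one byte past a block boundary spans two blocks, which is one reason DMA buffers benefit from being aligned to the block size.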

On Mon, Oct 30, 2023 at 10:07 PM Pedro Falcato <pedro.falcato@gmail.com> wrote:

> On Mon, Oct 30, 2023 at 9:38 AM Laszlo Ersek <lersek@redhat.com> wrote:
> >
> > On 10/29/23 20:12, Pedro Falcato wrote:
> > > On Sun, Oct 29, 2023 at 2:46 PM Dhaval Sharma <dhaval@rivosinc.com> wrote:
> > >>
> > >> Implement Cache Management Operations (CMO) defined by
> > >> RISC-V spec https://github.com/riscv/riscv-CMOs.
> > >>
> > >> Notes:
> > >> 1. CMO only supports block based Operations. Meaning cache
> > >>    flush/invd/clean Operations are not available for the entire
> > >>    range. In that case we fallback on fence.i instructions.
> > >> 2. Operations are implemented using Opcodes to make them compiler
> > >>    independent. binutils 2.39+ compilers support CMO instructions.
> > >>
> > >> Test:
> > >> 1. Ensured correct instructions are refelecting in asm
> > >
> > > nit: reflecting
> > >
> > >> 2. Not able to verify actual instruction in HW as Qemu ignores
> > >>    any actual cache operations.
> > >
> > > Do you have no way to test this in hardware? Since Rivos is a RISCV
> > > vendor and all ;)
> > > I don't like inviting the idea of merging CPU architectural changes
> > > without actually testing them in something resembling real silicon
> > > (i.e QEMU KVM is _fine_, QEMU TCG really isn't).
> > >
> >
> > Hopefully I'm not drawing an incorrect parallel here, but, as I recall
> > arm64 enablement in 2014, nearly all initial enablement in RHEL occurred
> > on software emulators (ARM Foundation Model, ARM FVP, then QEMU TCG).
> > You need to start somewhere. In particular, qemu-system-aarch64 was a
> > huge step forward (performance-wise) once it *existed*, relative to the
> > Foundation Model / FVP, even though qemu-system-aarch64 wouldn't emulate
> > CPU caches (IIRC).
>
> Right. I don't know how faithful those early ARM simulators were, but
> QEMU TCG is not very faithful and uarch details *can* slip through the
> cracks.
> In arm64 it's easy to miss a dsb or a isb if you're not extra careful
> (or read the ARM ARM wrong).
>
> RISCV has a bunch of fun gotchas too. For instance, did you know you
> need to flush the TLB using sfence.vma even when only mapping a page?
> This "small" detail results in boot failures on real hardware (such as
> the visionfive 2), but is completely silent in QEMU TCG.
>
> So this is why I would much prefer a test on real silicon. It's hard
> to prove correctness when all you have is QEMU's spotty simulation
> (rightfully so, it's not a simulator).
>
> --
> Pedro
>

--
Thanks!
=D

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110392): https://edk2.groups.io/g/devel/message/110392
-=-=-=-=-=-=-=-=-=-=-=-