Assertion issue in UEFI during boot
December 4, 2024, 4:26am 1
Several topics report assertion issues in JetPack 5 and 6.
We created this post to share some possible fixes and the verification steps.
Possible Errors
Assertion 1.
ASSERT [VariableStandaloneMm] /dvs/git/dirty/git-master_linux/out/nvidia/optee.t234-uefi/StandaloneMmOptee_RELEASE/edk2/MdeModulePkg/Universal/Variable/RuntimeDxe/Variable.c(3264): !(((INTN)(RETURN_STATUS)(Status)) < 0)
Fix:
diff --git a/core/arch/arm/kernel/stmm_sp.c b/core/arch/arm/kernel/stmm_sp.c
index 1d6344d..6441c03 100644
--- a/core/arch/arm/kernel/stmm_sp.c
+++ b/core/arch/arm/kernel/stmm_sp.c
@@ -77,7 +77,7 @@
static const uint16_t ffa_storage_id = 4U;
static const unsigned int stmm_stack_size = 4 * SMALL_PAGE_SIZE;
-static const unsigned int stmm_heap_size = 750 * SMALL_PAGE_SIZE;
+static const unsigned int stmm_heap_size = 1024 * SMALL_PAGE_SIZE;
static const unsigned int stmm_sec_buf_size = 21 * SMALL_PAGE_SIZE;
static const unsigned int stmm_ns_comm_buf_size = 21 * SMALL_PAGE_SIZE;
→ Has been fixed in both JP6.0(R36.3.0) and JP5.1.4(R35.6.0).
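For a sense of scale, the heap change in the Assertion 1 patch can be worked out as below. This is only a sketch: it assumes SMALL_PAGE_SIZE is 4 KiB (4096 bytes), which is not stated in this post, and the helper name heap_bytes is made up for illustration.

```python
# Assumption: SMALL_PAGE_SIZE is 4 KiB; this value is not given in the post.
SMALL_PAGE_SIZE = 4096


def heap_bytes(pages, page_size=SMALL_PAGE_SIZE):
    """Return the heap size in bytes for a given number of small pages."""
    return pages * page_size


old_heap = heap_bytes(750)    # stmm_heap_size before the patch
new_heap = heap_bytes(1024)   # stmm_heap_size after the patch

print(f"old: {old_heap} bytes ({old_heap / 2**20:.2f} MiB)")
print(f"new: {new_heap} bytes ({new_heap / 2**20:.2f} MiB)")
print(f"extra heap: {new_heap - old_heap} bytes")
```

Under that page-size assumption, the heap grows from roughly 2.93 MiB to 4 MiB.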
Assertion 2.
ASSERT [FvbNorFlashStandaloneMm] /dvs/git/dirty/git-master_linux/out/nvidia/optee.t234-uefi/StandaloneMmOptee_RELEASE/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(868): ((BOOLEAN)(0==1))
Fix:
- fix(stmm): allow measurement partition to be zero filled · NVIDIA/edk2-nvidia@e4c86ce · GitHub
- fix: reset the meas buffer after computing the first measurement · NVIDIA/edk2-nvidia@615288a · GitHub
→ Has been fixed in both JP6.0(R36.3.0) and JP5.1.4(R35.6.0)
Assertion 3.
ASSERT [FvbNorFlashStandaloneMm] /workspace_SSD/alex/edk2_nvidia/nvidia-uefi-r35.5.0/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(978): ((BOOLEAN)(0==1))
Or
ASSERT [FvbNorFlashStandaloneMm] /out/nvidia/optee.t234-uefi/StandaloneMmOptee_RELEASE/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(937): ((BOOLEAN)(0==1))
Fix:
Varint readfix r35.5.0 by gmahadevan · Pull Request #110 · NVIDIA/edk2-nvidia · GitHub
→ Has been fixed in JP6.2(R36.4.3), but not included in JP5.1.5(R35.6.1)
Steps to apply the change
Step 1: Apply the patch to the corresponding source file.
Step 2: Run the following command to build uefi_StandaloneMmOptee_RELEASE.bin:
$ edk2_docker edk2-nvidia/Platform/NVIDIA/StandaloneMmOptee/build.sh
Step 3: Follow the steps in atf_and_optee_README.txt to build the tos image, then update the tos image at <Linux_for_Tegra>/bootloader/tos-optee_t<194 or 234>.img.
Step 4: Flash only the QSPI to apply the change (e.g. Orin Nano devkit with JP6.0):
$ sudo ./flash.sh -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-orin-nano-devkit internal
Verification steps
Step 1: Enter UEFI Menu.
Step 2: Select Device Manager → NVIDIA Configuration → Reset Setting.
Step 3: Go back to the top menu and press Reset to exit.
Step 4: Check whether the device boots successfully.
Repeat Steps 1 to 4 about five times to check whether an assertion still occurs.
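When repeating the reboot test, it can help to scan the captured serial console output for ASSERT lines instead of reading it by eye. The following is a minimal sketch, assuming the serial log has been saved as text and that assertion lines follow the same shape as the ones quoted in this post. The function name find_asserts and the regular expression are illustrative, not part of any NVIDIA tool.

```python
import re

# Matches lines shaped like the ones quoted in this post, e.g.:
# ASSERT [Module] /path/to/File.c(978): ((BOOLEAN)(0==1))
ASSERT_RE = re.compile(
    r"ASSERT \[(?P<module>[^\]]+)\] (?P<path>\S+)\((?P<line>\d+)\): (?P<expr>.*)"
)


def find_asserts(log_text):
    """Return (module, source file name, line number) for each ASSERT line."""
    hits = []
    for raw in log_text.splitlines():
        m = ASSERT_RE.search(raw)  # search() tolerates leading garbage bytes
        if m:
            file_name = m.group("path").rsplit("/", 1)[-1]
            hits.append((m.group("module"), file_name, int(m.group("line"))))
    return hits


sample = (
    "I/TC: Asynchronous notifications are disabled\n"
    "ASSERT [FvbNorFlashStandaloneMm] /build/nvidia-uefi-r35.5.0-updates/"
    "edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/"
    "FvbNorFlashStandaloneMm.c(978): ((BOOLEAN)(0==1))\n"
)

for module, file_name, line in find_asserts(sample):
    print(f"{module}: {file_name} line {line}")
```

An empty result over several consecutive boot logs is consistent with the fix being in place, although it does not prove it, since the assertion is reported to occur only intermittently.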
Notes
- These fixes apply to both the T194 and T234 series, which use UEFI as the bootloader.
- We suggest applying all of the patches above, since a device that hits the assertion can only be recovered by reflashing (i.e. rebooting does not help).
- Please let us know if you see other assertion issues so that we can update this post.
- All related fixes are included in JetPack 6.2 (R36.4.3), so users can simply upgrade to that release to get all of them.
We applied all of the patches above and generated tos-optee_t234.img.
After reflashing the Jetson Orin Nano board, we tested by rebooting many times, and the assertion occurred again.
I/TC: Reserved shared memory is disabled
I/TC: Dynamic shared memory is enabled
I/TC: Normal World virtualization support is disabled
I/TC: Asynchronous notifications are disabled
ASSERT [FvbNorFlashStandaloneMm] /build/nvidia-uefi-r35.5.0-updates/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(978): ((BOOLEAN)(0==1))
Please ignore the 35.5 in the folder name; the code branch we checked out is 35.6.
KevinFFF December 13, 2024, 9:15am 6
Hi LitchiCheng,
Did you follow atf_and_optee_README.txt and specify the uefi_StandaloneMmOptee_RELEASE.bin that you built before generating the tos image?
Could you also add a custom log message in the source to check that the change was applied correctly?
You can also check the full serial log to confirm that you are booting from the correct slot, in case you updated the tos image in slot A but are booting from slot B.
jameskuo December 16, 2024, 8:12am 7
Hi KevinFFF:
I saw that another post also covers Assertion 3: Nvidia Orin AGX 64 GB not booting anymore - #18 by WayneWWW
I want to check whether the solution marked in that post is insufficient. After building the code, do we need to flash not only uefi_RELEASE.bin but also tos-optee_t*.img?
Many Thanks.
KevinFFF December 16, 2024, 8:39am 8
Actually, the fix is included in the tos image, so you only need to update the tos image rather than the UEFI binary.
Please follow the steps in atf_and_optee_README.txt, which explain how to build the tos image.
The following step is required before building OP-TEE:
$ export UEFI_STMM_PATH=<UEFI source>/images/uefi_StandaloneMmOptee_RELEASE.bin
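Since a missing or stale UEFI_STMM_PATH is an easy mistake to make, a quick check before starting the OP-TEE build can save a rebuild. This is a minimal sketch, assuming only that the variable should point to an existing, non-empty file. The function name stmm_path_problem is hypothetical and is not part of the build scripts.

```python
import os


def stmm_path_problem(env=None):
    """Describe what is wrong with UEFI_STMM_PATH, or return None if it looks usable."""
    env = os.environ if env is None else env
    path = env.get("UEFI_STMM_PATH")
    if not path:
        return "UEFI_STMM_PATH is not set"
    if not os.path.isfile(path):
        return f"UEFI_STMM_PATH does not point to a file: {path}"
    if os.path.getsize(path) == 0:
        return f"UEFI_STMM_PATH points to an empty file: {path}"
    return None


problem = stmm_path_problem()
print(problem or "UEFI_STMM_PATH looks usable")
```

The check cannot tell whether the binary was built from patched sources; it only catches the case where the variable is unset or points at nothing.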
Trying to flash from the host machine after following the instructions @jameskuo graciously posted. The flash command results in an error:
$ cd nvidia/nvidia_sdk/JetPack_6.1_Linux_JETSON_AGX_ORIN_TARGETS/Linux_for_Tegra/
$ sudo ./flash.sh -c bootloader/generic/cfg/flash_t234_qspi.xml jetson-agx-orin-devkit internal
###############################################################################
# L4T BSP Information:
# R36 , REVISION: 4.0
# User release: 0.0
###############################################################################
ECID is
Error: probing the target board failed.
Make sure the target board is connected through
USB port and is in recovery mode.
Tried using both USB micro and USB C cables in recovery mode, lsusb shows the device properly:
Bus 003 Device 009: ID 0955:7020 NVIDIA Corp. L4T (Linux for Tegra) running on Tegra
Is there something else I should be doing?
jameskuo December 23, 2024, 9:46am 10
Hi:
You didn’t put the Orin into recovery mode. It should show as NVIDIA Corp. when you run lsusb to check whether the Orin is connected.
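The distinction can be sketched in code. The lsusb line quoted earlier in this thread carries the description "L4T (Linux for Tegra) running on Tegra", which, going by the reply above, means the board is booted normally rather than in recovery mode. The helper below is a rough illustration under that reading. The function name and category labels are made up, and the post does not say which product ID a board in recovery mode reports, so the sketch does not try to confirm recovery mode.

```python
NVIDIA_VENDOR_ID = "0955"  # USB vendor ID from the lsusb line quoted in the thread
BOOTED_MARKER = "L4T (Linux for Tegra) running on Tegra"


def classify_lsusb_line(line):
    """Classify one lsusb line as 'booted', 'nvidia-other', or 'not-nvidia'."""
    if f"ID {NVIDIA_VENDOR_ID}:" not in line:
        return "not-nvidia"
    if BOOTED_MARKER in line:
        # The device is running Linux, so it is not in recovery mode.
        return "booted"
    # An NVIDIA device without the L4T marker: a candidate for recovery
    # mode, but this sketch does not verify that.
    return "nvidia-other"


quoted = ("Bus 003 Device 009: ID 0955:7020 NVIDIA Corp. "
          "L4T (Linux for Tegra) running on Tegra")
print(classify_lsusb_line(quoted))
```

For the line quoted in the thread this reports "booted", which matches the flash script failing to probe the board.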
@jameskuo - Thanks! That worked.
I’m still not clear (as per your comment) whether the flashing command is enough, since per your instructions we’re only changing uefi_jetson.bin and not the tos-optee* image.
@KevinFFF - Can you confirm if the steps listed by @jameskuo here are enough to fix the issue?
jameskuo December 24, 2024, 1:17am 12
Hi:
No, that didn’t help, as Kevin said in post 6 of this thread.
Any timeline on when this will be merged and released? We are running into the FvbNorFlashStandaloneMm error every 20 or so boots.
KevinFFF January 17, 2025, 6:41am 14
Hi AlexKlimaj,
I believe JetPack 6.2 (L4T R36.4.3), released yesterday, should include the fix for this issue.
If you are using the devkit, please try flashing it through SDK Manager.
Hi Kevin,
We flashed the updated TOS image and were trying to reproduce the UEFI ASSERT crash and verify the fix.
However, we are not able to perform Step 3 of the verification steps above.
Once we select “Reset Setting” and press Enter, the keyboard no longer responds.
We cannot go back to the top menu, so we are manually resetting the unit instead.
Any idea why?
Thanks
KevinFFF January 21, 2025, 6:38am 16
Do you hit this issue after updating the TOS image?
Please check whether you can perform the steps above without any changes.
Yes.
Ok.
Do you mean that we need to reflash the old (default tos.img) tos image file and try these steps?
- After flashing the QSPI with the old TOS image file, the unit always enters the UEFI Shell and does not boot any further. (Note: I had kept the updated StandaloneSTMM bin file under the /bootloader directory; is this causing the issue?)
Please find the serial log attached below:
- We tried the “Reset Setting” option, but again it does not boot.
Let us know how to fix this issue.
- We are also facing an NVMe flash issue, for which I have raised a separate thread. Please let us know how to fix these issues as soon as possible.
- I also tried flashing the latest Jetson Linux version, 36.4.3, on an Orin NX 16 GB, and it flashed fine (note: I downloaded this version to a USB SSD drive because we are short of space on the host PC's hard disk). The unit booted normally.
However, we want to configure this Orin NX as an EP device, so I modified the ‘jetson-orin-nano-devkit.conf’ file as per the link below and tried reflashing, but flashing fails partway through with the error “No file found /dev/nvme01”. Any idea why?
Thanks.
KevinFFF January 22, 2025, 4:00am 19
Hi Kevin,
I’ve followed the instructions at the top of the post. I ran into the issue described under Assertion 3 at the top of this thread on a Jetson AGX Orin 64 GB devkit running JP 6.0. I applied the Varint readfix patch, as it was noted that this fix is not contained in the JP 6.0 release.
I was able to build all images successfully and flash the QSPI after overwriting tos-optee_t234.img in the BSP bootloader directory.
I am running into errors when attempting to boot; it seems to crash quite early in the UEFI boot sequence. Here is abbreviated output showing the error:
I/TC: Test OEM keys are being used. This is insecure for shipping products!
I/TC: Primary CPU switching to normal world boot
Jetson UEFI firmware (version 36.3.0-gcid-36191598 built on 2024-05-06T16:58:59+00:00)
I/TC: Reserved shared memory is disabled
I/TC: Dynamic shared memory is enabled
I/TC: Normal World virtualization support is disabled
I/TC: Asynchronous notifications are disabled
E/TC:?? 00
E/TC:?? 00 User mode data-abort at address 0x40003ffc (translation fault)
E/TC:?? 00 esr 0x92000007 ttbr0 0x200103c236000 ttbr1 0x00000000 cidr 0x0
E/TC:?? 00 cpu #0 cpsr 0x80000000
E/TC:?? 00 x0 00000000405c8000 x1 0000000000000270
E/TC:?? 00 x2 000000001400bc00 x3 0000000000000000
E/TC:?? 00 x4 0000000000000000 x5 0000000000000000
E/TC:?? 00 x6 0000000000000000 x7 0000000000000000
E/TC:?? 00 x8 0000000000000000 x9 0000000000000000
E/TC:?? 00 x10 0000000000000000 x11 0000000000000000
E/TC:?? 00 x12 0000000000000000 x13 0000000000000000
E/TC:?? 00 x14 0000000000000000 x15 0000000000000000
E/TC:?? 00 x16 0000000000000000 x17 0000000000000000
E/TC:?? 00 x18 0000000000000000 x19 0000000000000000
E/TC:?? 00 x20 0000000000000000 x21 0000000000000000
E/TC:?? 00 x22 0000000000000000 x23 0000000000000000
E/TC:?? 00 x24 0000000000000000 x25 0000000000000000
E/TC:?? 00 x26 0000000000000000 x27 00000000edfe0dd0
E/TC:?? 00 x28 0000000000000000 x29 0000000040003ffc
E/TC:?? 00 x30 0000000000000000 elr 0000000040033010
E/TC:?? 00 sp_el0 00000000405c8000
E/TC:?? 00 region 0: va 0x0000000040000000 pa 0x000000103c082000 size 0x002000 flags ---R-X
E/TC:?? 00 region 1: va 0x0000000040002000 pa 0x000000103c209000 size 0x001000 flags ---RW-
E/TC:?? 00 region 2: va 0x0000000040004000 pa 0x000000103c480000 size 0x1c0000 flags r-xR--
E/TC:?? 00 region 3: va 0x00000000401c4000 pa 0x000000103c640000 size 0x419000 flags rw-RW-
E/TC:?? 00 region 4: va 0x00000000405dd000 pa 0x000000103ca59000 size 0x015000 flags rw-RW-
E/TC:?? 00 region 5: va 0x00000000405f2000 pa 0x000000000c198000 size 0x001000 flags rw----
E/TC:?? 00 region 6: va 0x00000000405f3000 pa 0x0000000003270000 size 0x010000 flags rw----
E/TC:?? 00 region 7: va 0x0000000040603000 pa 0x000000000c390000 size 0x002000 flags rw----
ERROR: Exception reason=0 syndrome=0xbe000011
ERROR: **************************************
ERROR: RAS Uncorrectable Error in IOB, base=0xe010000:
ERROR: Status = 0xec000612
ERROR: SERR = Error response from slave: 0x12
ERROR: IERR = CBB Interface Error: 0x6
ERROR: Overflow (there may be more errors) - Uncorrectable
ERROR: MISC0 = 0xc4460040
ERROR: MISC1 = 0xa4c860000000000
ERROR: MISC2 = 0x0
ERROR: MISC3 = 0x0
ERROR: ADDR = 0x8000000003270000
ERROR: **************************************
ERROR: sdei_dispatch_event returned -1
ERROR: **************************************
ERROR: RAS Uncorrectable Error in ACI, base=0xe01a000:
ERROR: Status = 0xe8000904
ERROR: SERR = Assertion failure: 0x4
ERROR: IERR = FillWrite Error: 0x9
ERROR: Overflow (there may be more errors) - Uncorrectable
ERROR: ADDR = 0x8000000003270000
ERROR: **************************************
ERROR: sdei_dispatch_event returned -1
ERROR: Powering off core
Any thoughts on what might be wrong here?
I’ve attached the logs from updating the QSPI flash as well as the output of booting the unit after the update:
jetson-r36.3.0-flash-uart.txt (71.5 KB)
jetson-r36.3.0-patched-tos-img-boot.txt (37.4 KB)
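One thing that can be read directly from the abbreviated log is where the faulting address sits relative to the memory regions it lists. The sketch below is only an illustration and not a diagnosis. It parses the "region N: va ... size ..." lines as printed in the log and checks which region, if any, covers a given virtual address. The function names are made up, and only the first four region lines are copied in to keep it short.

```python
import re

REGION_RE = re.compile(
    r"region\s+(\d+): va (0x[0-9a-fA-F]+) pa (0x[0-9a-fA-F]+) size (0x[0-9a-fA-F]+)"
)


def parse_regions(log_text):
    """Return (index, start_va, end_va) per region; end_va is exclusive."""
    regions = []
    for m in REGION_RE.finditer(log_text):
        start = int(m.group(2), 16)
        size = int(m.group(4), 16)
        regions.append((int(m.group(1)), start, start + size))
    return regions


def find_region(regions, addr):
    """Return the index of the region containing addr, or None if unmapped."""
    for index, start, end in regions:
        if start <= addr < end:
            return index
    return None


# Region lines copied from the log in this post (regions 0 to 3 only).
LOG = """\
E/TC:?? 00 region 0: va 0x0000000040000000 pa 0x000000103c082000 size 0x002000 flags ---R-X
E/TC:?? 00 region 1: va 0x0000000040002000 pa 0x000000103c209000 size 0x001000 flags ---RW-
E/TC:?? 00 region 2: va 0x0000000040004000 pa 0x000000103c480000 size 0x1c0000 flags r-xR--
E/TC:?? 00 region 3: va 0x00000000401c4000 pa 0x000000103c640000 size 0x419000 flags rw-RW-
"""

regions = parse_regions(LOG)
print("fault address 0x40003ffc ->", find_region(regions, 0x40003FFC))
print("sp_el0        0x405c8000 ->", find_region(regions, 0x405C8000))
```

With the values from the log, the fault address 0x40003ffc is not covered by any listed region: it falls between the end of region 1 (0x40003000) and the start of region 2 (0x40004000), while sp_el0 lies inside region 3. That is consistent with the reported translation fault, but it only locates the address and does not by itself explain why the code accessed it.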
KevinFFF February 4, 2025, 1:56am 21
Hi schubertseth,
We didn’t hit this RAS error when we were verifying the patch.
It might be caused by a mistake made while applying the change.
For the AGX Orin devkit, you can simply update to JetPack 6.2 (R36.4.3), which should include all fixes.
Hi Kevin,
I saw that the Varint readfix was in 6.2, and while that is an option, patching 6.0 would be ideal as it would save my lab quite a bit of time.
We do a bit of custom setup after flashing a dev kit (we have multiple kits to fix in this case). Patching would allow us to preserve that configuration rather than starting over.
Are there any suggested steps to troubleshoot this RAS error?