Signing error trying to flash nvme drive on a orin nano 8gb devkit #1794

sgstreet · 2025-01-06T03:01:11Z

Describe the bug
I'm new to OE4T and I'm trying to set up a tegra-demo-distro project running from an NVMe drive (nvme0). I configured and built demo-image-base using the jetson-orin-nano-devkit-nvme machine. When I try to use initrd-flash to write to the NVMe device, I get the following errors:

== Step 1: Signing binaries at 2025-01-05T18:45:40-08:00 ==
ERR: chip_info.bin_bak missing after dumping boardinfo
ERR: signing failed at 2025-01-05T18:45:42-08:00

I seem to be missing something, but I'm unsure what.

To Reproduce
Steps to reproduce the behavior:

Build meta-tegrademo branch 'master' with jetson-orin-nano-devkit-nvme
Build with bitbake argument 'demo-image-base'
Deploy to hardware with method 'initrd-flash'
I get an error

 ./initrd-flash
Starting at 2025-01-05T18:45:40-08:00
Machine:       jetson-orin-nano-devkit-nvme
Rootfs device: nvme0n1p1
Found Jetson device in recovery mode at USB 1-5
== Step 1: Signing binaries at 2025-01-05T18:45:40-08:00 ==
ERR: chip_info.bin_bak missing after dumping boardinfo
ERR: signing failed at 2025-01-05T18:45:42-08:00
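
When comparing runs across different host USB ports, it can help to reduce each initrd-flash output to the USB location it reported and the ERR lines it printed. The following is a minimal sketch, assuming only the line formats visible in the output above; the helper name `summarize` is hypothetical and is not part of initrd-flash.

```python
import re

# Sample copied from the initrd-flash output quoted in this report.
SAMPLE = """\
Starting at 2025-01-05T18:45:40-08:00
Machine:       jetson-orin-nano-devkit-nvme
Rootfs device: nvme0n1p1
Found Jetson device in recovery mode at USB 1-5
== Step 1: Signing binaries at 2025-01-05T18:45:40-08:00 ==
ERR: chip_info.bin_bak missing after dumping boardinfo
ERR: signing failed at 2025-01-05T18:45:42-08:00
"""

# Assumed format: the script prints the USB bus-port path at the end of this line.
USB_RE = re.compile(r"^Found Jetson device in recovery mode at USB (\S+)$")


def summarize(output):
    """Collect USB locations and ERR messages from initrd-flash output."""
    usb_ports = []
    errors = []
    for raw in output.splitlines():
        line = raw.strip()
        match = USB_RE.match(line)
        if match:
            usb_ports.append(match.group(1))
        elif line.startswith("ERR:"):
            # Keep only the message text after the "ERR:" prefix.
            errors.append(line[len("ERR:"):].strip())
    return {"usb_ports": usb_ports, "errors": errors}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Running the same summary on the output from each host port makes it easy to see which ports fail at the signing step.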

I'm seeing the following on the console when running `initrd-flash`:

[0013.718] E> BLOCK_DEV: Failed to open blockdev.
[0013.723] E> LOADER: Failed to open blockdev 0(0).
[0013.728] E> LOADER: Failed to get storage info for binary 21 from loader.
[0013.735] C> LOADER: Could not read binary 21.
[0013.739] E> Failed to load MB2
[0013.742] C> Task 0x46 failed (err: 0x27228311)
[0013.747] E> Top caller module: MB2_PARAMS, error module: LOADER, reason: 0x11, aux_info: 0x83
[0013.755] C> Boot Info Table status dump :
0111100000111000110111111111000000011110000000000000011000001
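
To see which boot stage reports the failure, the console lines can be split into timestamp, level tag, module, and message. This is a rough sketch, assuming the `[time] LEVEL> MODULE: message` layout seen in the lines above; the meaning of the `E` and `C` tags is not interpreted here, and the names `LINE_RE` and `parse` are hypothetical.

```python
import re
from collections import Counter

# Assumed layout: "[0013.723] E> LOADER: message". The module prefix is
# optional, since some lines above carry only a message.
LINE_RE = re.compile(
    r"^\[?(?P<time>\d+\.\d+)\]\s+(?P<level>[A-Z])>\s+"
    r"(?:(?P<module>[A-Z][A-Z0-9_]*):\s+)?(?P<message>.*)$"
)

# Sample copied from the console output quoted in this report.
SAMPLE = """\
[0013.718] E> BLOCK_DEV: Failed to open blockdev.
[0013.723] E> LOADER: Failed to open blockdev 0(0).
[0013.728] E> LOADER: Failed to get storage info for binary 21 from loader.
[0013.735] C> LOADER: Could not read binary 21.
[0013.739] E> Failed to load MB2
[0013.742] C> Task 0x46 failed (err: 0x27228311)
"""


def parse(text):
    """Return one dict per console line that matches the assumed layout."""
    records = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match:
            record = match.groupdict()
            record["time"] = float(record["time"])
            records.append(record)
    return records


records = parse(SAMPLE)
# Count messages per module; lines without a module prefix are counted under None.
by_module = Counter(record["module"] for record in records)

if __name__ == "__main__":
    for module, count in by_module.most_common():
        print(module, count)
```

Grouping by module shows that most of the messages in this excerpt come from LOADER, following the first BLOCK_DEV failure.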


dwalkes · 2025-01-10T17:53:19Z

@sgstreet I attempted to reproduce today with my jetson-orin-nano-devkit-nvme setup including from reboot force-recovery as discussed in the monthly meeting and I couldn't reproduce.

I did notice this warning message I hadn't seen previously at step 1:

Rootfs device: nvme0n1p1
Found Jetson device in recovery mode at USB 1-2
== Step 1: Signing binaries at 2025-01-10T10:34:46-07:00 ==
Partition not found: A_cpu-bootloader

Full console log:

Starting at 2025-01-10T10:34:46-07:00
Machine:       jetson-orin-nano-devkit-nvme
Rootfs device: nvme0n1p1
Found Jetson device in recovery mode at USB 1-2
== Step 1: Signing binaries at 2025-01-10T10:34:46-07:00 ==
Partition not found: A_cpu-bootloader
== Step 2: Boot Jetson via RCM at 2025-01-10T10:35:15-07:00 ==
Found Jetson device in recovery mode at USB 1-2
== Step 3: Sending flash sequence commands at 2025-01-10T10:35:19-07:00 ==
Waiting for USB storage device flashpkg from 054bb250........[/dev/sdc]
Device size in blocks: 262144
Unmounted /dev/sdc.
== Step 4: Writing partitions on external storage device at 2025-01-10T10:35:46-07:00 ==
Waiting for USB storage device nvme0n1 from 054bb250...[/dev/sdc]
Creating partitions
  [03] name=A_kernel start=0 size=262144 sectors
  [04] name=A_kernel-dtb start=0 size=1536 sectors
  [05] name=A_reserved_on_user start=0 size=64768 sectors
  [06] name=B_kernel start=0 size=262144 sectors
  [07] name=B_kernel-dtb start=0 size=1536 sectors
  [08] name=B_reserved_on_user start=0 size=64768 sectors
  [09] name=recovery start=0 size=163840 sectors
  [10] name=recovery-dtb start=0 size=1024 sectors
  [11] name=esp start=0 size=131072 sectors
  [12] name=recovery_alt start=0 size=163840 sectors
  [13] name=recovery-dtb_alt start=0 size=1024 sectors
  [14] name=esp_alt start=0 size=131072 sectors
  [15] name=UDA start=0 size=819200 sectors
  [16] name=reserved start=0 size=982016 sectors
  [01] name=APP start=0 size=29360128 sectors
  [02] name=APP_b start=0 size=29360128 sectors
Writing partitions
  Writing boot.img (size=41297920) to /dev/sdc3 (size=134217728)...
  Writing kernel_tegra234-p3768-0000+p3767-0005-nv.dtb (size=249497) to /dev/sdc4 (size=786432)...
  Writing boot.img (size=41297920) to /dev/sdc6 (size=134217728)...
  Writing kernel_tegra234-p3768-0000+p3767-0005-nv.dtb (size=249497) to /dev/sdc7 (size=786432)...
  Writing esp.img (size=67108864) to /dev/sdc11 (size=67108864)...
  Writing demo-image-base.ext4 (size=15032385536) to /dev/sdc1 (size=15032385536)...
  Writing demo-image-base.ext4 (size=15032385536) to /dev/sdc2 (size=15032385536)...
[OK: /dev/sdc]
== Step 5: Waiting for final status from device at 2025-01-10T10:37:18-07:00 ==
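
For comparing a working run against a failing one, the step banners in this log can be turned into per-step durations. The sketch below assumes only the `== Step N: ... at <ISO timestamp> ==` banner format shown above; `step_durations` is a hypothetical helper, not something shipped with initrd-flash.

```python
import re
from datetime import datetime

# Assumed banner format, as printed in the log above.
STEP_RE = re.compile(r"^== Step (\d+): (.+) at (\S+) ==$")

# Step banners copied from the console log in this comment.
SAMPLE = """\
== Step 1: Signing binaries at 2025-01-10T10:34:46-07:00 ==
== Step 2: Boot Jetson via RCM at 2025-01-10T10:35:15-07:00 ==
== Step 3: Sending flash sequence commands at 2025-01-10T10:35:19-07:00 ==
== Step 4: Writing partitions on external storage device at 2025-01-10T10:35:46-07:00 ==
== Step 5: Waiting for final status from device at 2025-01-10T10:37:18-07:00 ==
"""


def step_durations(log):
    """Return (step number, step name, seconds until the next step banner)."""
    steps = []
    for raw in log.splitlines():
        match = STEP_RE.match(raw.strip())
        if match:
            started = datetime.fromisoformat(match.group(3))
            steps.append((int(match.group(1)), match.group(2), started))
    durations = []
    # The last step has no following banner, so it gets no duration.
    for (number, name, start), (_, _, next_start) in zip(steps, steps[1:]):
        durations.append((number, name, (next_start - start).total_seconds()))
    return durations


if __name__ == "__main__":
    for number, name, seconds in step_durations(SAMPLE):
        print(f"Step {number} ({name}): {seconds:.0f} s")
```

In this run the signing step took 29 seconds, whereas the failing run in the original report stopped within 2 seconds of starting step 1.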

Here are the host and device logs for comparison
log.initrd-flash.2025-01-10-10.34.zip
device-logs-2025-01-10-10.34.46.tar.gz

Not sure what could be happening, but as discussed in the meeting I'd try another USB host controller if you have one as I know this has caused odd failures in the past.

If you want to try the same tegraflash file I built to verify your host flashing setup you can message me on element or via email and I'll send a link.

kekiefer · 2025-01-10T18:03:51Z

> @sgstreet I attempted to reproduce today with my jetson-orin-nano-devkit-nvme setup including from reboot force-recovery as discussed in the monthly meeting and I couldn't reproduce.

One thing to consider was that it wasn't clear if an A/B setup was used unintentionally via NVIDIA's tools.

A reboot forced-recovery from the B slot of an A/B setup induces the error due to a mismatch between the in-memory scratch register status and what is expected during recovery. Here's a link to a thread on the NVIDIA forums that includes a fairly thorough investigation: https://forums.developer.nvidia.com/t/mb1-bl-crash-when-rebooting-to-rcm-from-b-slot/309503/13

dwalkes · 2025-01-10T18:10:17Z

True, and thanks @kekiefer. However, we should use A/B by default on tegra-demo-distro, and this should match NVIDIA's setup. We should also boot to the A slot on first boot. Still, nvbootctrl dump-slots-info might be a good diagnostic to check as well. I verified that I ended up on the A slot on first boot, despite the "Partition not found: A_cpu-bootloader" warning during initrd-flash:

root@jetson-orin-nano-devkit-nvme:~# nvbootctrl dump-slots-info
Current version: 36.4.0
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: A
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal
root@jetson-orin-nano-devkit-nvme:~#
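
If this check needs to be scripted, the output above can be parsed into a small dictionary and tested for the current slot. This is a minimal sketch, assuming the output format stays as shown; `parse_slots_info` and `booted_from_slot_a` are hypothetical names, and the sample text would be replaced by the real command output captured on the device.

```python
import re

# Output copied from the nvbootctrl dump-slots-info run shown above.
SAMPLE = """\
Current version: 36.4.0
Capsule update status: 0
Current bootloader slot: A
Active bootloader slot: A
num_slots: 2
slot: 0,             status: normal
slot: 1,             status: normal
"""

# Assumed per-slot line format: "slot: <n>,   status: <word>".
SLOT_RE = re.compile(r"^slot:\s*(\d+),\s*status:\s*(\S+)$")


def parse_slots_info(text):
    """Parse 'key: value' lines plus per-slot status lines into a dict."""
    info = {"slots": {}}
    for raw in text.splitlines():
        line = raw.strip()
        match = SLOT_RE.match(line)
        if match:
            info["slots"][int(match.group(1))] = match.group(2)
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def booted_from_slot_a(info):
    """True when the reported current bootloader slot is A."""
    return info.get("Current bootloader slot") == "A"


if __name__ == "__main__":
    info = parse_slots_info(SAMPLE)
    print("Current slot is A:", booted_from_slot_a(info))
    print("Slot status:", info["slots"])
```

The check only reports which slot the device is currently on; it says nothing about how the device got there.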

kekiefer · 2025-01-10T18:15:45Z

That's not enough: reboot forced-recovery works fine if you're starting on the A slot. The problem happens when you run it from a B slot.

dwalkes · 2025-01-10T18:16:31Z

Yep understood, just don't understand how @sgstreet would have gotten into that situation without running a capsule update.

kekiefer · 2025-01-10T18:18:22Z

Ok yes, and I'm taking a logical leap in equating this issue with the one I've outlined, just because it manifests the same way.

To be clear, the connection I'm trying to make is that entering this flash was done from a B root, that was set up by NVIDIA's tools without @sgstreet being aware of it.

sgstreet · 2025-01-10T21:20:40Z

@kekiefer @dwalkes, sorry, I was offline this morning. I'm rebuilding from scratch to try to eliminate operator errors. Can anyone point me to the means to reflash the QSPI? I want to ensure it is correct.

kekiefer · 2025-01-10T21:24:26Z

The initrd-flash script will take care of that for you

sgstreet · 2025-01-11T17:56:32Z

After some gnashing of teeth (no bootloader due to corruption by the operator, me) and some random but unfounded concerns that I had let the magic smoke out of my board, I successfully used initrd-flash to flash both the QSPI and an SD card with positive results. A working system!

The reported signing error is caused by, hold your breath, an incompatible SS USB3 port. Using a USB2 port works better. I'm sorry for the newbie runaround! Thank you for the hand-holding!

Next up, flashing the NVMe image.

sgstreet · 2025-01-11T18:20:21Z

Well, I guess I lied. I successfully used ./doflash.sh, not initrd-flash. I'm seeing some issues in the initrd-flash logs. I'll post an assessment later this afternoon.

dwalkes · 2025-01-11T18:24:15Z

In the meantime I've started a troubleshooting page at https://github.com/OE4T/meta-tegra/wiki/Tegraflash-Troubleshooting as discussed in the meeting this week, attempting to list the suggested troubleshooting steps in rough priority order. Whatever we learn here might be a new entry in the list.

sgstreet · 2025-01-13T16:43:20Z

Closing this as an operator error.

dwalkes · 2025-01-13T16:51:13Z

Thanks @sgstreet, what was the issue? Any updates we should make to https://github.com/OE4T/meta-tegra/wiki/Tegraflash-Troubleshooting or the other wiki pages?

sgstreet · 2025-01-13T18:15:44Z

> Thanks @sgstreet what was the issue? Any updates we should make to https://github.com/OE4T/meta-tegra/wiki/Tegraflash-Troubleshooting or the other wiki pages?

The issue was that none of my motherboard's SS USB3 ports are compatible with the tegra234 boot ROM USB stack. For clarity, the USB ports are all on my motherboard and not on an external USB3 card. I tried both the CPU and chipset ports.

> Any updates we should make to https://github.com/OE4T/meta-tegra/wiki/Tegraflash-Troubleshooting or the other wiki pages?

I suspect there will be more debug steps once I get the NVMe flashing to work; I currently believe that failure is another operator error. That's why I moved to gitter. Do you want this on the GitHub discussion instead? I'm super flexible and just need pointing in the right direction.

dwalkes · 2025-01-13T18:18:03Z

> That's why I moved to gitter. Do you want this on the github discussion instead?

gitter is fine. If you can circle back to help update this one with the resolution when we have it, after I've forgotten, that would be great ;)

sgstreet closed this as completed Jan 13, 2025
