I live where a power outage, while not extremely common, happens often enough that I should have anything remotely valuable or sensitive hooked up to a UPS (not merely a surge protector). However, I recently relocated a bunch of my electronics, and my Talos II-based workstation currently lives in my home office without the benefit of a UPS.
I had a power outage that lasted only about 5 minutes and appeared to be a straight cut followed by restoration. (Maintenance, cutover of a damaged circuit, whatever; I have no idea what happened.)
Unfortunately, this left my system unwilling to boot. The BMC LED on the mainboard would start flashing quickly when power was applied, but the power buttons did not respond and the system would not power on fully.
This suggested that the BMC, which controls power sequencing and bringup, was not functioning. After some troubleshooting to try to get a network connection into the BMC, it became apparent that the BMC was indeed at fault, somehow.
Eventually, that led me to this page on the Raptor Computing Wiki: https://wiki.raptorcs.com/wiki/Debricking_the_BMC describing the steps needed to restore functionality.
First, I needed to get a serial connection to the BMC itself, which required ordering a bracket. It appears that no vendor actually specifies what pinout their brackets use, and at least on the Amazon listing I ordered from, a reviewer stated the pinout incorrectly.
Some jumper wires later, I got the serial connection hooked up and was able to see that on applying power, I got nothing but the SoC reset output. No u-boot output, nothing except the SoC continuously resetting.
So now I needed to find a way to reflash u-boot. The above-linked wiki page links to ASPEED (the SoC manufacturer) for a download of a "socflash" utility. However, in their desire to be as unfriendly as possible to anyone who won't pay them money, they've removed that tool from public download and reserved it for developer accounts (which require a sales contact to get).
There's also a one-sentence statement that you can use flashrom with an external programmer to write the chip. That seemed like my best option, as I have a Segger J-Link, which flashrom supports as a SPI programmer.
Now, to find the information I needed on how to flash that chip. Since this is all fairly open, the chip is socketed, which meant I was able to remove the BMC flash chip, look up its pinout, and match that against the flashrom documentation for using the J-Link as a programmer.
|Flash Chip Pin||J-Link Pin||Comments|
|15 (SI)||5 (TDI)|
|8 (SO)||13 (TDO)|
|16 (SCLK)||9 (TCK)|
|7 (CS#)||15 (RESET)|
|2 (VCC)||1 (VTREF)||Also connected to 3.3V bench supply|
|10 (GND)||4 (GND)|
Also of note: when connecting to the J-Link with an IDC cable, the pin numbering is mirrored relative to the key on the connector. (The pin numbers above are for the pins on the J-Link itself.)
With this connected, I was able to convince flashrom to read the existing image from the flash and verify that connectivity was working and the chip was recognized:
$ ./flashrom/flashrom -p jlink_spi -r bmc-rom
flashrom v1.1-rc1-115-g61e16e5-dirty on Linux 5.3.0-gentoo (x86_64)
flashrom is free software, get the source code at https://flashrom.org
Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Found Macronix flash chip "MX25L25635F/MX25L25645G" (32768 kB, SPI) on jlink_spi.
Reading flash... done.
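A successful chip detection does not by itself prove that the data lines are reliable. One common sanity check, which is my suggestion rather than part of the procedure described here, is to read the chip twice into separate files, confirm the two reads are byte-identical, and confirm the dump is not a uniform fill of 0x00 or 0xFF (which usually points at a wiring problem). The sketch below assumes a second read saved as `bmc-rom.2`; that file name and the helper names are hypothetical.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_uniform(data):
    """True if the buffer is non-empty and every byte has the same value."""
    return len(data) > 0 and data.count(data[:1]) == len(data)


def dumps_look_sane(first_path, second_path):
    """Two reads should be byte-identical and not a uniform fill."""
    with open(first_path, "rb") as handle:
        first = handle.read()
    if is_uniform(first):
        return False
    return sha256_of(first_path) == sha256_of(second_path)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 check_dump.py bmc-rom bmc-rom.2
    ok = dumps_look_sane(sys.argv[1], sys.argv[2])
    print("dumps match" if ok else "dumps differ or look blank")
```

If the two reads differ, shortening the jumper wires or rechecking the ground connection is a reasonable first step before attempting any write.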
Now I just needed the flash image and some idea of how to write it.
Downloading the 1.07 firmware from the wiki, I ended up with a few files, and didn't know what to do with them.
Thankfully, a few folks on the #talos-workstation channel on FreeNode were able to sort me out. They provided the context I needed: the image-bmc file in the firmware download is the entire flash image combined. They also provided a link to the kernel DTS (https://git.anastas.io/dormito/br-blackbird-external/src/branch/raw-first-pass/board/bangBMC/blackbird.dts), which shows the offsets of the flash regions on the chip.
(Thanks to mearon, dormito, and hanetzer)
Thus I was able to construct a flashrom layout file:
00000000:0005ffff u-boot
00060000:0007ffff u-boot-env
00080000:0067ffff kernel_a
00680000:0077ffff dev-data
00780000:00d7ffff kernel_b
00d80000:01ffffff rwfs
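Since the layout was transcribed by hand from the DTS, it is worth checking that the regions tile the chip with no gaps or overlaps. The following Python sketch is my own addition: it parses the layout above (one `start:end name` entry per line, with inclusive hex offsets) and checks it against the 32768 kB size that flashrom reported. The function names are hypothetical.

```python
# Layout entries copied from the layout file above.
LAYOUT = """\
00000000:0005ffff u-boot
00060000:0007ffff u-boot-env
00080000:0067ffff kernel_a
00680000:0077ffff dev-data
00780000:00d7ffff kernel_b
00d80000:01ffffff rwfs
"""

# flashrom reported the chip as 32768 kB.
CHIP_SIZE = 32768 * 1024


def parse_layout(text):
    """Parse 'start:end name' lines into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        span, name = line.split(None, 1)
        start, end = (int(part, 16) for part in span.split(":"))
        regions.append((start, end, name.strip()))
    return regions


def check_layout(regions, chip_size):
    """Return a list of problems; an empty list means the regions tile the chip."""
    problems = []
    next_free = 0
    for start, end, name in sorted(regions):
        if end < start:
            problems.append(f"{name}: end 0x{end:x} is before start 0x{start:x}")
        if start > next_free:
            problems.append(f"gap before {name}: 0x{next_free:x}-0x{start - 1:x}")
        elif start < next_free:
            problems.append(f"{name} overlaps the previous region")
        next_free = max(next_free, end + 1)
    if next_free != chip_size:
        problems.append(f"regions end at 0x{next_free:x}, chip size is 0x{chip_size:x}")
    return problems


if __name__ == "__main__":
    issues = check_layout(parse_layout(LAYOUT), CHIP_SIZE)
    print("layout OK" if not issues else "\n".join(issues))
```

The last region ends at 0x01ffffff, so the six regions together cover exactly 0x02000000 bytes, which matches the reported chip size.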
A caveat with the above: I only tested writing to the u-boot section, so please use it at your own risk and with a full understanding of what you are trying to accomplish.
And with some trepidation, I wrote the u-boot section with the following command:
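One way to reduce the risk before writing anything is to compare the backup read from the chip against the stock image, region by region, to see where they actually differ. This is a sketch of that idea in Python, not something I ran as part of the recovery. It assumes both files are full-chip images of the same size, and `differing_regions` is a hypothetical helper name. Regions that hold runtime data, presumably u-boot-env and rwfs, would likely differ even on a healthy BMC, so a mismatch there is not necessarily a sign of corruption.

```python
import sys

# Inclusive byte ranges copied from the layout file above.
REGIONS = {
    "u-boot": (0x00000000, 0x0005FFFF),
    "u-boot-env": (0x00060000, 0x0007FFFF),
    "kernel_a": (0x00080000, 0x0067FFFF),
    "dev-data": (0x00680000, 0x0077FFFF),
    "kernel_b": (0x00780000, 0x00D7FFFF),
    "rwfs": (0x00D80000, 0x01FFFFFF),
}


def differing_regions(image, dump, regions=REGIONS):
    """Return the names of regions whose bytes differ between two buffers."""
    if len(image) != len(dump):
        raise ValueError("image and dump must be the same size")
    return [
        name
        for name, (start, end) in regions.items()
        if image[start:end + 1] != dump[start:end + 1]
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 diff_regions.py image-bmc bmc-rom
    with open(sys.argv[1], "rb") as handle:
        stock = handle.read()
    with open(sys.argv[2], "rb") as handle:
        backup = handle.read()
    for region_name in differing_regions(stock, backup):
        print(f"{region_name} differs")
```

The size check matters on its own as well: flashrom expects the file passed to `-w` to be exactly the size of the chip.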
$ ./flashrom/flashrom -p jlink_spi -l bmc.layout -i u-boot -w image-bmc
flashrom v1.1-rc1-115-g61e16e5-dirty on Linux 5.3.0-gentoo (x86_64)
flashrom is free software, get the source code at https://flashrom.org
Using region: "u-boot".
Using clock_gettime for delay loops (clk_id: 1, resolution: 1ns).
Found Macronix flash chip "MX25L25635F/MX25L25645G" (32768 kB, SPI) on jlink_spi.
Reading old flash chip contents... done.
Erasing and writing flash chip... Erase/write done.
Verifying flash... VERIFIED.
Leading to success, thankfully.
A bit of careful re-seating of the chip into the motherboard socket and I was able to get a fully booted BMC, and the rest of my system again, no worse for wear otherwise.