Openbmc archive mirror
 help / color / mirror / Atom feed
From: Oskar Senft <osk@google.com>
To: OpenBMC Maillist <openbmc@lists.ozlabs.org>
Cc: Ed Tanous <edtanous@google.com>, Ali El-Haj-Mahmoud <aaelhaj@google.com>
Subject: ipmitool fru write 0 - does not update "baseboard" FRU
Date: Thu, 2 May 2024 14:31:51 -0400	[thread overview]
Message-ID: <CABoTLcQvFfJPhM=W12Z-QnqRzkRt5TE+g0vunw4MB=aYuo-GPA@mail.gmail.com> (raw)

Hi everyone

tl;dr - Attempting "ipmitool fru write" with an input file that
contains additional bytes beyond the actual FRU data does not actually
update the FRU on OpenBMC at head w/ entity-manager.

Details:

I ran into an issue with updating the "baseboard" FRU (0), which is
stored as /etc/fru/baseboard.fru.bin. However, I don't think this is
specific to FRU 0 and could apply to other updateable FRUs in the same
way.


1. Start off with a "minimal" /etc/fru/baseboard.fru.bin which just
contains chassis type (that's required for entity-manager's fru-device
to recognize the file.

root@akita:~# ls -al /etc/fru/baseboard.fru.bin
-rw-r--r--    1 root     root           512 Sep 21 05:21
/etc/fru/baseboard.fru.bin

root@akita:~# hexdump -C /etc/fru/baseboard.fru.bin
00000000  01 00 01 00 00 00 00 fe  01 01 17 c0 c0 c1 00 a6  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200

root@akita:~# ipmitool fru print 0
 Chassis Type          : Rack Mount Chassis
 Chassis Area Checksum : OK


2. Prepare a FRU file with additional data, e.g. with frugy:

(frugy) osk@osk:~$ cat demo.yml
ChassisInfo:
  type: 23
  part_number: '4711'
  serial_number: '12345'

(frugy) osk@osk:~$ frugy demo.yml -o demo.bin

(frugy) osk@osk:~$ ls -al demo.bin
-rw-r----- 1 osk primarygroup 24 May  2 13:51 demo.bin

(frugy) osk@osk:~$ hexdump -C demo.bin
00000000  01 00 01 00 00 00 00 fe  01 02 17 c4 34 37 31 31  |............4711|
00000010  c5 31 32 33 34 35 c1 d0                           |.12345..|
00000018

Note that frugy generates a minimal binary by default. However, other
processes may not and instead produce a fixed size binary blob. This
happens, e.g. when performing "ipmitool fru read" which returns the
whole contents of the underlying storage. A subsequent "ipmitool fru
write" with that blob will fail as explained here.

To simulate this here, increase the file to 256 bytes:

(frugy) osk@osk:~$ cp demo.bin demo-256.bin
(frugy) osk@osk:~$ dd if=/dev/zero bs=1 seek=256 count=0 of=demo-256.bin
0+0 records in
0+0 records out
0 bytes copied, 5.1138e-05 s, 0.0 kB/s

(frugy) osk@osk:~$ ls -al demo-256.bin
-rw-r----- 1 osk primarygroup 256 May  2 13:57 demo-256.bin

(frugy) osk@osk:~$ hexdump -C demo-256.bin
00000000  01 00 01 00 00 00 00 fe  01 02 17 c4 34 37 31 31  |............4711|
00000010  c5 31 32 33 34 35 c1 d0  00 00 00 00 00 00 00 00  |.12345..........|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100


3. Write the newly generated FRU:

root@akita:~# ipmitool fru write 0 demo-256.bin
Fru Size         : 512 bytes
Size to Write    : 256 bytes

Wait ~10 seconds for the fru-device process to reload the contents.


4. Re-read the FRU contents:

root@akita:~# ipmitool fru print 0
 Chassis Type          : Rack Mount Chassis
 Chassis Area Checksum : OK


5. For comparison, write only the minimal FRU:

root@akita:~# ipmitool fru write 0 demo.bin
Fru Size         : 512 bytes
Size to Write    : 24 bytes

Wait ~10 seconds.

root@akita:~# ipmitool fru print 0
 Chassis Type          : Rack Mount Chassis
 Chassis Part Number   : 4711
 Chassis Serial        : 12345
 Chassis Area Checksum : OK


I dug into this and found that this is caused by an interaction
between https://github.com/openbmc/phosphor-host-ipmid/blob/master/dbus-sdr/storagecommands.cpp
and https://github.com/openbmc/entity-manager/blob/master/src/fru_device.cpp.

Here's what happens:
- The "ipmitool fru write" request is handled by storagecommands.cpp
ipmiStorageWriteFruData.

- ipmiStorageWriteFruData receives the whole FRU blob from IPMI across
several calls. On the initial call, it requests the current FRU blob
via D-bus from fru-device and caches it in the process. It then
overwrites sections of that cache with the received data from IPMI.

- ipmiStorageWriteFruData uses information from the FRU header to
check whether it received all the bytes that make up the new FRU. Note
that this could be fewer bytes than the size of the cached blob. Once
it receives all the bytes for the new FRU, it calls
/xyz/openbmc_project/FruDevice WriteFru via D-Bus with the full FRU
blob (i.e. the full cache with modifications on top). After that the
cache is cleared.

- The D-Bus call to WriteFru is handled by fru_device.cpp writeFRU. It
first performs a sanity check and then writes the blob to the
underlying device (or the /etc/fru/baseboard.fru.bin for FRU 0). It
subsequently calls rescanBusses() which actually executes after about
1 second.

- Meanwhile, "ipmitool fru write" continues to happily send additional
bytes to ipmiStorageWriteFruData (as in step #3 above). Since its
cache has just been cleared, it retrieves the current FRU from
fru-device again. However, since that has not yet completed
"rescanBusses", it actually received the original FRU again, not the
modified version. The above cycle repeats, but this time the original
FRU with additional modifications from the additional bytes.

In the best case (if the new FRU data is larger than the original FRU
data), the result is that the FRU did not get updated at all, since we
effectively just wrote back the original FRU. However, if the new FRU
data is smaller than the original FRU data, this may result in
corrupted FRU data to be persisted.


I was trying to figure out how to best fix that, but couldn't come up
with a design that would still work. Some ideas:

A)  I think what we're missing is feedback from fru-device to ipmid
that the FRU write operation has actually completed and the FRU data
was re-read. The ipmid should probably not accept any additional write
requests until the previous write request has completed and the new
FRU data is available.

B) If ipmiStorageWriteFruData cannot detect the expected size of the
FRU data, it relies on a timeout to eventually write the blob if no
more data was received from IPMI. I wonder if this mechanism is "too
clever" and we should simply always just wait for the timeout?

C) ipmiStorageWriteFruData clears the cache immediately after WriteFru
was called. Maybe we should keep that cache around until the timeout
expires?

Thoughts?

Thanks
Oskar.

             reply	other threads:[~2024-05-02 18:33 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-02 18:31 Oskar Senft [this message]
2024-05-03 15:38 ` ipmitool fru write 0 - does not update "baseboard" FRU Willy Tu
2024-05-07 17:32   ` Oskar Senft
2024-05-07 23:49     ` Willy Tu
2024-05-08  0:04       ` Oskar Senft
2024-05-09 16:17         ` Willy Tu
2024-05-20 13:03           ` Oskar Senft

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABoTLcQvFfJPhM=W12Z-QnqRzkRt5TE+g0vunw4MB=aYuo-GPA@mail.gmail.com' \
    --to=osk@google.com \
    --cc=aaelhaj@google.com \
    --cc=edtanous@google.com \
    --cc=openbmc@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).