If You Use HPE SAS Drives You Must Update Their Firmware
One of the most common questions people ask in a hardware conversation is when a piece of equipment is likely to die. In most cases, it’s extremely difficult to find an answer, given that most manufacturers refuse to release this kind of data. Some customers of Hewlett Packard Enterprise, however, now have exactly the opposite issue. HPE knows exactly when some of its SAS (Serial Attached SCSI) SSDs are going to die, and it’s really hoping to get customers’ attention before something catastrophic happens.
HPE’s firmware notification is chock-full of interesting facts:
This HPD8 firmware is considered a critical fix and is required to address the issue detailed below. HPE strongly recommends immediate application of this critical fix. Neglecting to update to SSD Firmware Version HPD8 will result in drive failure and data loss at 32,768 hours of operation and require restoration of data from backup in non-fault tolerance, such as RAID 0 and in fault tolerance RAID mode if more drives fail than what is supported by the fault tolerance RAID mode logical drive. By disregarding this notification and not performing the recommended resolution, the customer accepts the risk of incurring future related errors.
Consider what HPE is saying here:
Our SSDs are absolutely going to fail, 100 percent of the time.
Now that you know about it, it’s 100 percent your fault if it doesn’t get fixed.
This catastrophic bug affects a wide range of products sold in the HPE ProLiant, Synergy, Apollo, JBOD D3xxx, D6xxx, D8xxx, MSA, StoreVirtual 4335 and StoreVirtual 3200 product lines. Once an affected SSD reaches 32,768 hours of operation (notable for being one more than 32,767, the maximum value of a 16-bit signed integer, which seems like it could be relevant to the issue at hand), the entire drive will lock up and die forever. You won’t be able to recover any data from it, by any means.
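HPE’s notification does not explain the underlying bug, so the following is only an illustration of why 32,768 looks suspicious, not a description of the actual firmware. It is a minimal Python sketch that assumes a power-on-hours counter stored as a 16-bit two's-complement signed integer; the helper name `to_int16` is hypothetical.

```python
INT16_MAX = 2**15 - 1  # 32,767, the largest 16-bit signed value


def to_int16(value: int) -> int:
    """Interpret an integer as a 16-bit two's-complement signed value.

    Assumption for illustration only: the counter wraps silently
    instead of saturating or raising an error.
    """
    return ((value + 0x8000) & 0xFFFF) - 0x8000


# The last hour the hypothetical counter can represent correctly.
last_good = to_int16(INT16_MAX)

# One hour later, the stored value wraps around to a large negative number.
first_bad = to_int16(INT16_MAX + 1)

print(last_good, first_bad)
```

If the firmware does anything with that counter that assumes it is non-negative, the hour at which it wraps is a plausible point for things to go wrong.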
How long is that? 3 years, 270 days, 8 hours. And since these drives were likely put into service simultaneously as part of a RAID array, it means that entire ranks of drives could fail virtually simultaneously. The notification has information on how to determine how long you’ve had the drives deployed, but no matter how long that is, the answer to the question is “Flash these drives immediately.”
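The arithmetic behind that figure is easy to check. The sketch below assumes 365-day years (ignoring leap days) and also shows how much margin a drive has left for a given power-on-hours reading; the function names and the 30,000-hour sample value are made up for illustration.

```python
FAILURE_HOURS = 32_768  # threshold quoted in HPE's notification
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365     # simplification: leap days are ignored


def breakdown(hours: int) -> tuple[int, int, int]:
    """Split a number of hours into (years, days, hours)."""
    days, rem_hours = divmod(hours, HOURS_PER_DAY)
    years, rem_days = divmod(days, DAYS_PER_YEAR)
    return years, rem_days, rem_hours


def hours_remaining(power_on_hours: int) -> int:
    """Hours left before the threshold, never negative."""
    return max(FAILURE_HOURS - power_on_hours, 0)


print(breakdown(FAILURE_HOURS))            # (3, 270, 8)
print(breakdown(hours_remaining(30_000)))  # (0, 115, 8)
```

A drive that has already logged 30,000 hours would have under four months left, which is why drives deployed together are likely to fail together.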
Firmware updates have already shipped for most of the affected SSDs. For the remaining models, whose updates will not be available until December 9, HPE has certified that the drives cannot be impacted by this flaw before those updates are available. On the other hand, it doesn’t say when they start becoming available, so our guidance is still pretty much the same. Come December 9, flash those drives immediately.