A minor bug can cause a system crash after 1,044 days of uninterrupted uptime. Be sure to reboot before then.

Semiconductors, especially CPUs, are immensely complex creations, all built at the microscopic level. That there aren’t more bugs is a testament to the effort chipmakers put into delivering solid products. But occasionally, something slips by.

AMD has issued an alert that an older processor line has a minor error. The problem exists in its Epyc 7002 line, code-named Rome, which was released three years ago. According to the bug report, first noted in a Reddit thread, servers running Rome-era chips will hang after 1,044 days of uptime, or nearly three years. The only way to recover the server is to reboot it.

AMD says it will not fix the underlying issue, but it points to a workaround. “AMD has successfully provided a remedy for an isolated challenge regarding 2nd Gen AMD EPYC processors where for some customers, a core within the processor could hang if running consistently for an extended period of time,” a company spokesperson said via email.

The bug is in what’s known as the C6 sleep state. To save energy when the CPU is idle, it can go into a low-power mode. CPUs have several power modes, which are collectively called “C-states” or “C-modes.” Intel first introduced them with the 486 processor, so the idea is hardly new.

C-states start at C0, which is the normal CPU operating mode. The higher the C number, the deeper into sleep mode the CPU goes and the more signals are turned off. The deeper the sleep state, the more time the CPU needs to fully wake up. With this bug, once a CPU goes into C6 past the 1,044-day mark, it gets stuck and a reboot is required.

The fix is either to reboot the server before the three-year mark or to disable the sleep state that causes the bug. That this bug even surfaced is a testament to the CPU’s performance; three years of uninterrupted uptime is remarkable.
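For admins who would rather schedule the reboot than be surprised by it, the arithmetic is easy to automate. The following is a minimal Python sketch, not an AMD-provided tool. It assumes a Linux host where `/proc/uptime` is available, and it treats system uptime as a stand-in for the 1,044-day figure reported above. The function names and the 30-day warning window are illustrative choices, not part of any vendor guidance.

```python
from pathlib import Path

HANG_THRESHOLD_DAYS = 1044  # figure reported for Epyc 7002 "Rome" chips
SECONDS_PER_DAY = 86_400


def days_until_threshold(uptime_seconds: float,
                         threshold_days: int = HANG_THRESHOLD_DAYS) -> float:
    """Return days left before uptime reaches the threshold (negative if past it)."""
    return threshold_days - uptime_seconds / SECONDS_PER_DAY


def needs_reboot_soon(uptime_seconds: float, warn_days: int = 30) -> bool:
    """True when the remaining margin is within the (assumed) warning window."""
    return days_until_threshold(uptime_seconds) <= warn_days


def read_uptime_seconds(proc_uptime: str = "/proc/uptime") -> float:
    """Read seconds since boot from Linux's /proc/uptime (first field)."""
    return float(Path(proc_uptime).read_text().split()[0])


if __name__ == "__main__":
    uptime = read_uptime_seconds()
    remaining = days_until_threshold(uptime)
    print(f"Uptime: {uptime / SECONDS_PER_DAY:.1f} days, "
          f"{remaining:.1f} days before the 1,044-day mark")
    if needs_reboot_soon(uptime):
        print("Schedule a reboot, or disable the C6 sleep state.")
```

Run from a cron job or a monitoring agent, a check like this gives weeks of notice to plan a maintenance window.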
You might think server updates would have dictated a reboot along the way, but then again, the Linux kernel can be patched without a reboot. Significant CPU bugs do happen, but not very often, and this certainly isn’t one of them.