Athlon/Linux Extended Paging Bug Description (not confirmed yet: from Slashdot) |
A major Athlon CPU bug has been discovered, and it affects Linux 2.4. Note that this is a bug in the actual CPU itself, and is not a Linux bug. However, it becomes our problem because there are very many semi-broken Athlon/Duron/Athlon MP CPUs out there. Here are the details. As you may know, x86 systems have traditionally managed memory using 4K pages. However, with the introduction of the Pentium processor, Intel added a new feature called extended paging, which allows 4Mb pages to be used instead. Here's the problem -- many Athlon and Duron CPUs experience memory corruption when extended paging is used in conjunction with AGP. And, this problem hits us because Linux 2.4 kernels compiled with a Pentium-Classic or higher Processor family kernel configuration setting will automatically take advantage of extended paging (for kernel hackers out there, this is the X86_FEATURE_PSE constant defined in include/asm-i386/cpufeature.h.) Fortunately, there is a quick and easy fix for this problem. If you have been experiencing lockups on your Athlon, Duron or Athlon MP system when using AGP video, try passing the mem=nopentium option to your kernel (using GRUB or LILO) at boot-time. This tells Linux to go back to using 4K pages, avoiding this CPU bug. In addition, it should also be possible to avoid this problem by not using AGP on affected systems. As soon as I discovered that this CPU bug existed (which happened, unfortunately, because my CPU has the bug), I informed kernel hacker Andrew Morton of the issue; he put me in touch with Alan Cox. Alan is going to try to add some kind of Athlon/AGP CPU bug detection code to the kernel so that it will be able to auto-downgrade to 4K pages when necessary. The unfortunate thing about this situation is that AMD and others have known of this bug since September 2000. In fact, AMD's CPG technical marketing division announced this bug on September 21, 2000 in a technical note entitled Microsoft Windows 2000 Patch for AGP Applications on AMD Athlon and AMD Duron Processors (Technical Note TN17 revision 1). And, the kind folks at AMD even created a simple patch for Windows 2000 that disables extended paging by tweaking the registry. However, apparently AMD didn't realize that Linux 2.4 also uses extended paging when the kernel is compiled with a Pentium-Classic or higher Processor family kernel configuration setting. And, it looks like no one in the Linux community noticed that this "Microsoft Windows 2000/AGP Athlon/Duron bug" also applied to Linux 2.4 systems, probably because it was presented by AMD technical marketing as just that -- a Windows 2000-related AGP bug. An unfortunate miscommunication, which has resulted in lots of problems for Athlon, Duron and Athlon MP users. Here's something that's even more unsettling -- consider what kind of Linux users actually use AGP. That's right -- desktop users. And in what area has Linux been struggling? Yes, the desktop. One wonders how many negative desktop Linux experiences have resulted from this unfortunate problem. I don't know if any particular party is to blame for this issue. After all, AMD did prominently announce this bug when it was discovered. But due to an apparently unfortunate series of events, us Linux people never benefited from this knowledge. But Microsoft Windows 2000 and XP users did. Let's hope that all parties involved can keep things like this from happening in the future. |