Solved: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

On an amd64 machine running CentOS 5.5 x86_64, the e1000 network interfaces repeatedly go down and come back up, logging the following messages:

e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
 Tx Queue             <0>
 TDH                  <62>
 TDT                  <8d>
 next_to_use          <8d>
 next_to_clean        <62>
 time_stamp           <10037f7b6>
 next_to_watch        <62>
 jiffies              <10037fcd4>
 next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out
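
To get a feel for how often the hang occurs, the messages above can be counted in the kernel log. This is only a rough sketch: the function name count_tx_hangs is made up for illustration, and it assumes the log lines contain exactly the "e1000: ethX: e1000_clean_tx_irq: Detected Tx Unit Hang" text shown above.

```shell
# Hypothetical helper: count "Detected Tx Unit Hang" events for one interface.
# $1 = log file to scan, $2 = interface name (defaults to eth0).
# Assumes the message format is the same as in the excerpt above.
count_tx_hangs() {
    log_file="$1"
    iface="${2:-eth0}"
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c "e1000: ${iface}: e1000_clean_tx_irq: Detected Tx Unit Hang" "$log_file"
}
```

For example, on a system that writes kernel messages to /var/log/messages, "count_tx_hangs /var/log/messages eth0" prints the number of hangs logged for eth0.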

This issue usually appears on machines with 4GB or more memory. I tried a lot of things without any luck.

The fix comes with Intel’s newer e1000 driver.

Download the latest driver from the e1000 stable section on SourceForge (8.0.25 at the time of writing).

Build the RPM:

rpmbuild -tb /path/to/e1000-8.0.25.tar.gz

Install the new e1000 driver RPM:

rpm -ivh /usr/src/redhat/RPMS/x86_64/e1000-8.0.25-1.x86_64.rpm

Add the ignore_64bit_dma=1 driver option to /etc/modprobe.conf:

options e1000 ignore_64bit_dma=1
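
If you script this step, it helps to make it safe to run more than once. The following is a minimal sketch, not a definitive implementation: add_e1000_option is a hypothetical helper name, and it only checks for the exact line above, so an existing "options e1000" line carrying other parameters would have to be merged by hand.

```shell
# Hypothetical helper: append the option line only if it is not already present.
# $1 = configuration file (defaults to /etc/modprobe.conf, as used above).
add_e1000_option() {
    conf_file="${1:-/etc/modprobe.conf}"
    option_line='options e1000 ignore_64bit_dma=1'
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if grep -qxF "$option_line" "$conf_file" 2>/dev/null; then
        echo "already set in $conf_file"
    else
        echo "$option_line" >> "$conf_file"
        echo "added to $conf_file"
    fi
}
```

Run as root, "add_e1000_option" with no argument edits /etc/modprobe.conf; passing another path lets you try it on a copy first.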

Reboot and enjoy!
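
After the reboot it is worth confirming that the new driver is the one actually bound to the interface. "ethtool -i eth0" reports the driver name and version; the small filter below (driver_version is a made-up name for this sketch) pulls out just the version field. The exact version string is an assumption: it may carry a suffix after 8.0.25.

```shell
# Hypothetical helper: print the "version" field from `ethtool -i` output
# read on standard input. Fields are expected as "name: value" lines.
driver_version() {
    awk -F': ' '$1 == "version" { print $2 }'
}
```

Use it as "ethtool -i eth0 | driver_version" and compare the result with the version of the RPM you installed.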

From the e1000 README file:

Valid Range:   0-xxxxxxx (0=off)
Default Value: 0
Usage: insmod e1000.ko ignore_64bit_dma=1

When non zero the driver will only request DMA mapping of host memory
in the lower 4GB region. This provides a workaround for users of AMD platforms
GA-MA78G-DS3H & SM4021M-T2R+ that have reported TXHangs on system that have
>4GB RAM, suspected caused by some (no deep root cause) issue in the Dual
Address Cycle (DAC) DMA mechanism needed to access addresses above 4GB.
Setting ignore_64bit_dma to 1 activates the workaround.

This parameter is different than other parameters, in that it is a
single (not 1,1,1 etc.) parameter applied to all driver instances and
it is also available during runtime at

Update: if a newer kernel crashes with the ELRepo e1000 driver version 8.0.30 or older (detailed here), please update to the latest version, currently 8.0.35. Thanks again, ELRepo!