e1000e NIC Driver stops working in XEN on SLES 11 SP1

This document (7010212) is provided subject to the disclaimer at the end of this document.

Environment

SUSE Linux Enterprise Server 11 Service Pack 1

Situation

After running for some time, e1000e-based network cards stop working. The problem is usually seen in a bonding configuration.

The following is logged in dmesg when the problem occurs:

[42957.145619] ------------[ cut here ]------------
[42957.145629] WARNING: at /usr/src/packages/BUILD/kernel-xen-2.6.32.36/linux-2.6.32/net/sched/sch_generic.c:261 dev_watchdog+0x2a5/0x2c0()
[42957.145633] Hardware name: PRIMERGY RX300 S6             
[42957.145635] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
[42957.145637] Modules linked in: usbbk gntdev netbk blkbk blkback_pagemap
blktap domctl xenbus_be ipv6 bridge 8021q garp stp llc bonding microcode
ipmi_devintf ipmi_si ipmi_msghandler fuse loop dm_mod i2c_i801 usbhid tpm_tis
rtc_cmos tpm sr_mod tpm_bios igb i2c_core iTCO_wdt pcspkr rtc_core hid dca
iTCO_vendor_support 8250_pnp ses 8250 enclosure rtc_lib sg serial_core e1000e
container power_meter ac button uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif
xenblk cdrom xennet edd ext3 mbcache jbd fan processor ide_pci_generic ide_core
ata_generic ata_piix libata megaraid_sas scsi_mod thermal thermal_sys hwmon
[42957.145684] Supported: Yes
[42957.145687] Pid: 0, comm: swapper Tainted: G        W    2.6.32.36-0.5-xen #1
[42957.145690] Call Trace:
[42957.145705]  [<ffffffff80009b95>] dump_trace+0x65/0x180
[42957.145715]  [<ffffffff80351eb8>] dump_stack+0x69/0x71
[42957.145724]  [<ffffffff8003e544>] warn_slowpath_common+0x74/0xd0
[42957.145730]  [<ffffffff8003e5f0>] warn_slowpath_fmt+0x40/0x50
[42957.145735]  [<ffffffff802c9975>] dev_watchdog+0x2a5/0x2c0
[42957.145744]  [<ffffffff8004aba2>] run_timer_softirq+0x1b2/0x2e0
[42957.145750]  [<ffffffff80044d2e>] __do_softirq+0xde/0x1a0
[42957.145756]  [<ffffffff8000800c>] call_softirq+0x1c/0x30
[42957.145761]  [<ffffffff800096d5>] do_softirq+0xa5/0xe0
[42957.145766]  [<ffffffff80044b95>] irq_exit+0x55/0x60
[42957.145772]  [<ffffffff8025a5a2>] evtchn_do_upcall+0x2f2/0x360
[42957.145779]  [<ffffffff80007a6e>] do_hypervisor_callback+0x1e/0x30
[42957.145786]  [<ffffffff800033aa>] 0xffffffff800033aa
[42957.145796]  [<ffffffff8000a9e5>] xen_safe_halt+0x15/0x60
[42957.145802]  [<ffffffff8000dfad>] xen_idle+0x5d/0x70
[42957.145807]  [<ffffffff800065cf>] cpu_idle+0x5f/0xa0
[42957.145814]  [<ffffffff8063ebd5>] start_kernel+0x2b4/0x383
[42957.145818] ---[ end trace 2da945db8eeba80d ]---
[42957.217619] bonding: bond0: link status definitely down for interface eth0, disabling it
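
To check whether a saved log contains the same symptom, the watchdog line can be matched with a small script. The following is only a minimal sketch: the function name find_tx_timeouts and the regular expression are illustrative assumptions based on the message format shown in the trace above, not part of any SUSE tool.

```python
import re

# Matches the transmit-timeout message format shown in the trace, e.g.
# "NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(log_text, driver="e1000e"):
    """Return (interface, queue) tuples for watchdog timeouts of one driver.

    Hypothetical helper: log_text is the text of a dmesg dump or a
    kernel log file that the caller has already read.
    """
    hits = []
    for line in log_text.splitlines():
        match = WATCHDOG_RE.search(line)
        if match and match.group("driver") == driver:
            hits.append((match.group("iface"), int(match.group("queue"))))
    return hits


# Example using the line from the trace in this document.
sample = "[42957.145635] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out"
print(find_tx_timeouts(sample))
```

The function could be fed the saved output of dmesg. An empty list only means that no such message was found in the given text; it does not prove that the system is unaffected.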

Resolution

This issue can be seen on Westmere-based systems.
To avoid it, disable all QPI power management features in the BIOS. On a PRIMERGY RX300 S6 system, the option is "Advanced -> Advanced Processor Options -> QPI L1 Power State". The option might be named differently on other systems.
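
Whether a host has a Westmere-generation CPU can be estimated from /proc/cpuinfo before changing BIOS settings. The sketch below is an assumption-laden illustration: the helper name is_westmere is hypothetical, and the model numbers (family 6, models 0x25, 0x2C and 0x2F) are taken from Intel's family 6 numbering as used in the Linux kernel, not from this article. Under Xen the reported values should be checked in Domain-0.

```python
# Assumption: family 6 model numbers associated with Westmere
# (0x25 Westmere, 0x2C Westmere-EP, 0x2F Westmere-EX).
WESTMERE_MODELS = {0x25, 0x2C, 0x2F}


def is_westmere(cpuinfo_text):
    """Return True if the first CPU described in cpuinfo_text looks like Westmere.

    Hypothetical helper: cpuinfo_text is the content of /proc/cpuinfo.
    """
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Keep only the first occurrence, i.e. the first logical CPU.
        fields.setdefault(key.strip(), value.strip())

    if fields.get("vendor_id") != "GenuineIntel":
        return False
    try:
        family = int(fields["cpu family"])
        model = int(fields["model"])
    except (KeyError, ValueError):
        return False
    return family == 6 and model in WESTMERE_MODELS
```

A possible call is is_westmere(open("/proc/cpuinfo").read()). A positive result only indicates that the system belongs to the affected processor generation; whether QPI L1 power management is enabled still has to be verified in the BIOS setup.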

Cause

Waking up from the QPI L1 power state takes time. Although this does not break any rules, the delay can cause erratic behavior (exhaustion of tokens), which induces data loss on some PCI Express cards.

Disclaimer

This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7010212
  • Creation Date: 24-Feb-2012
  • Modified Date: 28-Sep-2022
    • SUSE Linux Enterprise Server
