Re: Can't ping 169 autoip address with multiple NICs
Roy Marples
Sat Dec 10 02:01:41 2011
Hi
On Fri, 9 Dec 2011 18:57:26 -0600, Dallas Clement wrote:
I've got two Linux hosts each with four NICs. If I directly connect a
single cable from one host to the other and use dhcpc to configure the
interfaces, each interface ends up getting a 169 autoip address. Only
the interface with a cable connected is actually up and running. This
is what I expect.
If I try to ping the other host over the interface which is up and
running, the ping fails. I am selecting a specific interface with
ping -I eth0 for example. I also tried downing the non-cabled
interfaces so that I just have one up and running. That didn't make
any difference.
Oddly enough, arping -I eth0 169.254.XXX.XXX works just fine. It's
just ping that fails.
Now if I reconfigure these same interfaces on each box with say a 192
or 172 address instead of a 169 address, ping is successful. So there
seems to be something peculiar about 169 addresses and multiple NICs.
Is this expected behavior for the scenario described?
I have conducted these tests on Linux 2.6.39.4 kernel with Intel NICs
/ e1000e drivers. So about as generic as it gets.
Firstly, dhcpcd only negotiates a network address when the interface
reports a carrier.
There is a known problem on Linux where E1000 devices don't work well
with carrier up/down messages.
There is also a problem with E1000 where dhcpcd will by default request
an MTU from the DHCP server and, if given one, the E1000 driver will reset
the PHY, triggering a bogus carrier down/up to dhcpcd which results in an
infinite loop.
Now, because you're using IPv4LL to connect the two hosts, that is
normally fine.
But because dhcpcd thinks there is a carrier on all 4 interfaces (hey,
that's what the kernel told dhcpcd via the driver) then they all get the
169.254 address.
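To see what the kernel reports for each interface, something like the
sketch below may help. It reads the Linux sysfs file
/sys/class/net/<iface>/carrier, which holds 1 or 0. This is only a way to
inspect the driver's view from userspace, not how dhcpcd itself detects
carrier. The function name read_carrier and the sysfs_root parameter are
made up for this example.

```python
from pathlib import Path
from typing import Optional


def read_carrier(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[bool]:
    """Return True/False for the carrier state the kernel reports, or None
    if it cannot be read (no such interface, or the interface is
    administratively down, in which case the read fails)."""
    path = Path(sysfs_root) / iface / "carrier"
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    # Interface names are assumptions; adjust to match the host.
    for name in ("eth0", "eth1", "eth2", "eth3"):
        print(name, read_carrier(name))
```

If the uncabled interfaces come back as True here, the driver is claiming
a carrier that does not exist, which matches the behaviour described above.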
The catch is that the interface with the lowest metric wins,
regardless of which one actually has the cable, as dhcpcd thinks they are
all up.
So taking that all into account, the problem is like so:
Each interface has an address on the same subnet, but only one interface
can route to the subnet.
If there is only one subnet route, then it's easy to tell which
interface takes the route, but on Linux it's possible for each interface
to have the same subnet route on a different metric.
In this case, the lowest metric wins.
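As a rough illustration of the lowest metric rule, here is a small model
of route selection. It is a simplified sketch and not the kernel's actual
lookup code: it picks the most specific matching prefix and breaks ties on
the lowest metric. The Route type, the pick_route function, the interface
names and the metric values are all invented for the example.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    dest: str    # destination prefix, e.g. "169.254.0.0/16"
    iface: str   # outgoing interface name
    metric: int  # lower is preferred


def pick_route(routes: List[Route], target: str) -> Optional[Route]:
    """Pick the matching route with the longest prefix, then lowest metric."""
    addr = ipaddress.ip_address(target)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r.dest)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.dest).prefixlen, r.metric),
    )


# Four interfaces, each with the same IPv4LL subnet route on its own metric.
routes = [
    Route("169.254.0.0/16", "eth0", 202),
    Route("169.254.0.0/16", "eth1", 203),
    Route("169.254.0.0/16", "eth2", 204),
    Route("169.254.0.0/16", "eth3", 205),
]

chosen = pick_route(routes, "169.254.10.20")
print(chosen.iface)
```

With these made-up metrics the route via eth0 is chosen, even if the cable
is plugged into eth2.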
You should note, this is behaviour by design because dhcpcd relies on a
working driver, which the E1000 plainly does not have.
Thanks
Roy