dhcpcd-discuss

Re: Can't ping 169 autoip address with multiple NICs

Roy Marples

Sat Dec 10 02:01:41 2011

Hi

On Fri, 9 Dec 2011 18:57:26 -0600, Dallas Clement wrote:
I've got two Linux hosts each with four NICs. If I directly connect a single cable from one host to the other and use dhcpc to configure the interfaces, each interface ends up getting a 169 autoip address. Only the interface with a cable connected is actually up and running. This
is what I expect.

If I try to ping the other host over the interface which is up and
running, the ping fails.  I am selecting a specific interface with
ping -I eth0 for example.  I also tried downing the non-cabled
interfaces so that I just have one up and running.  That didn't make
any difference.

Oddly enough, arping -I eth0 169.254.XXX.XXX works just fine.  It's
just ping that fails.

Now if I reconfigure these same interfaces on each box with say a 192
or 172 address instead of a 169 address, ping is successful. So there
seems to be something peculiar about 169 addresses and multiple NICs.

Is this expected behavior for the scenario described?

I have conducted these tests on Linux 2.6.39.4 kernel with Intel NICs
/ e1000e drivers.  So about as generic as it gets.

Firstly, dhcpcd only negotiates a network address when the interface reports a carrier. There is a known problem on Linux where E1000 devices don't work well with carrier up/down messages. There is also a problem with E1000 where dhcpcd will, by default, request an MTU from the DHCP server and, if one is given, the E1000 driver will reset the PHY, triggering a bogus carrier down/up to dhcpcd, which results in an infinite loop.
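
To see what the kernel itself reports as carrier for each interface, a small check along these lines may help. It is only a sketch: it reads the Linux sysfs file /sys/class/net/<interface>/carrier, which is not necessarily how dhcpcd learns about carrier, and the function name has_carrier and the sysfs_root parameter are made up for this example.

```python
from pathlib import Path


def has_carrier(interface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier on the interface.

    Assumption: Linux sysfs layout, where the 'carrier' file holds
    "1" (link detected) or "0" (no link). Reading it can fail, for
    example when the interface is administratively down or does not
    exist, and that is treated here as "no carrier".
    """
    carrier_file = Path(sysfs_root) / interface / "carrier"
    try:
        return carrier_file.read_text().strip() == "1"
    except OSError:
        return False


if __name__ == "__main__":
    # Interface names are examples only; adjust to the host.
    for name in ("eth0", "eth1", "eth2", "eth3"):
        print(name, "carrier" if has_carrier(name) else "no carrier")
```

If an interface without a cable shows up here as having carrier, the driver is reporting link state wrongly and dhcpcd will act on that.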

Now, you're using IPv4LL to connect the two hosts, which is normally fine. But because dhcpcd thinks there is a carrier on all 4 interfaces (hey, that's what the kernel told dhcpcd via the driver), they all get a 169.254 address. The catch is that the interface with the lowest metric wins, regardless of which one actually has the cable, as dhcpcd thinks they are all up.

So taking all that into account, the problem is as follows:
Each interface has an address on the same subnet, but only one interface can route to the subnet. If there is only one subnet route, then it's easy to tell which interface takes the route, but on Linux it's possible for each interface to have the same subnet route with a different metric.
In this case, the lowest metric wins.
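
The "lowest metric wins" rule can be illustrated with a small model. This is a simplified sketch, not the kernel's actual route lookup: the Route class, the pick_route function, the interface names and the metric values are all invented for the example, and every route is assumed to have the same prefix length so that only the metric decides.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    subnet: str     # destination in CIDR form
    interface: str  # outgoing interface name
    metric: int     # lower is preferred


def pick_route(routes, destination):
    """Return the matching route with the lowest metric, or None."""
    dest = ip_address(destination)
    matching = [r for r in routes if dest in ip_network(r.subnet)]
    # All routes in this model share one prefix length, so the
    # metric alone breaks the tie.
    return min(matching, key=lambda r: r.metric, default=None)


# Hypothetical table: all four interfaces hold the same link-local
# subnet route, but suppose only eth3 has the cable.
routes = [
    Route("169.254.0.0/16", "eth0", 202),
    Route("169.254.0.0/16", "eth1", 203),
    Route("169.254.0.0/16", "eth2", 204),
    Route("169.254.0.0/16", "eth3", 205),
]

chosen = pick_route(routes, "169.254.10.20")
print(chosen.interface)  # eth0, even though the cable is on eth3
```

In this model the traffic leaves via eth0 because its metric is lowest, so packets for the peer never reach the cabled interface.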

You should note, this is behaviour by design because dhcpcd relies on a working driver, which the E1000 plainly does not have.

Thanks

Roy
