dhcpcd-discuss

Re: Can't ping 169 autoip address with multiple NICs

Dallas Clement

Sat Dec 10 03:47:10 2011

Roy, thank you for the speedy and most helpful response. I am indeed seeing at least one extra spurious link down / up from the driver, although it's only for the cabled interface. ifconfig shows the other 3 interfaces as down, as does /sys/class/net/ethX/operstate. I am using the very latest e1000e driver. How is dhcpcd determining carrier presence?
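
For reference, the sysfs check mentioned above can be scripted along these lines. This is only a sketch of what the kernel exposes under /sys/class/net; it is not a description of how dhcpcd itself detects carrier, and the helper name link_state and its sysfs_root parameter are made up for illustration.

```python
from pathlib import Path


def link_state(ifname, sysfs_root="/sys/class/net"):
    """Return (operstate, carrier) for one interface as reported by sysfs.

    link_state and sysfs_root are hypothetical names used for this sketch.
    carrier is True/False when readable, or None when it cannot be read
    (the kernel refuses the read while the interface is administratively down).
    """
    base = Path(sysfs_root) / ifname
    operstate = (base / "operstate").read_text().strip()
    try:
        carrier = (base / "carrier").read_text().strip() == "1"
    except OSError:
        carrier = None
    return operstate, carrier


if __name__ == "__main__":
    # Interface names are assumed; adjust to match the host.
    for name in ("eth0", "eth1", "eth2", "eth3"):
        try:
            print(name, link_state(name))
        except OSError as exc:
            print(name, "not readable:", exc)
```

Comparing this output on all four interfaces against what dhcpcd logs should show whether the driver is reporting a carrier where there is no cable.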

On Dec 9, 2011, at 8:01 PM, Roy Marples <roy@xxxxxxxxxxxx> wrote:

> Hi
> 
> On Fri, 9 Dec 2011 18:57:26 -0600, Dallas Clement wrote:
>> I've got two Linux hosts each with four NICs.  If I directly connect a
>> single cable from one host to the other and use dhcpcd to configure the
>> interfaces, each interface ends up getting a 169 autoip address.  Only
>> the interface with a cable connected is actually up and running.  This
>> is what I expect.
>> 
>> If I try to ping the other host over the interface which is up and
>> running, the ping fails.  I am selecting a specific interface with
>> ping -I eth0 for example.  I also tried downing the non-cabled
>> interfaces so that I just have one up and running.  That didn't make
>> any difference.
>> 
>> Oddly enough, arping -I eth0 169.254.XXX.XXX works just fine.  It's
>> just ping that fails.
>> 
>> Now if I reconfigure these same interfaces on each box with say a 192
>> or 172 address instead of a 169 address, ping is successful.  So there
>> seems to be something peculiar about 169 addresses and multiple NICs.
>> 
>> Is this expected behavior for the scenario described?
>> 
>> I have conducted these tests on Linux 2.6.39.4 kernel with Intel NICs
>> / e1000e drivers.  So about as generic as it gets.
> 
> Firstly, dhcpcd only negotiates a network address when the interface reports a carrier.
> There is a known problem on Linux where E1000 devices don't work well with carrier up/down messages.
> There is also a problem with E1000 where dhcpcd will by default request an MTU from the DHCP server and, if given one, the E1000 driver will reset the PHY, triggering a bogus carrier down/up to dhcpcd, which results in an infinite loop.
> 
> Now, because you're using IPv4LL to connect the two hosts, that is normally fine.
> But because dhcpcd thinks there is a carrier on all 4 interfaces (hey, that's what the kernel told dhcpcd via the driver), they all get a 169.254 address.
> The catch is that the interface with the lowest metric wins, regardless of which one actually has the cable, as dhcpcd thinks they are all up.
> 
> So taking that all into account, the problem is like so:
> Each interface has an address on the same subnet, but only one interface can route to the subnet.
> If there is only one subnet route, then it's easy to tell which interface takes the route, but on Linux it's possible for each interface to have the same subnet route on a different metric.
> In this case, the lowest metric wins.
> 
> You should note, this is behaviour by design because dhcpcd relies on a working driver, which the E1000 plainly does not have.
> 
> Thanks
> 
> Roy
> 
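
The lowest-metric rule described in the quoted reply can be illustrated with a small sketch. It is a simplified model, not kernel or dhcpcd code: the Route type, the pick_route helper, and the metric values are all invented for the example, and it ignores longest-prefix matching because every route here covers the same subnet.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Route:
    """Hypothetical stand-in for one routing table entry."""
    dest: str      # destination subnet in CIDR form
    ifname: str    # outgoing interface
    metric: int    # lower value is preferred


def pick_route(routes, target):
    """Return the matching route with the lowest metric, or None."""
    addr = ip_address(target)
    matches = [r for r in routes if addr in ip_network(r.dest)]
    return min(matches, key=lambda r: r.metric) if matches else None


# Four interfaces, all holding a route to the IPv4LL subnet.
# The metric values are made up; assume only eth0 has a cable.
routes = [
    Route("169.254.0.0/16", "eth0", 204),
    Route("169.254.0.0/16", "eth1", 202),
    Route("169.254.0.0/16", "eth2", 203),
    Route("169.254.0.0/16", "eth3", 205),
]

chosen = pick_route(routes, "169.254.10.20")
print(chosen.ifname)  # eth1, even though the cable is on eth0
```

With these assumed metrics, traffic for the subnet goes out of an interface that has no cable, which matches the failure mode described above.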
