Re: [RFC] Routes are stored as a linked list
Roy Marples
Fri Mar 08 15:37:20 2019
Hi Donald
If you have many applications failing on buffer overflows, why not
increase the size of them via kernel settings?
echo 16777216 > /proc/sys/net/core/rmem_default
echo 16777216 > /proc/sys/net/core/rmem_max
Or do you feel this needs to be set per application?
Roy
On 08/03/2019 14:35, Donald Sharp wrote:
Roy -
Again, thanks for picking up the RB tree changes. It's gonna help us
out tremendously.
I just chose an arbitrarily large value for the receive buffer since
this was just testing to see if the problem goes away; I wasn't
trying to imply anything other than it would be nice to have a tunable
parameter. Everyone's system is different but having the ability to
tune it on dhcpcd startup would allow me to run some tests and see
what a good value is for our system. And if it bothers someone else
they can modify their startup behavior. So imo, just having a knob to
tune it, with the receive buffer not being tuned unless the knob is
used would be more than sufficient from my perspective.
donald
On Fri, Mar 8, 2019 at 8:20 AM Roy Marples <roy@xxxxxxxxxxxx> wrote:
OK, I've now done a little bit more testing on this and committed a few
more changes - this mainly affects a single dhcpcd running on all
interfaces rather than just one, which may not be your use case, but it
is the preferred way of running dhcpcd as it uses fewer resources.
I still have some more test cases to run, but it's looking quite
promising now for being merged into master.
But let's talk some more about the receive socket size.
The linux man page says this:
SO_RCVBUF
Sets or gets the maximum socket receive buffer in bytes. The
kernel doubles this value (to allow space for bookkeeping
overhead) when it is set using setsockopt(2), and this doubled
value is returned by getsockopt(2). The default value is set
by the /proc/sys/net/core/rmem_default file, and the maximum
allowed value is set by the /proc/sys/net/core/rmem_max file.
The minimum (doubled) value for this option is 256.
So you've allocated a 16MB buffer, but the kernel doubles it to 32MB.
Ouch.
But let's not use doubled values for this discussion.
Have you tested with a smaller buffer at all?
The default is about 100k I believe on Linux - does it still overflow if
we, say, increase it to 512kB or 1MB?
Roy
On 05/03/2019 20:02, Donald Sharp wrote:
Mar 05 15:35:52 janelle dhcpcd[20445]: free route list used 4000231 times
Mar 05 15:35:52 janelle dhcpcd[20445]: new routes from free list 3000224
Mar 05 15:35:52 janelle dhcpcd[20445]: dhcpcd exited
So a note on the buffer overruns -> While dhcpcd does indeed have
logic to handle this situation, we have noticed, in general, that
recovery mechanisms are always more expensive than not having to
recover at all, especially as a system becomes more loaded. We have
even seen negative feedback loops where a system never recovers
because multiple daemons are all recovering from netlink socket buffer
overflow. In other words, we do our best to avoid it. Hence the
suggestion to allow the receive buffer to be expanded via the cli so
people sensitive to this situation can tune the behavior of their
system.
donald
On Tue, Mar 5, 2019 at 10:24 AM Roy Marples <roy@xxxxxxxxxxxx> wrote:
On 05/03/2019 15:03, Donald Sharp wrote:
Roy -
I could not figure out how to make debug messages work so I just
changed the logdebugx to logerrx:
Use the -d flag or add the word debug to /etc/dhcpcd.conf
diff --git a/src/route.c b/src/route.c
index dbe2b5a5..6963837d 100644
--- a/src/route.c
+++ b/src/route.c
@@ -258,8 +258,8 @@ rt_dispose(struct dhcpcd_ctx *ctx)
rt_headfree(&ctx->routes);
#ifdef RT_FREE_ROUTE_TABLE
rt_headfree(&ctx->froutes);
- logdebugx("free route list used %zu times", froutes);
- logdebugx("new routes from free list %zu", nroutes);
+ logerrx("free route list used %zu times", froutes);
+ logerrx("new routes from free list %zu", nroutes);
#endif
}
And only see this:
Mar 05 14:41:19 janelle dhcpcd[20151]: free route list used 7 times
Mar 05 14:41:19 janelle dhcpcd[20168]: dummy0: soliciting an IPv6 router
Mar 05 14:41:19 janelle dhcpcd[20151]: new routes from free list 0
On startup. After startup when I install/remove 1 million routes I
never see the message again.
Cause dhcpcd to exit and then check syslog.
You'll see it then.
Another issue that has popped up during testing, now that I am
looking at the syslog, is that I am seeing frequent messages from
dhcpcd that say this:
Mar 05 14:34:40 janelle dhcpcd[19931]: route socket overflowed -
learning interface state
Mar 05 14:34:41 janelle dhcpcd[19931]: route socket overflowed -
learning interface state
during route install/removal.
I added this bit of code:
diff --git a/src/if-linux.c b/src/if-linux.c
index b912c171..4b5dda54 100644
--- a/src/if-linux.c
+++ b/src/if-linux.c
@@ -323,12 +323,15 @@ if_opensockets_os(struct dhcpcd_ctx *ctx)
&on, sizeof(on));
#endif
+	int rcvbufsize = 16 * 1024 * 1024;
+	setsockopt(ctx->link_fd, SOL_SOCKET, SO_RCVBUFFORCE, &rcvbufsize,
+	    sizeof(rcvbufsize));
if ((priv = calloc(1, sizeof(*priv))) == NULL)
return -1;
ctx->priv = priv;
memset(&snl, 0, sizeof(snl));
priv->route_fd = _open_link_socket(&snl, NETLINK_ROUTE);
+	setsockopt(priv->route_fd, SOL_SOCKET, SO_RCVBUFFORCE, &rcvbufsize,
+	    sizeof(rcvbufsize));
if (priv->route_fd == -1)
return -1;
len = sizeof(snl);
and the error message has gone away. Perhaps some cli option is
needed for linux when it is planned to be used in a large scale
routing env?
While the error itself is indeed scary, dhcpcd does have the logic to
work around it.
Each time it happens it will re-sync interfaces and addresses with a
single call to getifaddrs(3).
It doesn't resync routes, but that's not overly important.
Do you only see the message on system boot, or when dhcpcd is actually
running as well?
I'm curious as that's quite a large kernel buffer you're reserving, and
that will be for each instance of dhcpcd.
Roy
Archive administrator: postmaster@marples.name