open_memstream is one of the more important functions added to POSIX libc of late. It matters because it makes generating strings really easy: you no longer need to care about allocating the right amount of memory, as the library does it for you. Now, there are many functions that already help with this, such as asprintf, but that's not standard, and if you want to create many strings in one area you still need to care about the size of the area. You want an area when you have many strings because it's more efficient for malloc, and if you keep the area around and re-use it you avoid memory fragmentation.

Now, to be clear, you have been able to do this since forever using fopen: write to the file, then allocate your area based on the final file size. That still requires some memory management, but more importantly it writes to a file. Writing to a file is slow and reduces the life span of the disk you're writing to. tmpfs has only been a thing fairly recently, and even today not all OSes have /tmp mounted as tmpfs. Requiring it isn't exactly ideal for a generic program - the setup and install should be easy. For all these reasons, most programs worked out the string length needed, allocated an area or string, and then finally wrote the string. However, while this saves the disk, it's also a lot more error prone because you need to work out the length of everything and that's not always trivial, especially for things like a DHCP client, which is always building strings based on the information given by the DHCP server.
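
To make the "work out the length first" approach concrete, here is a minimal sketch of it. The helper name make_env_manual is made up for illustration and is not taken from any real program; it just shows the measure, allocate, write dance that every caller has to get right.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper for illustration only: build "NAME=value" the
 * manual way by measuring first and then allocating.
 * Returns a malloc'd string the caller must free, or NULL on error.
 */
static char *
make_env_manual(const char *name, const char *value)
{
        /* First pass: a zero-sized snprintf only reports the length. */
        int len = snprintf(NULL, 0, "%s=%s", name, value);
        char *s;

        if (len < 0)
                return NULL;

        /* The +1 for the terminating NUL is the classic thing to forget. */
        s = malloc((size_t)len + 1);
        if (s == NULL)
                return NULL;

        /* Second pass: format and arguments must match the first exactly. */
        snprintf(s, (size_t)len + 1, "%s=%s", name, value);
        return s;
}
```

Every string needs its format written twice and its length carried around, and packing many such strings into one area means summing all of those lengths before allocating anything.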

Here's an example of open_memstream in action:

/*
 * Example program which manages an area of environment strings
 * to send to child programs.
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static const char *foo = "foo";
static const char *bar = "bar";

int main(void)
{
        char *argv[] = { "/usr/bin/env", NULL };
        char *buf, *ep, *p, **env, **envp;
        size_t buflen, nenv, i;
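        FILE *fp = open_memstream(&buf, &buflen);

        /* open_memstream returns NULL if it cannot allocate the stream */
        if (fp == NULL) {
                perror("open_memstream");
                return EXIT_FAILURE;
        }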

        fprintf(fp, "FOO=%s", foo);
        fputc('\0', fp);
        fprintf(fp, "BAR=%s", bar);
        fputc('\0', fp);

        /* We could keep fp around as our area and just rewind it. */
        if (fclose(fp) == EOF) {
                perror("fclose");
                return EXIT_FAILURE;
        }

        /* execve relies on a trailing NULL */
        nenv = 1;
        for (p = buf, ep = p + buflen; p < ep; p++) {
                if (*p == '\0')
                        nenv++;
        }

        /* reallocarray(3) should be standard really */
        envp = env = malloc(nenv * sizeof(char *));
        if (env == NULL) {
                perror("malloc");
                return EXIT_FAILURE;
        }
        *envp++ = buf;
        for (p = buf, ep--; p < ep; p++) {
                if (*p == '\0')
                        *envp++ = p + 1;
        }
        *envp = NULL;

        execve(argv[0], argv, env);

        /* execve only returns on failure */
        perror("execve");
        return EXIT_FAILURE;
}

As you can see, we only manage the environment array handed to execve - open_memstream manages our string area and fprintf works out the length each string needs to be for us. This vastly reduces the complexity and increases the security and reliability of creating large environment strings, which most DHCP clients do. We could also write a helper function to write the string AND its trailing NUL terminator to be more efficient. You'll get to see this in dhcpcd-8 which should be released later this year.
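
Such a helper might look something like the sketch below. The name efprintf and its return convention are assumptions made for this example, not necessarily what dhcpcd ends up shipping.

```c
#define _POSIX_C_SOURCE 200809L /* for open_memstream in the caller */

#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper (name and return value are assumptions):
 * format one string into the stream and append its terminating NUL.
 * Returns the number of bytes written including the NUL, or -1 on error.
 */
static int
efprintf(FILE *fp, const char *fmt, ...)
{
        va_list ap;
        int len;

        va_start(ap, fmt);
        len = vfprintf(fp, fmt, ap);
        va_end(ap);
        if (len < 0)
                return -1;

        /* The NUL separates this string from the next one in the area. */
        if (fputc('\0', fp) == EOF)
                return -1;
        return len + 1;
}
```

With this, each fprintf and fputc pair in the example above collapses to a single call such as efprintf(fp, "FOO=%s", foo), and the caller gets one return value to check per string.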
