2017-02-07

Checking whether a Unix filename is safe in C

This blog post explains how to check whether a filename is safe for overwriting the file contents on a Unix system, and it also contains C code to do the checks.

The use case is an archive (e.g. ZIP) extractor that extracts an untrusted archive, creating and overwriting files as it goes. The archive may have been created by someone malicious, trying to trick the extractor into overwriting sensitive system files such as /etc/passwd. Of course this only works if the process running the extractor has permission to modify system files (which it normally doesn't have, because it's not running as root). The most basic protection the extractor can provide is refusing to write to a file with an absolute name (i.e. one starting with /), so that even if it's running as root, it does no harm as long as its current directory isn't the root directory.

However, checking whether the first character is a / isn't good enough, because the attacker may specify a file with name ../.././../../../.././../tmp/../etc/././passwd, which is equivalent to /etc/passwd if the current directory isn't deep enough. Enforcing the following conditions makes sure that such attacks are not possible:

  • The filename isn't empty.
  • The filename doesn't start or end with a /.
  • The strings . and .. are not present as a pathname component (i.e. as an item when the filename is split on /).
  • The filename doesn't contain a double slash: //.

Here is how to check all these in C:

/* Returns 1 if the filename is safe, 0 if it is unsafe. */
int is_filename_safe(const char *p) {
  const char *q;
  for (;;) {
    /* Scan one pathname component: q is its start, p ends up at its end. */
    for (q = p; *p != '\0' && *p != '/'; ++p) {}
    /* In the first iteration: An empty filename is unsafe. */
    /* In the first iteration: A leading '/' is unsafe. */
    /* In subsequent iterations: A trailing '/' is unsafe. */
    /* In subsequent iterations: A "//" is unsafe. */
    if (p == q ||
        /* A pathname component "." is unsafe. */
        /* A pathname component ".." is unsafe. */
        (*q == '.' && (q + 1 == p || (q[1] == '.' && q + 2 == p)))) {
      return 0;  /* Unsafe. */
    }
    if (*p++ == '\0') return 1;  /* Safe. */
  }
}
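
For comparison, the same four rules can also be written more verbosely, one pathname component at a time, using strcspn from the standard library. This is only a sketch for cross-checking the rules listed above: the names is_dot_component and is_filename_safe_verbose are made up for this illustration, and the convention of returning 1 for safe and 0 for unsafe is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the component of length len
 * starting at s is exactly "." or "..", 0 otherwise. */
static int is_dot_component(const char *s, size_t len) {
  return (len == 1 && s[0] == '.') ||
         (len == 2 && s[0] == '.' && s[1] == '.');
}

/* Hypothetical verbose variant: returns 1 if name is safe, 0 if unsafe. */
int is_filename_safe_verbose(const char *name) {
  const char *start = name;
  for (;;) {
    /* Length of the current component, up to the next '/' or the end. */
    size_t len = strcspn(start, "/");
    /* An empty component means: empty filename, leading '/',
     * trailing '/', or a "//" somewhere in the middle. */
    if (len == 0) return 0;
    if (is_dot_component(start, len)) return 0;
    if (start[len] == '\0') return 1;  /* Last component was fine. */
    start += len + 1;  /* Skip the component and the '/' after it. */
  }
}
```

With this sketch, names such as dir/file.txt, .hidden and ... are accepted (the last one is an ordinary filename on Unix), while /etc/passwd, dir/, a//b, a/./b and ../etc/passwd are rejected.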

An even better solution is running the extractor in an isolated sandbox (jail), thus protecting against malicious input and all kinds of software bugs.
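
As a rough illustration of the jail idea, here is a minimal sketch based on chroot(2). The function name enter_jail and its parameters are hypothetical, and the sketch assumes the process starts as root (chroot requires it) and that the caller supplies an unprivileged uid and gid to switch to. A chroot by itself is not a complete sandbox, and a real setup would also drop supplementary groups, which is omitted here.

```c
#define _DEFAULT_SOURCE  /* Makes glibc declare chroot(). */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: confines the process to jail_dir, then drops
 * root privileges. Returns 0 on success, -1 on failure (errno is set
 * by the call that failed). */
int enter_jail(const char *jail_dir, uid_t uid, gid_t gid) {
  if (chdir(jail_dir) != 0) return -1;  /* Move into the jail first. */
  if (chroot(".") != 0) return -1;      /* Needs root privileges. */
  if (chdir("/") != 0) return -1;       /* Now "/" is the jail directory. */
  if (setgid(gid) != 0) return -1;      /* Drop the group before the user. */
  if (setuid(uid) != 0) return -1;      /* After this, no way back to root. */
  return 0;
}
```

The extractor would call this once at startup, before reading any untrusted input, and abort if it returns -1.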