Idempotent hash of a folder using tarballs

If you want an idempotent hash of a folder, that is, one that comes out the same every time for the same content, one way is to hash a tarball of that folder. Be aware, though, that two folders with identical content and file trees can still produce different hashes: a tar archive also records metadata such as modification times, ownership, and permission bits. To get a stable hash, pass tar the flags that ignore or normalize that metadata.
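To see the problem in isolation, here is a small sketch, assuming a POSIX shell with tar and sha256sum available. The demo folder and the naive_hash helper are made up for illustration. It archives the same folder twice and changes only a modification time in between.

```shell
# Hypothetical helper: hash a plain tarball of "$1" with no normalizing flags.
naive_hash() {
    tar -cf - "$1" | sha256sum | cut -d ' ' -f 1
}

workdir="$(mktemp -d)"
cd "$workdir"
mkdir demo
echo "same content" > demo/file.txt

# Pin the modification time, then hash.
touch -t 202001010000 demo/file.txt
first="$(naive_hash demo)"

# Change only the modification time; the content is untouched.
touch -t 202106151200 demo/file.txt
second="$(naive_hash demo)"

if [ "$first" = "$second" ]; then
    echo "hashes match"
else
    echo "hashes differ"
fi
```

The two hashes should differ even though the file content never changed, because the modification time is stored in the tar header.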

If you're hashing a folder inside a git repository, you probably also want to skip files that git doesn't track. git ls-files does that for us, because it lists only tracked files.

git ls-files myfolder \
    | LC_ALL=C sort \
    | TZ=UTC tar -cf myfolder.tar \
      --no-xattrs \
      --no-acls \
      --no-selinux \
      --group=0 \
      --owner=0 \
      --numeric-owner \
      --mtime="UTC 2020-01-01 00:00:00" \
      --mode="a=r,u+w,go-w" \
      -T -

HASH_OF_DIR="$(sha256sum myfolder.tar | cut -d ' ' -f 1)"
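The two steps above can also be combined so that no intermediate myfolder.tar is written. The following is only a sketch under a few assumptions: GNU tar and GNU sort are installed, the function name hash_tracked_files is hypothetical, and the -z / --null handling is an addition of this sketch (not part of the original command) so that unusual file names pass through unchanged.

```shell
# Hypothetical helper: prints a SHA-256 over the git-tracked files under "$1".
# Run it from inside the repository, e.g. from the repository root.
hash_tracked_files() {
    dir="$1"
    # -z makes git emit NUL-terminated, unquoted paths (assumption of this sketch).
    git ls-files -z -- "$dir" \
        | LC_ALL=C sort -z \
        | TZ=UTC tar -cf - \
          --no-xattrs \
          --no-acls \
          --no-selinux \
          --group=0 \
          --owner=0 \
          --numeric-owner \
          --mtime="UTC 2020-01-01 00:00:00" \
          --mode="a=r,u+w,go-w" \
          --null -T - \
        | sha256sum \
        | cut -d ' ' -f 1
}

# Usage:
# HASH_OF_DIR="$(hash_tracked_files myfolder)"
```

Because the archive is streamed straight into sha256sum, nothing has to be cleaned up afterwards. The hash still reflects the working-tree content of the tracked files, so uncommitted edits to tracked files change it.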

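You can also use tar's --sort=name flag instead of the sort command, but make sure that your tar binary supports it (GNU tar added it in version 1.28). The flag controls the order in which tar reads directory entries, so it matters most when you pass tar the folder itself rather than a list of files.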
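As a sketch of the --sort=name variant, the function below passes the folder itself to tar instead of a file list. It assumes GNU tar; the name hash_dir_sorted and the --help probe are illustrative choices, not an established recipe. Unlike the git ls-files version, it includes untracked files and directory entries, and the resulting archive is meant only for hashing, not for extraction.

```shell
# Hypothetical helper: hashes everything under "$1", tracked by git or not.
hash_dir_sorted() {
    dir="$1"
    # Probe for the flag: GNU tar lists --sort in its help text.
    if ! tar --help 2>/dev/null | grep -q -- '--sort'; then
        echo "tar does not appear to support --sort" >&2
        return 1
    fi
    TZ=UTC tar -cf - \
        --sort=name \
        --no-xattrs \
        --no-acls \
        --no-selinux \
        --group=0 \
        --owner=0 \
        --numeric-owner \
        --mtime="UTC 2020-01-01 00:00:00" \
        --mode="a=r,u+w,go-w" \
        "$dir" \
        | sha256sum \
        | cut -d ' ' -f 1
}

# Usage (run from the folder's parent so the stored paths stay the same):
# HASH_OF_DIR="$(hash_dir_sorted myfolder)"
```

The path given to tar is stored in the archive, so the same folder hashed as myfolder and as ./some/path/myfolder gives different results. Running the function from the parent directory keeps the stored names stable.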