Automatically generating usermap and groupmap for rsync

I recently had to restore a server from backup after rebuilding it, and ran into a problem I've only encountered occasionally: the user and group IDs on the new server configuration were different from the ones in the backup. For example, the user "prosody" was user 121 on the old build and 109 on the new one. Just restoring with rsync would leave files with the wrong owners and groups, which could lead to various security issues (and downtime and bugs, of course). From the new server's point of view, the backup was full of mystery-meat owners and groups.

Luckily, I also had a backup of /etc/passwd and /etc/group, and I was able to write a script to work these into an rsync command.

As it happened, before shutting down the server I had decided to take a backup of the root filesystem, not just the data partition. This meant I had an authoritative source of user and group IDs from the old build, and of course the running system had its own user and group IDs. Additionally, rsync has options (--usermap and --groupmap) that specify how to map the user and group IDs on the "sending" side into the user and group IDs to be used on the "receiving" side. It's just a matter of constructing those mappings.

I used cut to extract the name and ID fields from both the old and new /etc/passwd files, sort to put them in the order join requires, and join to match them up by username. A second cut drops the name and leaves old-ID:new-ID pairs, and tr turns the newlines into commas, which is the format rsync wants. Finally, I append a catchall (*:0) to assign anything unrecognized to root. Happily, /etc/group has the same layout in the fields that matter (name first, ID third), so the same code works for groups as for users.

Here's what it ended up looking like:

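# Print "name:id" lines from a passwd- or group-format file, sorted by name.
# Sorting on the name field alone keeps the order consistent with what join expects.
ids_by_name() {
    cut -d: -f1,3 -- "$1" | sort -t: -k1,1
}

# Print "old_id:new_id" pairs for every name found in both files, comma-separated,
# followed by a catchall that maps everything else to ID 0 (root).
id_pairs() {
    join -t: <(ids_by_name "$1") <(ids_by_name "$2") \
        | cut -d: -f2,3 \
        | tr '\n' ','
    echo -n "*:0"
}
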
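# The source path below is a placeholder; substitute the directory being restored.
rsync -savx \
  --usermap="$(id_pairs /mnt/PATH/TO/BACKUP/etc/passwd /etc/passwd)" \
  --groupmap="$(id_pairs /mnt/PATH/TO/BACKUP/etc/group /etc/group)" \
  /mnt/PATH/TO/BACKUP/SOURCE/ /srv/RESTORE/DESTINATION/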
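
To see what the generated mapping actually looks like, here is a minimal sketch that runs the same pipeline against two small, made-up passwd files. The sample users and IDs (apart from prosody's 121 and 109) are invented for illustration. Unlike the functions above, this version uses a temporary file instead of process substitution so that it also runs under plain sh, and it pins LC_ALL=C so that the output order is predictable.

```shell
# Sketch: build an rsync --usermap value from two passwd-style files.
# The sample files and most of their contents are invented for illustration.
workdir=$(mktemp -d)

# "Old" IDs, as they might appear in the backed-up /etc/passwd.
cat > "$workdir/passwd.old" <<'EOF'
root:x:0:0:root:/root:/bin/bash
prosody:x:121:128::/var/lib/prosody:/bin/false
www-data:x:33:33::/var/www:/usr/sbin/nologin
olduser:x:1001:1001::/home/olduser:/bin/bash
EOF

# "New" IDs, as they might appear on the rebuilt server.
cat > "$workdir/passwd.new" <<'EOF'
root:x:0:0:root:/root:/bin/bash
prosody:x:109:115::/var/lib/prosody:/bin/false
www-data:x:33:33::/var/www:/usr/sbin/nologin
EOF

# name:id pairs, sorted on the name field so join accepts them.
ids_by_name() {
    cut -d: -f1,3 -- "$1" | LC_ALL=C sort -t: -k1,1
}

# old_id:new_id for every name present in both files, then the catchall.
id_pairs() {
    ids_by_name "$1" > "$workdir/old.sorted"
    ids_by_name "$2" \
        | LC_ALL=C join -t: "$workdir/old.sorted" - \
        | cut -d: -f2,3 \
        | tr '\n' ','
    printf '%s' '*:0'
}

usermap=$(id_pairs "$workdir/passwd.old" "$workdir/passwd.new")
printf '%s\n' "$usermap"
# prints: 121:109,0:0,33:33,*:0

rm -rf "$workdir"
```

The user "olduser" exists only in the old file, so it gets no explicit pair and falls through to the *:0 catchall, meaning its files would end up owned by root.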

Spot-checks looked good before and after. As a final check before bringing services back up, I also asked for a list of all files with user and group IDs that didn't map to users and groups: find /srv/RESTORE/DESTINATION/ -nouser -o -nogroup. No problems found.
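
A complementary spot-check is to tally which numeric owner and group IDs actually ended up on disk, and compare the list against the new /etc/passwd and /etc/group. The following is only a sketch: it assumes GNU find (for -printf), the helper name owner_summary is made up, and the demo runs on a throwaway temporary directory rather than a real restore destination.

```shell
# Sketch: count files per numeric owner:group pair under a directory.
# Assumes GNU find (for -printf); owner_summary is a hypothetical helper name.
owner_summary() {
    find "$1" -xdev -printf '%U:%G\n' | sort | uniq -c | sort -rn
}

# Demo on a throwaway tree: one directory plus two files, all owned by
# the current user, so the summary should be a single line with a count of 3.
demo=$(mktemp -d)
touch "$demo/a" "$demo/b"
summary=$(owner_summary "$demo")
printf '%s\n' "$summary"
rm -rf "$demo"
```

On a real restore, any owner:group pair in the summary that is unexpected (or a surprisingly large count for 0:0, where the catchall sends unmatched IDs) is worth a closer look.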

Most of my backup scripts focus exclusively on the data, with little attention paid to the metadata. Most of the time, ownership information doesn't matter; the backup is likely to only contain files owned by one user, and chown -R can fix that up very easily. But in this case, backing up the generic "user data" partition meant that it really did matter. I'll be adding /etc/passwd and /etc/group to my standard backup paths in the future!
