If you’ve ever worked with a team of contractors in software development, you may notice some of their (or even your!) commits are made under the incorrect email address or username. If you need to clean things up for compliance or maybe just your own sanity, turns out there’s a fairly easy way to rewrite any of those invalid authors’ bad commits.
Gather your list(s)
Start by checking out all branches of your repository on your local machine — we’ll need to scan them for invalid authors. Here’s a nifty shell script from user octasimo on github that does just that:
#!/bin/bash
for branch in `git branch -a | grep remotes | grep -v HEAD | grep -v master `; do
git branch --track ${branch#remotes/origin/} $branch
done
Once all branches are checked out, we’ll need to scan them for invalid authors. Use this command to get a list of those emails:
git log --all --format='%cE' | sort -u
Now you can sort that list to show only the invalid ones. For example, if my corporate company email is “@company.com” I can use grep to sort out the bad ones like so:
git log --all --format='%cE' | sort -u | grep -v '@company.com'
Now you have a list of all the invalid emails that have been used to commit changes to your repository. From there, make a list determining which proper (corporate) email should have been used.
Pipe the list into the script
For the rewriting of these emails, you can use the script provided by Github themselves. This script asks you for an “old email” (the list of bad emails we found above) and a “correct email” — the email that should have been used to make the commit.
#!/bin/sh
git filter-branch --env-filter '
OLD_EMAIL="[email protected]"
CORRECT_NAME="Your Correct Name"
CORRECT_EMAIL="[email protected]"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
then
export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
then
export GIT_AUTHOR_NAME="$CORRECT_NAME"
export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
' --tag-name-filter cat -- --branches --tags
To use this script, just insert the invalid email as the value of the “OLD_EMAIL” variable, the correct name for this user as the value for the “CORRECT_NAME” variable, and the corporate email as the value for the “CORRECT_EMAIL” variable.
Quick tip: if you have to rewrite multiple emails as I did, you will want to change that first line to git filter-branch -f
Once you’ve run the script for each email that needs to be rewritten, push your changes to the repository with git, and all should be cleaned up!