Apply automated file conversion (transformation) starting at a past revision, preserving history
For a given project (on which I'm currently working exclusively), I'd like to apply a certain automated conversion on some files. For this discussion, let's assume I want to change line endings from Windows style to Unix style on all text files. I want the new format to be effective as of a certain revision in the past, and carried through the history.
What I usually do is the following: I start an interactive rebase and append the following after each commit starting with the desired starting point:
exec <transformation rule>
exec git commit -m "transform" && git revert --no-edit HEAD
This leads to a transform followed by a sequence of original commits, each surrounded with revert and transform, followed by a final revert. Then I execute a second interactive rebase and squash all triplets revert-<original commit>-transform, and remove the last revert. The first transform stays in place, all other transform and revert commits vanish.
This manual process yields (almost) the desired result (I'm not sure about commit dates for the new commits), but I was wondering if it is possible to automate this using filter-branch, fast-import, or perhaps a custom tool that I have missed in my cursory search.
Likely filter-branch is what you want in this case.
Update with a bit more explanation: when you transform the sources the way you described above (when I read it thoroughly I wanted to applaud your ingenuity!), you are treating the series of commits as a sequence of changes, i.e. transitions. That's why you have to introduce fake reverts, which you then squash in the next iteration. That way you avoid conflicts, and this is the part that delighted me most.
In contrast, git filter-branch gives another "view" of the same commit sequence. Each commit in the sequence is treated as a state, not a transition, and the filter you provide changes the state itself. Thus you don't have to think about conflicts at all; they simply can't arise. But of course you have to apply the same transformation to each commit in the sequence.
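To make the "state, not transition" point concrete, here is a minimal sketch in a throwaway repository. The directory, file name and identity settings are made up for the illustration. It prints the first line of a raw commit object, which names a complete tree (a snapshot) rather than a diff; that tree is what a tree filter gets to modify.

```shell
set -e
# Scratch repository; every name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first"

# A commit object points at a full tree (the state of all files),
# so the first line of its raw content starts with "tree".
commit_header=$(git cat-file -p HEAD | head -n 1)
echo "$commit_header"
```

Because every commit carries its own full tree, rewriting one tree never has to be reconciled with the next commit's changes, which is why no revert/squash dance is needed.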
So, to answer your particular question: if you wish to convert newlines from DOS to UNIX, you can simply run a command like this:
git filter-branch --tree-filter 'find . -type f -print0 | xargs -0 perl -pi -e "s,\r\n,\n,;"' HEAD
perl -pi -e "s,\r\n,\n,;" is the command that does the conversion; find and xargs collect all files in the current directory and pass them to perl. Note that find . -type f matches every file, so binary files that happen to contain CRLF byte sequences would be rewritten too; narrow the find expression if that matters for your project. Please also note that git has its own CRLF magic (see man git-config for core.autocrlf and core.safecrlf, and the .gitattributes file), so usually you don't need to handle CRLFs manually: proper configuration is enough to set up a repository suitable for both Windows and Unix-like systems.