Commit 469a3e1

filter-repo (README): separate sections for different tools

Our showing of how to handle the simple example with different tools
combined three different tools into a single section which I think made
it slightly harder to read and follow. It also concentrated almost
exclusively on filter-branch. Provide a separate section for each tool,
and provide more details for BFG and fast-export/fast-import.

Signed-off-by: Elijah Newren <[email protected]>
1 parent 8ba3566 commit 469a3e1

1 file changed: +58 -8 lines changed

README.md

Lines changed: 58 additions & 8 deletions
@@ -26,7 +26,9 @@ history rewriting tools](contrib/filter-repo-demos).
 * [BFG Repo Cleaner](#bfg-repo-cleaner)
 * [Simple example, with comparisons](#simple-example-with-comparisons)
 * [Solving this with filter-repo](#solving-this-with-filter-repo)
-* [Solving this with other filtering tools](#solving-this-with-other-filtering-tools)
+* [Solving this with BFG Repo Cleaner](#solving-this-with-bfg-repo-cleaner)
+* [Solving this with filter-branch](#solving-this-with-filter-branch)
+* [Solving this with fast-export/fast-import](#solving-this-with-fast-exportfast-import)
 * [Design rationale behind filter-repo](#design-rationale-behind-filter-repo)
 * [How do I contribute?](#how-do-i-contribute)
 * [Is there a Code of Conduct?](#is-there-a-code-of-conduct)
@@ -158,14 +160,15 @@ Doing this with filter-repo is as simple as the following command:
 (the single quotes are unnecessary, but make it clearer to a human that we
 are replacing the empty string as a prefix with `my-module-`)
 
-## Solving this with other filtering tools
+## Solving this with BFG Repo Cleaner
 
-By contrast, BFG Repo Cleaner is not capable of this kind of rewrite,
-it would take considerable effort to do this safely with
-fast-export/fast-import (especially if you wanted empty commits pruned
-or commit hashes rewritten), and filter-branch comes with a pile of
-caveats (more on that below) even once you figure out the necessary
-invocation(s):
+BFG Repo Cleaner is not capable of this kind of rewrite; in fact, all
+three types of wanted changes are outside of its capabilities.
+
+## Solving this with filter-branch
+
+filter-branch comes with a pile of caveats (more on that below) even
+once you figure out the necessary invocation(s):
 
 ```shell
 git filter-branch \
@@ -244,6 +247,53 @@ new and old history before pushing somewhere. Other caveats:
 quotes will wreak havoc and likely result in missing files or
 misnamed files)
 
+## Solving this with fast-export/fast-import
+
+One can kind of hack this together with something like:
+
+```shell
+git fast-export --no-data --reencode=yes --mark-tags --fake-missing-tagger \
+--signed-tags=strip --tag-of-filtered-object=rewrite --all \
+| grep -vP '^M [0-9]+ [0-9a-f]+ (?!src/)' \
+| grep -vP '^D (?!src/)' \
+| perl -pe 's%^(M [0-9]+ [0-9a-f]+ )(.*)$%\1my-module/\2%' \
+| perl -pe 's%^(D )(.*)$%\1my-module/\2%' \
+| perl -pe s%refs/tags/%refs/tags/my-module-% \
+| git -c core.ignorecase=false fast-import --force --quiet
+git for-each-ref --format="delete %(refname)" refs/tags/ \
+| grep -v refs/tags/my-module- \
+| git update-ref --stdin
+git reset --hard
+git reflog expire --expire=now --all
+git gc --prune=now
+```
+
+But this comes with some nasty caveats and limitations:
+* The various greps and regex replacements operate on the entire
+fast-export stream and thus might accidentally corrupt unintended
+portions of it, such as commit messages. If you needed to edit
+file contents and thus dropped the --no-data flag, it could also
+end up corrupting file contents.
+* This command assumes all filenames in the repository are composed
+entirely of ascii characters, and also exclude special characters
+such as tabs or double quotes. If such a special filename exists
+within the old src/ directory, it will be pruned even though it
+was intended to be kept. (In slightly different repository
+rewrites, this type of editing also risks corrupting filenames
+with special characters by adding extra double quotes near the end
+of the filename and in some leading directory name.)
+* This command will leave behind huge numbers of useless empty
+commits, and has no realistic way of pruning them. (And if you
+tried to combine this technique with another tool to prune the
+empty commits, then you now have no way to distinguish between
+commits which were made empty by the filtering that you want to
+remove, and commits which were empty before the filtering process
+and which you thus may want to keep.)
+* Commit messages which reference other commits by hash will now
+reference old commits that no longer exist. Attempting to edit
+the commit messages to update them is extraordinarily difficult to
+add to this kind of direct rewrite.
+
 # Design rationale behind filter-repo
 
 None of the existing repository filtering tools did what I wanted;
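
Note on the first two caveats in the new fast-export/fast-import section: the sketch below runs the two `grep -vP` filters from the added command over a small, hand-written stream fragment. Everything in the fragment is hypothetical and simplified (the placeholder hash, the file names, and the omission of the mark, author and committer lines that a real stream carries). `grep -P` assumes GNU grep built with PCRE support, as the command in the diff does.

```shell
#!/bin/sh
# Placeholder object id; with --no-data, fast-export emits a hex id here.
sha=0123456789abcdef0123456789abcdef01234567

# Hypothetical, simplified fast-export fragment.
# "data 40" is the byte length of the two-line commit message that follows.
stream=$(printf '%s\n' \
  'commit refs/heads/master' \
  'data 40' \
  'Remove docs' \
  'D docs/old.txt was obsolete' \
  "M 100644 $sha src/main.c" \
  "M 100644 $sha README.md" \
  "M 100644 $sha \"src/odd\\tname.c\"" \
  'D docs/old.txt')

# The same two filters as in the command from the diff.
filtered=$(printf '%s\n' "$stream" \
  | grep -vP '^M [0-9]+ [0-9a-f]+ (?!src/)' \
  | grep -vP '^D (?!src/)')

# Print what survives the filtering.
printf '%s\n' "$filtered"
```

Under those assumptions only four lines survive. The commit message loses its second line because it happens to start with `D `, while `data 40` still announces 40 bytes. The path containing a tab is emitted in quoted form, so it no longer begins with `src/` and is pruned together with the files that were meant to be removed.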
