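Why does BFG change my latest commit?

Asked: 2015-01-25 05:13:29

Tags: git bfg-repo-cleaner

git filter-branch was taking a very long time. Happily, I found BFG Repo-Cleaner.

But it unexpectedly changed the contents of my latest commit.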

$ git clone --mirror example.com:/repo.git
$ cd repo.git
$ git log HEAD^!
commit 5f737d28756d4854d25899632abffe7cca2c7423
Author: Paul Draper <paul@example.com>
Date:   Sat Jan 24 19:31:47 2015 -0700

    Fix /contact and /folderEntries/listFoldersSimple
$ git diff --stat HEAD^!
 cake/app/controllers/folder_entries_controller.php |     1 +

Now I run the cleanup.

$ java -jar ~/bfg-1.12.0.jar -b 1M
...
In total, 161797 object ids were changed. Full details are logged here:
...
$ git log HEAD^!
commit 3ff700cebe32497423435b416ea11169b7fcbf90
Author: Paul Draper <paul@example.com>
Date:   Sat Jan 24 19:31:47 2015 -0700

    Fix /contact and /folderEntries/listFoldersSimple


    Former-commit-id: 5f737d28756d4854d25899632abffe7cca2c7423
$ git diff --stat HEAD^!
 cake/app/controllers/folder_entries_controller.php |     1 +
 .../lucidchart-tools/caja/ant-jars/guava-r09.jar   |   Bin 0 -> 1141964 bytes
 .../caja/ant-jars/guava-r09.jar.REMOVED.git-id     |     1 -
 cake/app/lucidchart-tools/caja/ant-jars/js.jar     |   Bin 0 -> 1122370 bytes
 .../caja/ant-jars/js.jar.REMOVED.git-id            |     1 -
 .../lucidchart-tools/caja/ant-jars/pluginc-src.jar |   Bin 0 -> 5172676 bytes
 .../caja/ant-jars/pluginc-src.jar.REMOVED.git-id   |     1 -
 .../app/lucidchart-tools/caja/ant-jars/pluginc.jar |   Bin 0 -> 2959487 bytes
 .../caja/ant-jars/pluginc.jar.REMOVED.git-id       |     1 -
 .../lucidchart-tools/caja/ant-jars/xercesImpl.jar  |   Bin 0 -> 1229125 bytes
 .../caja/ant-jars/xercesImpl.jar.REMOVED.git-id    |     1 -
 cake/app/lucidchart-tools/jsdoc/rhino/js.jar       |   Bin 0 -> 1111429 bytes
 .../jsdoc/rhino/js.jar.REMOVED.git-id              |     1 -
 cake/app/lucidchart-tools/selenium/chromedriver    |   Bin 0 -> 5778064 bytes
 .../selenium/chromedriver.REMOVED.git-id           |     1 -
 .../selenium/selenium-server-standalone-2.37.0.jar |   Bin 0 -> 34730734 bytes
 ...ium-server-standalone-2.37.0.jar.REMOVED.git-id |     1 -
 .../selenium-server-standalone-2.42.2-mod.jar      |   Bin 0 -> 34873583 bytes
 ...server-standalone-2.42.2-mod.jar.REMOVED.git-id |     1 -
 .../selenium/selenium-server-standalone-2.42.2.jar |   Bin 0 -> 34823352 bytes
 ...ium-server-standalone-2.42.2.jar.REMOVED.git-id |     1 -
 .../lucidchart-tools/test-runner-1.0-SNAPSHOT.jar  |   Bin 0 -> 9732125 bytes
 .../test-runner-1.0-SNAPSHOT.jar.REMOVED.git-id    |     1 -
 .../CommandLine/Scaffolders/DefaultScaffolder.phar |   Bin 0 -> 4404199 bytes
 .../DefaultScaffolder.phar.REMOVED.git-id          |     1 -
 .../WebPICmdLine/Microsoft.Web.Deployment.dll      |   Bin 0 -> 1201991 bytes
 .../Microsoft.Web.Deployment.dll.REMOVED.git-id    |     1 -
 cake/app/vendors/aws.phar                          |   Bin 0 -> 6784935 bytes
 cake/app/vendors/aws.phar.REMOVED.git-id           |     1 -
 .../tcpdf/fonts/dejavu-fonts-ttf-2.33/status.txt   |  6657 +++++
 .../status.txt.REMOVED.git-id                      |     1 -
 cake/app/vendors/tcpdf/tcpdf.php                   | 28808 +++++++++++++++++++
 cake/app/vendors/tcpdf/tcpdf.php.REMOVED.git-id    |     1 -
 .../img/onboarding-chart/04_shape manager.gif      |   Bin 0 -> 1413721 bytes
 .../04_shape manager.gif.REMOVED.git-id            |     1 -
 cake/app/webroot/img/onboarding-chart/05_share.gif |   Bin 0 -> 1341876 bytes
 .../onboarding-chart/05_share.gif.REMOVED.git-id   |     1 -
 .../js/closure/usage/rhino/javadoc/index-all.html  | 12027 ++++++++
 .../rhino/javadoc/index-all.html.REMOVED.git-id    |     1 -
 cake/app/webroot/js/closure/usage/rhino/js-14.jar  |   Bin 0 -> 1471932 bytes
 .../closure/usage/rhino/js-14.jar.REMOVED.git-id   |     1 -
 cake/app/webroot/js/closure/usage/rhino/js.jar     |   Bin 0 -> 1134765 bytes
 .../js/closure/usage/rhino/js.jar.REMOVED.git-id   |     1 -
 .../js/closure/usage/rhino/testsrc/tests.tar.gz    |   Bin 0 -> 1778543 bytes
 .../rhino/testsrc/tests.tar.gz.REMOVED.git-id      |     1 -
 cake/app/webroot/js/mathquill/font/Symbola.svg     |  5102 ++++
 .../js/mathquill/font/Symbola.svg.REMOVED.git-id   |     1 -
 .../webroot/js/templates/SoyToJsSrcCompiler.jar    |   Bin 0 -> 2154164 bytes
 .../SoyToJsSrcCompiler.jar.REMOVED.git-id          |     1 -
 cake/app/webroot/persona-pages/img/gif-v3.gif      |   Bin 0 -> 1570363 bytes
 .../persona-pages/img/gif-v3.gif.REMOVED.git-id    |     1 -
 .../webroot/persona-pages/img/interactive-gif.gif  |   Bin 0 -> 1434134 bytes
 .../img/interactive-gif.gif.REMOVED.git-id         |     1 -
 cake/build/closure/compiler.jar                    |   Bin 0 -> 6007184 bytes
 cake/build/closure/compiler.jar.REMOVED.git-id     |     1 -
 .../lucidchart-mobile-sliders-landscape-4.png      |   Bin 0 -> 1718536 bytes
 ...t-mobile-sliders-landscape-4.png.REMOVED.git-id |     1 -
 .../lucidchart-mobile-sliders-portrait-4.png       |   Bin 0 -> 1614308 bytes
 ...rt-mobile-sliders-portrait-4.png.REMOVED.git-id |     1 -
 .../Versions/A/OCHamcrestIOS                       |   Bin 0 -> 3671740 bytes
 .../Versions/A/OCHamcrestIOS.REMOVED.git-id        |     1 -
 .../OCMockitoIOS.framework/Versions/A/OCMockitoIOS |   Bin 0 -> 1299132 bytes
 .../Versions/A/OCMockitoIOS.REMOVED.git-id         |     1 -
 .../Versions/A/CrashReporter                       |   Bin 0 -> 1432156 bytes
 .../Versions/A/CrashReporter.REMOVED.git-id        |     1 -
 chart-ios/libFlurry_6.0.0.a                        |   Bin 0 -> 3819300 bytes
 chart-ios/libFlurry_6.0.0.a.REMOVED.git-id         |     1 -
 67 files changed, 52595 insertions(+), 33 deletions(-)

All of these extra files are ones I wanted to remove.

Why were all of these files changed in my most recent commit?

1 Answer:

Answer 0 (score: 6)

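Apparently, this is a common misconception about how BFG works. From the documentation:

  

If something bad (like a 10MB file, when you told the BFG to get rid of everything over 5MB) is in a protected commit, it won't be deleted: it will persist in your repository, even if the BFG deletes it from earlier commits. If you want the BFG to delete something, you need to make sure your current commits are clean.

This does not mean "there is no need to delete it from earlier commits, so it won't be", which is how I read it.

It means "there is no need to delete it from earlier commits, so it may not be". In any case, BFG upholds its guarantee that protected commits keep the same tree.

In my case, it did delete those blobs from earlier commits, but then had to re-add their content in the latest commit to satisfy the requirement that the HEAD tree stay unchanged.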
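
One way to check whether the current commit is "clean" before running BFG is to list the files in HEAD that exceed the size limit, since those are the ones BFG will protect. The following is a minimal sketch, not part of BFG; the function name large_entries is hypothetical, and it assumes its input is the output of git ls-tree -r -l.

```bash
# large_entries LIMIT_BYTES   (hypothetical helper, not a BFG or git command)
# Reads `git ls-tree -r -l <ref>` output on stdin, one entry per line:
#   <mode> <type> <id> <size>TAB<path>
# and prints "<size>TAB<path>" for every blob larger than LIMIT_BYTES.
large_entries() {
    awk -F '\t' -v limit="$1" '{
        # The part before the tab holds mode, type, id and size.
        split($1, meta, " ")
        # Submodule entries have type "commit" and size "-", so skip non-blobs.
        if (meta[2] == "blob" && meta[4] + 0 > limit + 0)
            print meta[4] "\t" $2
    }'
}
```

It could be used as: git ls-tree -r -l HEAD | large_entries $((1024 * 1024)). If this prints anything, one option is to delete those files in a normal (non-mirror) clone, commit and push, and only then run BFG on a fresh mirror clone.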

There is a fuller discussion of this on GitHub.


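EDIT

I found a way to remove large blobs except those present in HEAD.

This uses bash and Unix utilities to find every blob over 1MB (change 1024 * 1024 for a different size) that is not in HEAD, and then uses BFG to remove them: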

# Blob ids over 1MB anywhere in history, minus blob ids present in HEAD.
comm -23 \
    <(git rev-list --objects --all | git cat-file --batch-check="%(objecttype) %(objectname) %(objectsize) %(rest)" | awk '$1 == "blob" && $3 > 1024 * 1024 { print $2 }' | sort) \
    <(git ls-tree -r HEAD | cut -f 1 | cut -d ' ' -f 3 | sort) \
    > /tmp/large-blobs.list
# Strip exactly those blob ids from history.
java -jar bfg-1.12.0.jar -bi /tmp/large-blobs.list
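
Since the list contains only object ids, it may be worth reviewing what they refer to before handing the file to BFG. The following is a rough sketch under stated assumptions: describe_blobs is a hypothetical helper name, and its stdin is expected to be lines of the form "<id> <size> <path>".

```bash
# describe_blobs LIST_FILE   (hypothetical helper, not a BFG or git command)
# Reads "<id> <size> <path>" lines on stdin and prints only those whose id
# is listed in LIST_FILE (one id per line), largest size first.
describe_blobs() {
    # First input (LIST_FILE): remember each id. Second input (stdin): keep matches.
    awk 'NR == FNR { want[$1] = 1; next } ($1 in want)' "$1" - |
        sort -k2,2nr
}
```

It could be fed like this: git rev-list --objects --all | git cat-file --batch-check="%(objectname) %(objectsize) %(rest)" | describe_blobs /tmp/large-blobs.list. Every line printed is a blob that the -bi run would strip, so an unexpected path here is a signal to stop and adjust the list.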