I'm using the following script to list the large blobs in my Git repository:
#!/bin/bash
#set -x
# Shows you the largest objects in your repo's pack file.
# Written for osx.
#
# @see https://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
# @author Antony Stubbs
# set the internal field separator to line break, so that we can iterate easily over the verify-pack output
IFS=$'\n';
# list all objects including their size, sort by size, take top 10
objects=`git verify-pack -v .git/objects/pack/pack-*.idx | grep -v chain | sort -k3nr | head`
echo "All sizes are in kB's. The pack column is the size of the object, compressed, inside the pack file."
output="size,pack,SHA,location"
for y in $objects
do
# extract the size in bytes
size=$((`echo $y | cut -f 5 -d ' '`/1024))
# extract the compressed size in bytes
compressedSize=$((`echo $y | cut -f 6 -d ' '`/1024))
# extract the SHA
sha=`echo $y | cut -f 1 -d ' '`
# find the objects location in the repository tree
other=`git rev-list --all --objects | grep $sha`
#lineBreak=`echo -e "\n"`
output="${output}\n${size},${compressedSize},${other}"
done
echo -e $output | column -t -s ', '
I'm a bit puzzled by this line:
objects=`git verify-pack -v .git/objects/pack/pack-*.idx | grep -v chain | sort -k3nr | head`
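Why grep -v chain (where -v inverts the match)? That way you get commit, blob, and tree objects. But aren't binary files always stored in blob objects? That would mean that, to locate large binary files, you should simply use grep blob instead?
I don't see the point of including tree and commit objects in the result set.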
Answer 0 (score: 0)
grep -v chain discards lines like these:
chain length = 1: 44 objects
chain length = 2: 30 objects
chain length = 3: 15 objects
chain length = 4: 11 objects
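In fact, it is somewhat pointless: for those lines, field 3 (the key selected by -k3nr) is the string " =", whose numeric value is zero. With a descending numeric sort they end up at the bottom, so head drops them anyway unless the output has fewer than ten object lines.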
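To illustrate the point about field 3, here is a minimal sketch that runs the same sort key over a few made-up lines shaped like git verify-pack -v output. The object IDs and sizes are hypothetical sample data, not taken from a real repository.

```shell
#!/bin/bash
# Hypothetical sample shaped like `git verify-pack -v` output
# (IDs and sizes are made up for illustration).
sample='aaaa111 blob   2048 900 12
bbbb222 commit 230 150 912
chain length = 1: 44 objects
cccc333 blob   51200 30000 1062'

# Same key as the script: field 3, numeric, descending.
# On the "chain length" line field 3 is " =", which sort -n treats as 0,
# so that line sinks below every real object line.
sorted=$(printf '%s\n' "$sample" | sort -k3nr)
printf '%s\n' "$sorted"
```

The chain length line comes out last, after the three object lines, which is why filtering it with grep makes no difference once head keeps only the top entries.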
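"But aren't binary files always stored in blob objects? That would mean that, to locate large binary files, you should use grep blob instead?"
Not sure. Or leave out the grep entirely and run it on everything, including the last line, which has the form: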
.git/objects/pack/pack-6a0a97d0239b29f4fef82f52b326317cd0cdd94f.pack: ok
A tree, commit, or tag object is unlikely to make the top ten, and if one does, that is probably interesting in itself.
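If you do want blobs only, one possible approach is to match on the type column rather than grepping the whole line. The sketch below is an assumption-laden illustration, not part of the original script: the function name largest_blobs and its count argument are hypothetical, and it reads verify-pack output on stdin so it can be tried on sample text.

```shell
#!/bin/bash
# largest_blobs (hypothetical helper): read `git verify-pack -v` output on
# stdin and print the N largest blob lines (default 10) by uncompressed size.
largest_blobs() {
    local count="${1:-10}"
    # awk splits on runs of whitespace, so $2 is the object type.
    # Summary lines ("chain length ...", "...pack: ok") never have
    # "blob" as their second field, so they are filtered out as well.
    awk '$2 == "blob"' | sort -k3nr | head -n "$count"
}

# Usage inside a repository (not run here):
#   git verify-pack -v .git/objects/pack/pack-*.idx | largest_blobs 10
```

Comparing the second field exactly avoids accidental matches that a plain grep blob could produce elsewhere on a line, at the cost of hiding any unexpectedly large tree or commit objects.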