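If I have a repository that contains submodules and I want to create a bundle for sneakernet, how do I make the bundle include the objects needed to update the submodules?
For example, suppose I have a repository called parent that contains a submodule called submod. I want to create a bundle containing all recent work since the commit basecommit, so naturally, from inside the parent root directory, I would run: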
git bundle create mybundlefile.bundle basecommit..myworkingbranch
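This creates a file called mybundlefile.bundle containing all the commit objects of the parent repo in the range basecommit..myworkingbranch, as well as the ref myworkingbranch. The problem is that if any of those commits changed the submodule, the resulting bundle is not very useful, because such a commit is stored in the bundle file only as a change to the submodule hash. So the object stored in the bundle file just says "I am commit 3ba024b and I change the hash of submodule submod from 2b941cf to 1bb84ec." But the bundle does not actually include the objects 2b941cf..1bb84ec necessary to update the submodule from the bundle and create a clean working tree for myworkingbranch.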
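How do I create the bundle file so that all of those objects from the submodule repos are also included? That is, if the parent repo's base commit basecommit points submodule submod at hash A, and the parent repo's working branch myworkingbranch points submodule submod at hash B, then my bundle needs to contain not only basecommit..myworkingbranch but also A..B.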
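Answer 0 (score: 3)
How do I create the bundle file so that all of those objects from the submodule repos are also included?
You can't. A bundle file is specific to one Git repository. A submodule is simply a link to another Git repository, so you must create a separate bundle for each separate Git repository. (You can then make an archive out of the various bundles.)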
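It is pretty clear that Git could descend into each submodule and run git bundle in each one, making these various files for you. But the arguments to pass to the sub-git bundle commands are tricky. You will have to write your own script, use git submodule foreach to get it to run in each submodule, and have your script figure out the parameters to use.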
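The "figure out the parameters" step can be sketched as a pure function. This is only a minimal sketch, not something Git provides: the name bundle_ranges and its two input dictionaries (submodule path mapped to the hash the parent records at the base commit and at the branch tip) are assumptions made for illustration. Note also that git bundle only packages named refs, so in a real script the right-hand side of each range would have to be a branch or tag in the submodule that points at B; the bare hashes here only mark the boundaries.

```python
def bundle_ranges(old_pins, new_pins):
    """Map each submodule path to the revision range its bundle should cover.

    old_pins / new_pins are hypothetical dicts of {submodule path: commit hash},
    as recorded by the parent repository at the base commit and at the tip.
    """
    ranges = {}
    for path, new_sha in new_pins.items():
        old_sha = old_pins.get(path)
        if old_sha == new_sha:
            continue  # pin unchanged: nothing to ship for this submodule
        if old_sha is None:
            ranges[path] = new_sha  # newly added submodule: full history up to new_sha
        else:
            ranges[path] = f"{old_sha}..{new_sha}"  # only the objects in A..B
    # Submodules present only in old_pins were removed and need no bundle.
    return ranges


if __name__ == "__main__":
    old = {"submod": "2b941cf", "docs": "aaaaaaa"}
    new = {"submod": "1bb84ec", "docs": "aaaaaaa", "extra": "ccccccc"}
    print(bundle_ranges(old, new))
```

Keeping this step free of Git calls makes the awkward cases (added, removed, and unchanged submodules) easy to check before any bundle is written.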
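You will then probably want to package all the bundles together (into a tar or rar or zip or whatever archive) for transport and for unbundling / fetching. You will want another git submodule foreach during the unbundling, which will probably be even more annoying, since ideally this should use the new set of submodules (after unbundling the top-level one and selecting an appropriate commit).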
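The packaging step could look roughly like the following, using Python's standard tarfile module. It is a sketch under assumptions: the helper names pack_bundles and list_bundles are made up here, and the layout (one .bundle file per repository, stored under a path that mirrors the submodule path) is one possible convention rather than anything Git prescribes.

```python
import os
import tarfile


def pack_bundles(bundle_dir, archive_path):
    """Pack every *.bundle file under bundle_dir into an uncompressed tar."""
    packed = []
    with tarfile.open(archive_path, mode="w") as tar:
        for folder, _subdirs, files in os.walk(bundle_dir):
            for name in sorted(files):
                if not name.endswith(".bundle"):
                    continue
                full_path = os.path.join(folder, name)
                # Store paths relative to bundle_dir so they mirror submodule paths.
                arcname = os.path.relpath(full_path, bundle_dir).replace(os.sep, "/")
                tar.add(full_path, arcname=arcname)
                packed.append(arcname)
    return sorted(packed)


def list_bundles(archive_path):
    """Return the bundle paths stored in the archive, without extracting them."""
    with tarfile.open(archive_path, mode="r") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

The archive is left uncompressed because bundle files already contain compressed pack data, so a second compression pass is unlikely to save much.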
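It's possible that someone has written a script to do this, but if so, I don't know of it. It is not included in Git itself; bundles themselves are somewhat clunky and out of the mainstream.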
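Answer 1 (score: 0)
I wrote github.com/TamaMcGlinn/submodule_bundler to do this. There are a lot of corner cases, and I doubt I have covered all of them. Please try it out, and if any fixes are necessary for your use case, open an issue on the project.
For posterity, I will list all the code in the above project; however, you should just clone directly from github.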
#!/usr/bin/env python3
""" Create bundles for submodules """

import os
import argparse
import subprocess
import tarfile
import submodule_commits
import string
import random
import shutil

parser = argparse.ArgumentParser(description='Create bundles for submodules (recursively), \
to facilitate sneakernet connections. On the online computer, \
a bundle is made for each repository, and then packed into a .tar file. \
On the offline computer, use unbundle.py on the tarfile to unzip and \
pull from the corresponding bundle for each repository.')

parser.add_argument('filename', metavar='filename', type=str, help='file to create e.g. ../my_bundles.tar')
parser.add_argument('commit_range', metavar='[baseline]..[target]', type=str, default='..HEAD', nargs='?',
                    help='commit range of top-level repository to bundle; defaults to everything')

args = parser.parse_args()


class IllegalArgumentError(ValueError):
    pass


try:
    [baseline, target] = args.commit_range.split('..')
except ValueError:
    raise IllegalArgumentError(f"Invalid commit range: '{args.commit_range}': "
                               + "Expected [baseline]..[target]. Baseline and target are optional "
                               + "but the dots are necessary to distinguish between the two.") from None

full_histories = False
from_str = f'from {baseline} '
if baseline == '':
    print("No baseline (all bundles will be complete history bundles)")
    full_histories = True
    from_str = "from scratch "

if target == '':
    target = 'HEAD'

print('Making bundles to update ' + from_str + f'to {target}')

updates_required = {}
new_submodules = {}
bundles = []

for submodule in submodule_commits.submodule_commits('.', target):
    new_submodules[submodule['subdir']] = submodule['commit']

root_dir = os.getcwd()
tar_file_name = os.path.basename(args.filename).split('.')[0]
temp_dir = f'temp_dir_for_{tar_file_name}_bundles'  # note this won't work if that dir already has contents


def create_bundle(submodule_dir, new_commit_sha, baseline_descriptor=''):
    bundle_path_in_temp = f'{submodule_dir}.bundle'
    bundle_path = f'{temp_dir}/{bundle_path_in_temp}'
    if submodule_dir == '.':
        route_to_root = './'
    else:
        route_to_root = (submodule_dir.count('/') + 1) * '../'
    os.makedirs(os.path.dirname(bundle_path), exist_ok=True)
    os.chdir(submodule_dir)
    rev_parse_output = subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD'])
    current_branch = rev_parse_output.decode("utf-8").strip('\n')
    subprocess.run(['git', 'bundle', 'create', route_to_root + bundle_path,
                    f'{baseline_descriptor}{current_branch}', '--tags'])
    bundles.append(bundle_path_in_temp)
    os.chdir(root_dir)


if not full_histories:
    for existing_commit in submodule_commits.submodule_commits('.', baseline):
        baseline_commit = existing_commit['commit']
        submodule_dir = existing_commit['subdir']
        new_commit_sha = new_submodules.pop(submodule_dir, None)
        if new_commit_sha is None:
            # the submodule was removed, don't need to make any bundle
            continue
        if new_commit_sha == baseline_commit:
            # no change, no bundle
            continue
        print(f"Need to update {submodule_dir} from {baseline_commit} to {new_commit_sha}")
        create_bundle(submodule_dir, new_commit_sha, f'{baseline_commit}..')

for submodule_dir, commit_sha in new_submodules.items():
    print(f"New submodule {submodule_dir}")
    bundle_name = f'{submodule_dir}.bundle'
    create_bundle(submodule_dir, commit_sha)

# the bundle of the top-level repository itself is oddly called '..bundle'
# it is impossible to have a submodule that clashes with this
# because you cannot name a directory '.'
baseline_descriptor = ''
if not full_histories:
    baseline_descriptor = f'{baseline}..'
create_bundle('.', target, baseline_descriptor)

print("Packing bundles into tarfile:")
with tarfile.open(args.filename, mode="w:") as tar:  # no compression; git already does that
    os.chdir(temp_dir)
    for bundle in bundles:
        print(bundle)
        tar.add(bundle)
    os.chdir(root_dir)

print("Removing temp directory")
shutil.rmtree(temp_dir)
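The commit-range handling at the top of the script above relies on str.split('..'), which accepts a three-dot range such as a...b and quietly yields the target '.b'. A stricter variant might look like this; split_commit_range is a hypothetical helper name, and rejecting three-dot ranges is a design choice of this sketch, not behavior of the project.

```python
def split_commit_range(text):
    """Split 'baseline..target' into (baseline, target); both sides are optional.

    An empty baseline means "full history"; an empty target defaults to HEAD.
    """
    baseline, dots, target = text.partition("..")
    # Reject input without '..', three-dot ranges ('a...b') and chained ranges ('a..b..c').
    if not dots or target.startswith(".") or ".." in target:
        raise ValueError(f"Expected [baseline]..[target], got {text!r}")
    return baseline, target or "HEAD"


if __name__ == "__main__":
    for example in ("basecommit..myworkingbranch", "..HEAD", "basecommit.."):
        print(example, "->", split_commit_range(example))
```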

#!/usr/bin/env python3
""" Extract bundles for submodules """

import os
import argparse
import shutil
import tarfile
import pullbundle
import submodule_commits
import subprocess

parser = argparse.ArgumentParser(description='Create bundles for submodules (recursively), \
to facilitate sneakernet connections. On the online computer, \
a bundle is made for each repository, and then packed into a .tar file. \
On the offline computer, use unbundle.py on the tarfile to unzip and \
pull from the corresponding bundle for each repository.')

parser.add_argument('filename', metavar='filename', type=str, help='file to create e.g. ../my_bundles.tar')

args = parser.parse_args()

tar_file_name = os.path.basename(args.filename).split('.')[0]
temp_dir = f'temp_dir_for_{tar_file_name}_extraction'

with tarfile.open(args.filename, 'r:') as tar:
    tar.extractall(temp_dir)

root_dir = os.getcwd()


def is_git_repository(dir):
    """ Return true iff dir exists and is a git repository (by checking git rev-parse --show-toplevel) """
    if not os.path.exists(dir):
        return False
    previous_dir = os.getcwd()
    os.chdir(dir)
    rev_parse_toplevel = subprocess.check_output(['git', 'rev-parse', '--show-toplevel'])
    git_dir = rev_parse_toplevel.decode("utf-8").strip('\n')
    current_dir = os.getcwd().replace('\\', '/')
    os.chdir(previous_dir)
    return current_dir == git_dir


pullbundle.pullbundle(f'{temp_dir}/..bundle', True)
for submodule in submodule_commits.submodule_commits():
    subdir = submodule["subdir"]
    commit = submodule["commit"]
    print(f'{subdir} -> {commit}')
    bundle_file_from_root = f'{temp_dir}/{subdir}.bundle'
    if not os.path.isfile(bundle_file_from_root):
        print(f'Skipping submodule {subdir} because there is no bundle')
    else:
        if not is_git_repository(subdir):
            # clone first if the subdir doesn't exist or isn't a git repository yet
            subprocess.run(['git', 'clone', bundle_file_from_root, subdir])
        route_to_root = (subdir.count('/') + 1) * '../'
        bundle_file = f'{route_to_root}{bundle_file_from_root}'
        os.chdir(subdir)
        pullbundle.pullbundle(bundle_file)
        os.chdir(root_dir)

print("Removing temp directory")
shutil.rmtree(temp_dir)

subprocess.run(['git', 'submodule', 'update', '--recursive'])
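The scripts above move between repositories with paired os.chdir calls, which leaves the process in the wrong directory if a command raises in between. One alternative, offered only as a sketch, is to pass cwd= to subprocess so that just the child process changes directory. The helper name run_in is an assumption, and the demo runs the Python interpreter instead of git so that it needs no repository.

```python
import subprocess
import sys


def run_in(directory, argv):
    """Run argv with `directory` as its working directory and return its stdout."""
    completed = subprocess.run(
        argv,
        cwd=directory,           # only the child process changes directory
        check=True,              # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,
    )
    return completed.stdout.decode("utf-8").strip()


if __name__ == "__main__":
    # Stand-in for a command such as ['git', 'rev-parse', '--show-toplevel'].
    print(run_in(".", [sys.executable, "-c", "import os; print(os.getcwd())"]))
```

With this approach, relative bundle paths would need to be made absolute first (for example with os.path.abspath), because the child resolves them against its own working directory.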

#!/usr/bin/env python3
""" Pull from bundles """

import argparse
import subprocess
import re
import os

ref_head_regex = 'refs/heads/(.*)'
head_commit = None


class UnableToFastForwardError(RuntimeError):
    pass


def iterate_branches(bundle_refs):
    """ Given lines of output from 'git bundle unbundle' this writes the HEAD commit to the head_commit global
    and yields each branch, commit pair """
    global head_commit
    for bundle_ref in bundle_refs:
        ref_split = bundle_ref.split()
        commit = ref_split[0]
        ref_name = ref_split[1]
        if ref_name == 'HEAD':
            head_commit = commit
        else:
            match = re.search(ref_head_regex, ref_name)
            if match:
                branch_name = match.group(1)
                yield (branch_name, commit)


def update_branch(branch, commit, check_divergence=False):
    """ Update branch to commit if possible by fast-forward """
    rev_parse_branch_output = subprocess.check_output(['git', 'rev-parse', branch])
    old_commit = rev_parse_branch_output.decode("utf-8").strip('\n')
    if old_commit == commit:
        print(f'Skipping {branch} which is up-to-date at {commit}')
    else:
        rev_parse_current_output = subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD'])
        current_branch = rev_parse_current_output.decode("utf-8").strip('\n')

        returncode = subprocess.call(['git', 'merge-base', '--is-ancestor', branch, commit])
        branch_is_behind_commit = returncode == 0
        if branch_is_behind_commit:
            print(f'Fast-forwarding {branch} from {old_commit} to {commit}')
            if current_branch == branch:
                subprocess.call(['git', 'reset', '--hard', '-q', commit])
            else:
                subprocess.call(['git', 'branch', '-Dq', branch])
                subprocess.run(['git', 'branch', '-q', branch, commit])
        else:
            returncode = subprocess.call(['git', 'merge-base', '--is-ancestor', commit, branch])
            branch_is_ahead_of_commit = returncode == 0
            if branch_is_ahead_of_commit:
                print(f'Skipping {branch} which is at {old_commit}, ahead of bundle version {commit}')
                if current_branch == branch and check_divergence:
                    raise UnableToFastForwardError("Unable to update branch: already ahead of bundle") from None
            else:
                print(f'Error: {branch} already exists, at {old_commit} which diverges from '
                      + f'bundle version at {commit}')
                print('You could switch to the bundle version as follows, but you might lose work.')
                print(f'git checkout -B {branch} {commit}')
                if current_branch == branch and check_divergence:
                    raise UnableToFastForwardError("Unable to update branch: diverged from bundle") from None


def checkout(commit):
    subprocess.run(['git', 'checkout', '-q', '-f', commit])


def pullbundle(bundle_file, check_divergence=False):
    """ Main function; update all branches from given bundle file """
    global head_commit
    head_commit = None

    subprocess.run(['git', 'fetch', bundle_file, '+refs/tags/*:refs/tags/*'], stderr=subprocess.DEVNULL)
    unbundle_output = subprocess.check_output(['git', 'bundle', 'unbundle', bundle_file])
    bundle_refs = filter(None, unbundle_output.decode("utf-8").split('\n'))

    for branch, commit in iterate_branches(bundle_refs):
        returncode = subprocess.call(['git', 'show-ref', '-q', '--heads', branch])
        branch_exists = returncode == 0
        if branch_exists:
            update_branch(branch, commit, check_divergence)
        else:
            print(f'Created {branch} pointing at {commit}')
            subprocess.run(['git', 'branch', branch, commit])
            checkout(commit)

    if head_commit is not None:
        # checkout as detached head without branch
        # note this might not happen; if the bundle updates a bunch of branches
        # then whichever one we were already on is updated already and we don't need to do anything here
        checkout(head_commit)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Update all branches and tags contained in a bundle file')
    parser.add_argument('filename', metavar='filename', help='git bundle file to pull e.g. ../foo.bundle')
    parser.add_argument('-c', '--check_divergence', help="return an errorcode if the current branch was not updated "
                        + "because of already being ahead or having diverged from the bundle version of that branch",
                        action='store_true')
    args = parser.parse_args()
    pullbundle(args.filename, args.check_divergence)
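The ref-parsing step in the script above communicates the HEAD commit through a module-level global. A variant that returns both results can be tested without a repository. This is a sketch only: split_bundle_refs is a hypothetical name, and the input is assumed to be the same "<hash> <refname>" lines that the script reads from git bundle unbundle.

```python
import re

BRANCH_REF = re.compile(r"^refs/heads/(.+)$")


def split_bundle_refs(lines):
    """Return (head_commit, {branch: commit}) from '<hash> <refname>' lines."""
    head_commit = None
    branches = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        commit, ref_name = fields
        if ref_name == "HEAD":
            head_commit = commit
            continue
        match = BRANCH_REF.match(ref_name)
        if match:
            branches[match.group(1)] = commit  # tags and other refs are ignored here
    return head_commit, branches


if __name__ == "__main__":
    sample = ["1bb84ec refs/heads/myworkingbranch", "1bb84ec HEAD", "0123abc refs/tags/v1"]
    print(split_bundle_refs(sample))
```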

#!/usr/bin/env python3
""" Print the commit of each submodule (recursively) at some commit"""

import os
import argparse
import subprocess
import re


def print_submodule_commits(root_subdir, root_commit):
    for result in submodule_commits(root_subdir, root_commit):
        print(f'{result["subdir"]} {result["commit"]}')


def submodule_commits(subdir='.', commit='HEAD', prefix=''):
    is_subdir = subdir != '.'
    if is_subdir:
        previous_dir = os.getcwd()
        os.chdir(subdir)
    git_ls_tree = subprocess.check_output(['git', 'ls-tree', '-r', commit])
    ls_tree_lines = filter(None, git_ls_tree.decode("utf-8").split("\n"))
    submodule_regex = re.compile(r'^[0-9]+\s+commit')
    for line in ls_tree_lines:
        if submodule_regex.match(line):
            line_split = line.split()
            commit_hash = line_split[2]
            subdirectory = line_split[3]
            submodule_prefix = subdirectory
            if prefix != '':
                submodule_prefix = f'{prefix}/{subdirectory}'
            yield {'subdir': submodule_prefix, 'commit': commit_hash}
            yield from submodule_commits(subdirectory, commit_hash, submodule_prefix)
    if is_subdir:
        os.chdir(previous_dir)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Print the commit of each submodule (recursively) at some commit')
    parser.add_argument('commit', metavar='commit_hash', type=str, default='HEAD', nargs='?',
                        help='commit to examine; defaults to HEAD')
    args = parser.parse_args()
    print_submodule_commits('.', args.commit)
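The script above splits each git ls-tree line on whitespace, so a submodule path that contains a space would be cut short. Since ls-tree separates the path from the other fields with a tab, a parser could split on that instead. The sketch below is an illustration rather than the project's code: gitlinks is a hypothetical name, the sample hashes are made up, and it handles a single level of submodules only (no recursion).

```python
def gitlinks(ls_tree_output):
    """Return {path: commit hash} for the submodule entries in `git ls-tree -r` output.

    Each line has the form '<mode> <type> <object>\t<path>'; submodules appear
    with mode 160000 and type 'commit'.
    """
    pins = {}
    for line in ls_tree_output.splitlines():
        meta, tab, path = line.partition("\t")
        fields = meta.split()
        if not tab or len(fields) != 3:
            continue  # not an ls-tree entry
        mode, object_type, object_id = fields
        if mode == "160000" and object_type == "commit":
            pins[path] = object_id
    return pins


if __name__ == "__main__":
    sample = (
        "100644 blob 3b18e51\tREADME.md\n"
        "160000 commit 1bb84ec\tsubmod\n"
        "160000 commit 0a1b2c3\tlibs/my submod\n"
    )
    print(gitlinks(sample))
```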