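I plan to use LibGit2 / LibGit2Sharp, and therefore Git, in an unorthodox way, and I am asking anyone familiar with the API to confirm that what I propose will work in theory. :)
Scenario
Only a master branch will exist in the repository. A large number of directories containing large binary and non-binary files will be tracked and committed. Most of the binary files will change between commits. Due to disk space limitations, the repository should contain no more than 10 commits (the disk fills up quite often now).
What the API does not provide is a function that truncates the commit history, starting at a specified CommitId and going back to the initial commit of the master branch, and deletes any Git objects left dangling as a result.
I have tested the ReferenceCollection.RewriteHistory method, and I can use it to remove the parents from a commit. This gives me a new commit history starting at CommitId and going back to HEAD. However, that still leaves all of the old commits and any references or blobs that are unique to those commits. My plan right now is to clean up these dangling Git objects myself. Does anyone see any problems with this approach, or have a better one?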
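Answer 0 (score: 3)
> However, that still leaves all of the old commits and any references or blobs that are unique to those commits. My plan right now is to clean up these dangling Git objects myself.
When rewriting the history of a repository, LibGit2Sharp takes care not to discard the rewritten references. By default, the namespace under which they are stored is refs/original. This can be changed through the RewriteHistoryOptions parameter.
In order to remove the old commits, trees and blobs, these references first have to be removed. This can be achieved with the following code: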
    foreach (var reference in repo.Refs.FromGlob("refs/original/*"))
    {
        repo.Refs.Remove(reference);
    }
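For readers scripting against the Git command line rather than LibGit2Sharp, the same cleanup of backup references can be sketched as below. This is only an illustration under assumptions: the throwaway repository, the `demo` identity and the single backup ref are all made up for the example, and the ref is created by hand rather than by a real history rewrite.

```shell
#!/bin/sh
set -eu

# Throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "first"

# Simulate a backup ref like the ones a history rewrite leaves under refs/original.
git update-ref refs/original/refs/heads/master HEAD

# Delete every ref in the refs/original namespace.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done

# Count what is left in that namespace.
remaining=$(git for-each-ref refs/original/ | wc -l)
echo "backup refs remaining: $((remaining))"
```

Deleting the refs only makes the old objects unreachable; it does not free any disk space by itself.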
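The next step is to purge the now-dangling Git objects. However, this cannot be done through LibGit2Sharp (yet). One option is to shell out to Git and run the following command:
    git gc --aggressive
This will reduce the size of the repository in a very effective, destructive and non-recoverable way.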
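One caveat when shelling out: by default `git gc` keeps unreachable objects that are younger than its prune grace period (two weeks unless configured otherwise), and commits still listed in a reflog count as reachable. The sketch below expires the reflogs and prunes immediately. The temporary repository, the `demo` identity and the hand-made orphan blob are assumptions made for the illustration, not part of the original setup.

```shell
#!/bin/sh
set -eu

# Throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Write a blob that no commit or ref points to, i.e. a dangling object.
orphan=$(echo "unreferenced payload" | git hash-object -w --stdin)
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "keep"

# Drop reflog entries so they no longer keep old commits reachable,
# then collect garbage without waiting for the grace period.
git reflog expire --expire=now --all
git gc --quiet --prune=now

# Check whether the dangling blob survived.
if git cat-file -e "$orphan" 2>/dev/null; then
    status=present
else
    status=removed
fi
echo "orphan blob: $status"
```

Adding `--aggressive` to the `git gc` call spends more time optimizing the packs; whether an object is pruned at all is governed by the `--prune` setting and the reflogs.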
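> Does anyone see any problems with this approach, or have a better one?
Your approach looks valid.
> Does anyone see any problems with this approach, or have a better one?
If the limit is disk size, another option would be to use a tool such as git-annex or git-bin to store the large binary files outside of the Git repository. See this SO question for some different views on the topic and potential drawbacks (deployment, lock-in, etc.).
> I will try the RewriteHistoryOptions and the foreach code that you provided. However, for now it looks like I will be using File.Delete on the dangling Git objects.
Beware, this may be a bumpy road:
Entries in the .git\objects folder are usually read-only files. File.Delete cannot remove them in this state. You would first have to unset the read-only attribute, for instance with a call to File.SetAttributes(path, FileAttributes.Normal);
Determining which Tree and Blob objects are dangling may turn into a very complex task.
Answer 1 (score: 0)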
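Following the suggestions above, here is the preliminary (still being tested) C# code that I came up with. It truncates the master branch at a specific SHA, creating a new initial commit. It also removes all of the dangling references and blobs.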
    using System;
    using System.Collections.Generic;
    using System.IO;
    using LibGit2Sharp;

    public class RepositoryUtility
    {
        public RepositoryUtility()
        {
        }

        // Returns the path of every entry (files and directories) in the commit's tree.
        public String[] GetPaths(Commit commit)
        {
            List<String> paths = new List<string>();
            RecursivelyGetPaths(paths, commit.Tree);
            return paths.ToArray();
        }

        private void RecursivelyGetPaths(List<String> paths, Tree tree)
        {
            foreach (TreeEntry te in tree)
            {
                paths.Add(te.Path);
                if (te.TargetType == TreeEntryTargetType.Tree)
                {
                    RecursivelyGetPaths(paths, te.Target as Tree);
                }
            }
        }
        public void TruncateCommits(String repositoryPath, Int32 maximumCommitCount)
        {
            // dispose of the repository handle once the truncation is done
            using (var repository = new Repository(repositoryPath))
            {
                Int32 count = 0;
                string newInitialCommitSHA = null;
                // walk from HEAD towards the initial commit; the Nth commit becomes the new root
                foreach (Commit masterCommit in repository.Head.Commits)
                {
                    count++;
                    if (count == maximumCommitCount)
                    {
                        newInitialCommitSHA = masterCommit.Sha;
                    }
                }
                //there must be parent commits to the commit we want to set as the new initial commit
                if (count > maximumCommitCount)
                {
                    TruncateCommits(repository, repositoryPath, newInitialCommitSHA);
                }
            }
        }
        private void RecursivelyCheckTreeItems(Tree tree, Dictionary<String, TreeEntry> treeItems, Dictionary<String, GitObject> gitObjectDeleteList)
        {
            foreach (TreeEntry treeEntry in tree)
            {
                //if the object is not referenced by any commit that is being kept then add it to the deletion list
                if (!treeItems.ContainsKey(treeEntry.Target.Sha))
                {
                    if (!gitObjectDeleteList.ContainsKey(treeEntry.Target.Sha))
                    {
                        gitObjectDeleteList.Add(treeEntry.Target.Sha, treeEntry.Target);
                    }
                }
                if (treeEntry.TargetType == TreeEntryTargetType.Tree)
                {
                    RecursivelyCheckTreeItems(treeEntry.Target as Tree, treeItems, gitObjectDeleteList);
                }
            }
        }

        private void RecursivelyAddTreeItems(Dictionary<String, TreeEntry> treeItems, Tree tree)
        {
            foreach (TreeEntry treeEntry in tree)
            {
                //check for existence because if a file is renamed it can exist under a tree multiple times with the same SHA
                if (!treeItems.ContainsKey(treeEntry.Target.Sha))
                {
                    treeItems.Add(treeEntry.Target.Sha, treeEntry);
                }
                if (treeEntry.TargetType == TreeEntryTargetType.Tree)
                {
                    RecursivelyAddTreeItems(treeItems, treeEntry.Target as Tree);
                }
            }
        }
        private void TruncateCommits(IRepository repository, String repositoryPath, string newInitialCommitSHA)
        {
            //tree entries (keyed by SHA) referenced by the commits that are being kept
            Dictionary<String, TreeEntry> treeItems = new Dictionary<string, TreeEntry>();
            Commit selectedCommit = null;
            Dictionary<String, GitObject> gitObjectDeleteList = new Dictionary<String, GitObject>();
            //loop through the commits starting at the head moving towards the initial commit
            foreach (Commit masterCommit in repository.Head.Commits)
            {
                //if non null then we have already found the commit where we want the truncation to occur
                if (selectedCommit != null)
                {
                    //this commit is older than the truncation point, so add it to our deletion list
                    gitObjectDeleteList.Add(masterCommit.Sha, masterCommit);
                    //check the trees and blobs of this commit to see if they should be deleted
                    RecursivelyCheckTreeItems(masterCommit.Tree, treeItems, gitObjectDeleteList);
                }
                else
                {
                    //have we found the commit that we want to be the initial commit
                    if (String.Equals(masterCommit.Sha, newInitialCommitSHA, StringComparison.CurrentCultureIgnoreCase))
                    {
                        selectedCommit = masterCommit;
                    }
                    //this commit is the new initial commit or newer, so record the tree entries that need to be kept
                    RecursivelyAddTreeItems(treeItems, masterCommit.Tree);
                }
            }
            //this function simply clears out the parents of the new initial commit
            Func<Commit, IEnumerable<Commit>> rewriter = (c) => { return new Commit[0]; };
            //perform the rewrite
            repository.Refs.RewriteHistory(new RewriteHistoryOptions() { CommitParentsRewriter = rewriter }, selectedCommit);
            //clean up the references now in refs/original and remove the commits that they point to
            foreach (var reference in repository.Refs.FromGlob("refs/original/*"))
            {
                repository.Refs.Remove(reference);
                //skip branch reference on file deletion
                if (reference.CanonicalName.IndexOf("master", 0, StringComparison.CurrentCultureIgnoreCase) == -1)
                {
                    //delete the object from the file system
                    DeleteGitBlob(repositoryPath, reference.TargetIdentifier);
                }
            }
            //now remove any tags that reference commits that are going to be deleted in the next step
            foreach (var reference in repository.Refs.FromGlob("refs/tags/*"))
            {
                if (gitObjectDeleteList.ContainsKey(reference.TargetIdentifier))
                {
                    repository.Refs.Remove(reference);
                }
            }
            //remove the commits, trees and blobs from the Git ObjectDatabase
            foreach (KeyValuePair<String, GitObject> kvp in gitObjectDeleteList)
            {
                //delete the object from the file system
                DeleteGitBlob(repositoryPath, kvp.Value.Sha);
            }
        }
        private void DeleteGitBlob(String repositoryPath, String blobSHA)
        {
            String shaDirName = System.IO.Path.Combine(System.IO.Path.Combine(repositoryPath, ".git\\objects"), blobSHA.Substring(0, 2));
            String shaFileName = System.IO.Path.Combine(shaDirName, blobSHA.Substring(2));
            //if the directory exists
            if (System.IO.Directory.Exists(shaDirName))
            {
                //get the files in the directory
                String[] directoryFiles = System.IO.Directory.GetFiles(shaDirName);
                foreach (String directoryFile in directoryFiles)
                {
                    //if we found the file to delete
                    if (String.Equals(shaFileName, directoryFile, StringComparison.CurrentCultureIgnoreCase))
                    {
                        //if readonly set the file to RW
                        FileInfo fi = new FileInfo(shaFileName);
                        if (fi.IsReadOnly)
                        {
                            fi.IsReadOnly = false;
                        }
                        //delete the file
                        File.Delete(shaFileName);
                        //eliminate the directory if only one file existed
                        if (directoryFiles.Length == 1)
                        {
                            System.IO.Directory.Delete(shaDirName);
                        }
                    }
                }
            }
        }
    }
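Thank you for all your help. It is sincerely appreciated. Please note that I edited the original code because it did not take directories into account.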
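A point that may be worth checking before relying on file deletion under .git\objects: that layout only holds loose objects. Once a repository has been repacked, an object can live inside a pack file and have no file of its own. The sketch below shows this on a throwaway repository; the repository, the `demo` identity and `file.txt` are assumptions made purely for the illustration.

```shell
#!/bin/sh
set -eu

# Throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "some tracked content" > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add file"

# Work out where the blob would be stored as a loose object.
blob=$(git rev-parse HEAD:file.txt)
dir=$(printf '%s' "$blob" | cut -c1-2)
rest=$(printf '%s' "$blob" | cut -c3-)
loose_path=".git/objects/$dir/$rest"

if [ -f "$loose_path" ]; then loose_before=yes; else loose_before=no; fi

# Repack: reachable objects move into a pack and the loose copies are removed.
git gc --quiet

if [ -f "$loose_path" ]; then loose_after=yes; else loose_after=no; fi
if git cat-file -e "$blob"; then readable_after=yes; else readable_after=no; fi

echo "loose file before gc: $loose_before"
echo "loose file after gc:  $loose_after"
echo "object still readable: $readable_after"
```

If the repository is never repacked, every object stays loose and a path-based delete can find it; otherwise `git count-objects -v` reports how many objects are loose and how many are in packs.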