快速获取远程svn存储库的所有svn:externals的列表

时间:2012-04-17 00:16:36

标签: svn

我们有一个包含大量目录和文件的svn存储库,我们的构建系统需要能够在检出之前以递归方式查找存储库中的分支的所有svn:externals属性。目前我们使用:

svn propget svn:externals -R http://url.of.repo/Branch

事实证明这非常耗时并且是一个真正的带宽生长。似乎客户端正在接收回购中所有内容的所有道具并在本地进行过滤(尽管我还没有通过wireshark确认这一点)。有更快的方法吗?最好是让服务器只返回所需数据的某种方式。

6 个答案:

答案 0 :(得分:28)

正如您所提到的,它确实消耗了网络带宽。但是,如果您可以访问托管这些存储库的服务器,则可以通过file://协议运行它。它被证明更快,而不是网络消耗。

svn propget svn:externals -R file:///path/to/repo/Branch

此外,如果你有整个工作副本,你也可以在你的WC中运行它。

svn propget svn:externals -R /path/to/WC

希望它能帮助您更快地实现结果!

答案 1 :(得分:2)

我终于想出了一个解决方案。我决定将请求分解为多个小的svn请求,然后让每个请求都由一个线程池运行。这种sms服务器,但在我们的情况下,svn服务器在局域网上,这个查询只在完整版本中进行,因此它似乎不是问题。

import os
import sys
import threading
import ThreadPool

thread_pool = ThreadPool.ThreadPool(8)
externs_dict = {}
externs_lock = threading.Lock()

def getExternRev( path, url ):
    cmd = 'svn info "%s"' % url
    pipe = os.popen(cmd, 'r')
    data = pipe.read().splitlines()

    #Harvest last changed rev
    for line in data:
        if "Last Changed Rev" in line:
            revision = line.split(":")[1].strip()
            externs_lock.acquire()
            externs_dict[path] = (url, revision)
            externs_lock.release()

def getExterns(url, base_dir):
    cmd = 'svn propget svn:externals "%s"' % url
    pipe = os.popen(cmd, 'r')
    data = pipe.read().splitlines()
    pipe.close()

    for line in data:
        if line:
            line = line.split()
            path = base_dir + line[0]
            url = line[1]
            thread_pool.add_task( getExternRev, path, url )

def processDir(url, base_dir):
    thread_pool.add_task( getExterns, url, base_dir )

    cmd = 'svn list "%s"' % url
    pipe = os.popen(cmd, 'r')
    listing = pipe.read().splitlines()
    pipe.close()

    dir_list = []
    for node in listing:
        if node.endswith('/'):
            dir_list.append(node)

    for node in dir_list:
        #externs_data.extend( analyzePath( url + node, base_dir + node ) )
        thread_pool.add_task( processDir, url+node, base_dir+node )

def analyzePath(url, base_dir = ''):
    thread_pool.add_task( processDir, url, base_dir )
    thread_pool.wait_completion()


analyzePath( "http://url/to/repository" )
print externs_dict

答案 2 :(得分:0)

由于-R开关,它很慢;在存储库路径中的所有目录都会以递归方式搜索属性。这是很多工作。

答案 3 :(得分:0)

不理想的解决方案(可能有副作用)而不回答您的问题,但

您可以重写所有外部定义并在一个常见的已知位置添加(重写) - 这样您就可以在更改后消除pg中的递归

答案 4 :(得分:0)

如果您不介意使用Python和pysvn库,这里是一个完整的命令行程序,我用于SVN外部:

"""
@file
@brief SVN externals utilities.
@author Lukasz Matecki
"""
import sys
import os
import pysvn
import argparse

class External(object):

    def __init__(self, parent, remote_loc, local_loc, revision):
        self.parent = parent
        self.remote_loc = remote_loc
        self.local_loc = local_loc
        self.revision = revision

    def __str__(self):
        if self.revision.kind == pysvn.opt_revision_kind.number:
            return """\
Parent:     {0}
Source:     {1}@{2}
Local name: {3}""".format(self.parent, self.remote_loc, self.revision.number, self.local_loc)
        else:
            return """\
Parent:     {0}
Source:     {1} 
Local name: {2}""".format(self.parent, self.remote_loc, self.local_loc)


def find_externals(client, repo_path, external_path=None):
    """
    @brief Find SVN externals.
    @param client (pysvn.Client) The client to use.
    @param repo_path (str) The repository path to analyze.
    @param external_path (str) The URL of the external to find; if omitted, all externals will be searched.
    @returns [External] The list of externals descriptors or empty list if none found.
    """
    repo_root = client.root_url_from_path(repo_path)

    def parse(ext_prop):
        for parent in ext_prop:
            external = ext_prop[parent]
            for line in external.splitlines():
                path, name = line.split()
                path = path.replace("^", repo_root)
                parts = path.split("@")
                if len(parts) > 1:
                    url = parts[0]
                    rev = pysvn.Revision(pysvn.opt_revision_kind.number, int(parts[1]))
                else:
                    url = parts[0]
                    rev = pysvn.Revision(pysvn.opt_revision_kind.head)
                retval = External(parent, url, name, rev)
                if external_path and not external_path == url:
                    continue
                else:
                    yield retval

    for entry in client.ls(repo_path, recurse=True):
        if entry["kind"] == pysvn.node_kind.dir and entry["has_props"] == True:
            externals = client.propget("svn:externals", entry["name"])
            if externals:
                for e in parse(externals):
                    yield e


def check_externals(client, externals_list):
    for i, e in enumerate(externals_list):
        url = e.remote_loc
        rev = e.revision
        try:
            info = client.info2(url, revision=rev, recurse=False)
            props = info[0][1]
            url = props.URL
            print("[{0}] Existing:\n{1}".format(i + 1, "\n".join(["   {0}".format(line) for line in str(e).splitlines()])))
        except:
            print("[{0}] Not found:\n{1}".format(i + 1, "\n".join(["   {0}".format(line) for line in str(e).splitlines()])))

def main(cmdargs):
    parser = argparse.ArgumentParser(description="SVN externals processing.",
                                     formatter_class=argparse.RawDescriptionHelpFormatter,
                                     prefix_chars='-+')

    SUPPORTED_COMMANDS = ("check", "references")

    parser.add_argument(
        "action",
        type=str,
        default="check",
        choices=SUPPORTED_COMMANDS,
        help="""\
the operation to execute:
   'check' to validate all externals in a given location;
   'references' to print all references to a given location""")

    parser.add_argument(
        "url",
        type=str,
        help="the URL to operate on")

    parser.add_argument(
        "--repo", "-r",
        dest="repo",
        type=str,
        default=None,
        help="the repository (or path within) to perform the operation on, if omitted is inferred from url parameter")

    args = parser.parse_args()

    client = pysvn.Client()

    if args.action == "check":
        externals = find_externals(client, args.url)
        check_externals(client, externals)
    elif args.action == "references":
        if args.repo:
            repo_root = args.repo
        else:
            repo_root = client.root_url_from_path(args.url)
        for i, e in enumerate(find_externals(client, repo_root, args.url)):
            print("[{0}] Reference:\n{1}".format(i + 1, "\n".join(["   {0}".format(line) for line in str(e).splitlines()])))

if __name__ == "__main__":
    sys.exit(main(sys.argv))

这应该适用于Python 2和Python 3。 您可以像这样使用它(删除实际地址):

python svn_externals.py references https://~~~~~~~~~~~~~~/cmd_utils.py
[1] Reference:
   Parent:     https://~~~~~~~~~~~~~~/BEFORE_MK2/scripts/utils
   Source:     https://~~~~~~~~~~~~~~/tools/python/cmd_utils.py
   Local name: cmd_utils.py
[2] Reference:
   Parent:     https://~~~~~~~~~~~~~~/VTB-1425_PCU/scripts/utils
   Source:     https://~~~~~~~~~~~~~~/tools/python/cmd_utils.py
   Local name: cmd_utils.py
[3] Reference:
   Parent:     https://~~~~~~~~~~~~~~/scripts/utils
   Source:     https://~~~~~~~~~~~~~~/tools/python/cmd_utils.py
   Local name: cmd_utils.py

至于性能,这很快(尽管我的存储库非常小)。你必须亲自检查一下。

答案 5 :(得分:0)

不确定我在哪里找到这个宝石,但是看到外部有自己的外部非常有用:

Windows:
    svn status . | findstr /R "^X"

Linux/Unix:
    svn status . | grep -E "^X"