Matching paths that are not fully qualified

Time: 2013-09-24 17:23:05

Tags: python python-2.7

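Suppose I have a series of partially qualified paths. They are missing some parts, but they are guaranteed to have two properties:

  1. The last part of the partially qualified path is identical to the last part of the fully qualified path, and
  2. the parts of the partially qualified path appear in the same order as they do in the fully qualified path.

For example,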

    p1 = '/foo/baz/myfile.txt'
    p2 = '/bar/foo/myfile.txt'
    actual = '/foo/bar/baz/myfile.txt'
    

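In this case p1 matches but p2 does not, because in the actual path bar comes after foo. Simple enough: [actual.split('/').index(part) for part in p1.split('/')] gives an ordered list, while the same comprehension with p2 does not.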
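The naive check described above can be sketched as follows. The helper names naive_indexes and is_ordered are made up for illustration and are not part of the question; the sketch only handles paths without repeated parts.

```python
def naive_indexes(partial, actual):
    """Index of the first occurrence of each part of `partial` in `actual`."""
    parts = actual.split('/')
    # list.index always returns the FIRST occurrence, which is why
    # this approach breaks down once a part is repeated.
    return [parts.index(part) for part in partial.split('/')]

def is_ordered(indexes):
    # The leading '' produced by the root slash is always index 0.
    return indexes == sorted(indexes)

actual = '/foo/bar/baz/myfile.txt'
print(is_ordered(naive_indexes('/foo/baz/myfile.txt', actual)))  # True
print(is_ordered(naive_indexes('/bar/foo/myfile.txt', actual)))  # False
```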

But what happens if there are duplicates in the path?

    p1 = '/foo/bar/bar/myfile.txt'
    p2 = '/bar/bar/baz/myfile.txt'
    actual = '/foo/bar/baz/bar/myfile.txt'
    

How can I determine that p1 does match but p2 does not (because although baz comes after the first bar, it does not come after the second bar)?

2 answers:

Answer 0: (score: 1)

Method 1: using list.index

def match(strs, actual):
    seen = {}
    act = actual.split('/')
    for x in strs.split('/'):
        if x in seen:
            # the item was already seen, so start the search
            # just after its previously matched index
            ind = act.index(x, seen[x]+1)
            yield ind
            seen[x] = ind
        else:
            ind = act.index(x)
            yield ind
            seen[x] = ind
...             
>>> p1 = '/foo/baz/myfile.txt'
>>> p2 = '/bar/foo/myfile.txt'
>>> actual = '/foo/bar/baz/myfile.txt'
>>> list(match(p1, actual))     #ordered list, so matched
[0, 1, 3, 4]
>>> list(match(p2, actual))     #unordered list, not matched
[0, 2, 1, 4]

>>> p1 = '/foo/bar/bar/myfile.txt'
>>> p2 = '/bar/bar/baz/myfile.txt'
>>> actual = '/foo/bar/baz/bar/myfile.txt'
>>> list(match(p1, actual))     #ordered list, so matched
[0, 1, 2, 4, 5]
>>> list(match(p2, actual))     #unordered list, not matched
[0, 2, 4, 3, 5]
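Method 1 only yields indexes, so the caller still has to decide whether they are ordered. One way to turn that into a yes/no answer is sketched below. The names match_indexes and is_match are hypothetical, and treating a missing part as "no match" (list.index raises ValueError in that case) is an assumption of this sketch.

```python
def match_indexes(partial, actual):
    # Same idea as Method 1: for a repeated part, resume the search
    # just after the index where that part was matched last time.
    seen = {}
    act = actual.split('/')
    for part in partial.split('/'):
        start = seen[part] + 1 if part in seen else 0
        seen[part] = act.index(part, start)  # ValueError if not found
        yield seen[part]

def is_match(partial, actual):
    try:
        indexes = list(match_indexes(partial, actual))
    except ValueError:
        # A part of `partial` does not occur (often enough) in `actual`.
        return False
    # Matched only if the indexes are strictly increasing.
    return all(a < b for a, b in zip(indexes, indexes[1:]))

actual = '/foo/bar/baz/bar/myfile.txt'
print(is_match('/foo/bar/bar/myfile.txt', actual))  # True
print(is_match('/bar/bar/baz/myfile.txt', actual))  # False
```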

Method 2: using defaultdict and deque

from collections import defaultdict, deque
def match(strs, actual):
    indexes_act = defaultdict(deque)
    for i, k in enumerate(actual.split('/')):
        indexes_act[k].append(i)
    prev = float('-inf')
    for item in strs.split('/'):
        ind = indexes_act[item][0]
        indexes_act[item].popleft()
        if ind > prev:
            yield ind
        else:
            raise ValueError("Invalid string")
        prev = ind

Demo:

>>> p1 = '/foo/baz/myfile.txt'
>>> p2 = '/bar/foo/myfile.txt'
>>> actual = '/foo/bar/baz/myfile.txt'
>>> list(match(p1, actual))
[0, 1, 3, 4]
>>> list(match(p2, actual))
    ...
    raise ValueError("Invalid string")
ValueError: Invalid string

>>> p1 = '/foo/bar/bar/myfile.txt'
>>> p2 = '/bar/bar/baz/myfile.txt'
>>> actual = '/foo/bar/baz/bar/myfile.txt'
>>> list(match(p1, actual))
[0, 1, 2, 4, 5]
>>> list(match(p2, actual))
    ...
    raise ValueError("Invalid string")
ValueError: Invalid string
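One caveat: both methods in this answer take the earliest unused occurrence of each part. A partial path such as '/baz/bar' checked against '/bar/baz/bar' is therefore rejected, even though its parts do appear in order (baz at index 2, then the second bar at index 3). If that case matters, a possible variant is to search only after the previous match. This is a sketch with a hypothetical name, match_after_previous, not the answerer's code; it returns None when there is no match.

```python
def match_after_previous(partial, actual):
    """Return the matched indexes, or None if `partial` does not match."""
    act = actual.split('/')
    pos = 0          # earliest index still allowed for the next part
    indexes = []
    for part in partial.split('/'):
        try:
            ind = act.index(part, pos)
        except ValueError:
            return None  # part not found after the previous match
        indexes.append(ind)
        pos = ind + 1
    return indexes

print(match_after_previous('/baz/bar', '/bar/baz/bar'))
# [0, 2, 3]
print(match_after_previous('/bar/bar/baz/myfile.txt',
                           '/foo/bar/baz/bar/myfile.txt'))
# None
```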

Answer 1: (score: 1)

def match(path, actual):
    path = path.strip('/').split('/')
    actual = iter(actual.strip('/').split('/'))
    for pathitem in path:
        for item in actual:
            if pathitem == item:
                break
        else:
            # The inner loop finished without a break, so pathitem was never found
            return False
    return True

q1 = '/foo/baz/myfile.txt'
q2 = '/bar/foo/myfile.txt'
p1 = '/foo/bar/bar/myfile.txt'
p2 = '/bar/bar/baz/myfile.txt'
actual = '/foo/bar/baz/bar/myfile.txt'

print(match(q1, actual))
# True

print(match(q2, actual))
# False

print(match(p1, actual))
# True

print(match(p2, actual))
# False
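The function above works because actual is a single iterator shared across all passes of the outer loop: each inner search resumes where the previous one stopped, so an earlier part can never be matched again. The same idea can be written more compactly with the in operator, which also consumes an iterator up to and including the first match. This is a sketch of an equivalent form, with is_subsequence as a hypothetical name.

```python
def is_subsequence(path, actual):
    parts = path.strip('/').split('/')
    remaining = iter(actual.strip('/').split('/'))
    # `part in remaining` advances the iterator past the match, so the
    # next part is only searched for in what is left of `actual`.
    return all(part in remaining for part in parts)

actual = '/foo/bar/baz/bar/myfile.txt'
print(is_subsequence('/foo/bar/bar/myfile.txt', actual))  # True
print(is_subsequence('/bar/bar/baz/myfile.txt', actual))  # False
```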