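How to return a matching item from a list in Python

Time: 2016-07-18 06:06:41

Tags: python

I want my function to return a short string from a list if it occurs in another, longer string. How would you do it?

This is what I have come up with so far, but is there a better way to implement func in Python?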

>>> def func(shortStrList, longStr):
...     return shortStrList[[x in longStr for x in shortStrList].index(True)]
...
>>> func(['ABC', 'DEF', 'GHI'], 'PQRABCD')
'ABC'
>>> func(['ABC', 'DEF', 'GHI'], 'DEFPQRACD')
'DEF'

4 answers:

Answer 0 (score: 4)

You can use a generator expression with an if clause:

def func(shortStrList, longStr):
    return next(s for s in shortStrList if s in longStr)
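
One caveat with this form: `next()` raises `StopIteration` when no item matches. A minimal sketch of a variant that passes a default as the second argument to `next()`; the names `first_match` and `default` are illustrative and not part of the answer:

```python
def first_match(short_strs, long_str, default=None):
    """Return the first item of short_strs that occurs in long_str, else default."""
    # The generator is lazy, so the scan stops at the first hit.
    return next((s for s in short_strs if s in long_str), default)


print(first_match(['ABC', 'DEF', 'GHI'], 'PQRABCD'))  # ABC
print(first_match(['ABC', 'DEF', 'GHI'], 'XYZ'))      # None
```

Returning `None` is only one possible choice; a caller that prefers an exception can keep the original one-argument `next()`.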

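Answer 1 (score: 2)

Pulling the answers and comments together, here is a quick performance test of the different approaches (run under Python 2)...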

>>> import time
>>> def timeTest(s, f):
...     t1 = time.clock()
...     for x in xrange(s):
...         f(['ABC', 'DEF', 'GHI'], 'PQRABCD')
...         f(['ABC', 'DEF', 'GHI'], 'PQRACDEF')
...         f(['ABC', 'DEF', 'GHI'], 'PGHIQRCD')
...     t2 = time.clock()
...     print t2 - t1
...
>>>
>>> def func1(shortStrList, longStr):
...     return shortStrList[[x in longStr for x in shortStrList].index(True)]
... 
>>> timeTest(10000000, func1)
18.4710161502
>>>
>>> def func2(shortStrList, longStr):
...     return next(s for s in shortStrList if s in longStr)
... 
>>> timeTest(10000000, func2)
26.1494262581
>>>
>>> def func3(shortStrList, longStr):
...     return filter(lambda x: x in longStr, shortStrList)[0]
...  
>>> timeTest(10000000, func3)
26.1221138429
>>>
>>> def func4(shortStrList, longStr):
...     for s in shortStrList:
...         if s in longStr: return s
...  
>>> timeTest(10000000, func4)
8.78067844999
>>>
>>> def func5(shortStrList, longStr):
...     return [string for string in shortStrList if string in longStr][0]
... 
>>> timeTest(10000000, func5)
12.549210555
>>>

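It looks like the plain loop (func4), as suggested by Ekeyme Mo, is the fastest. (I'm not sure whether it can be rewritten as a one-liner, though.)

If the list of short strings has a different length, a different approach may be preferable. The simple loop is still the fastest, but when the list is long, next() is faster than the list comprehension.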

>>> def timeTest(s, f):
...     sl = ['ABC'] + ['ZXYZ']*50 + ['DEF'] + ['RQDSF']*50 + ['GHI']
...     t1 = time.clock()
...     for x in xrange(s):
...         f(sl, 'PQRABCD')
...         f(sl, 'PQRACDEF')
...         f(sl, 'PGHIQRCD')
...     t2 = time.clock()
...     print t2 - t1
...     
>>> def func1(shortStrList, longStr):
...     return shortStrList[[x in longStr for x in shortStrList].index(True)]
... 
>>> timeTest(100000, func1)
2.14106761862
>>> 
>>> def func2(shortStrList, longStr):
...     return next(s for s in shortStrList if s in longStr)
... 
>>> timeTest(100000, func2)
0.867831158122
>>> 
>>> def func3(shortStrList, longStr):
...     return filter(lambda x: x in longStr, shortStrList)[0]
...     
>>> timeTest(100000, func3)
3.19491244615
>>> 
>>> def func4(shortStrList, longStr):
...     for s in shortStrList:
...         if s in longStr: return s
...         
>>> timeTest(100000, func4)
0.629572839949
>>> 
>>> def func5(shortStrList, longStr):
...     return [string for string in shortStrList if string in longStr][0]
... 
>>> timeTest(100000, func5)
1.31148152449
>>> 
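
The transcripts above rely on Python 2 features (`time.clock`, `xrange`, the `print` statement). For anyone repeating the comparison on Python 3, here is a rough sketch built on `timeit`. The function names, the `None` fallbacks, and the `number` default are assumptions made for this sketch, and absolute timings will differ from the figures quoted above:

```python
import timeit


def func_loop(short_strs, long_str):
    # Plain loop with early return (the func4 approach).
    for s in short_strs:
        if s in long_str:
            return s
    return None


def func_next(short_strs, long_str):
    # Lazy generator consumed by next() (the func2 approach).
    return next((s for s in short_strs if s in long_str), None)


def func_listcomp(short_strs, long_str):
    # Builds the full list of matches first (the func5 approach).
    matches = [s for s in short_strs if s in long_str]
    return matches[0] if matches else None


def time_test(func, short_strs, number=1000):
    """Return the seconds taken by `number` rounds of the three lookups."""
    def run():
        func(short_strs, 'PQRABCD')
        func(short_strs, 'PQRACDEF')
        func(short_strs, 'PGHIQRCD')
    return timeit.timeit(run, number=number)


if __name__ == '__main__':
    long_list = ['ABC'] + ['ZXYZ'] * 50 + ['DEF'] + ['RQDSF'] * 50 + ['GHI']
    for func in (func_loop, func_next, func_listcomp):
        print(func.__name__, time_test(func, long_list))
```

`timeit.timeit` accepts a callable and uses a high-resolution clock, which takes the place of the removed `time.clock`.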

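Answer 2 (score: 0)

As a matter of taste, you can use filter instead of a list comprehension if you prefer.

def func(shortStrList, longStr):
    # Python 2: filter() returns a list, so it can be indexed directly
    return filter(lambda x: x in longStr, shortStrList)[0]
func(['ABC', 'DEF', 'GHI', 'JDSLDF'], 'PQRABCD')
# ABC

Hope that helps :)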
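
On Python 3, `filter()` returns a lazy iterator rather than a list, so indexing it with `[0]` fails. A minimal sketch of the same idea for Python 3, where `first_match_filter` and `default` are illustrative names and not from the answer:

```python
def first_match_filter(short_strs, long_str, default=None):
    """Return the first item of short_strs found in long_str, else default."""
    # filter() is lazy in Python 3; next() pulls the first match and stops there.
    return next(filter(lambda s: s in long_str, short_strs), default)


print(first_match_filter(['ABC', 'DEF', 'GHI', 'JDSLDF'], 'PQRABCD'))  # ABC
```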

Answer 3 (score: 0)

You can keep it simple like this:

def func(shortStrList, longStr):
    try:
        return [string for string in shortStrList if string in longStr][0]
    except IndexError:
        return "No matches found"

Output:

>>> func(['ABC', 'DEF', 'GHI'], 'PQRABCD')
'ABC'
>>> func(['ABC', 'DEF', 'GHI'], 'DEFPQRACD')
'DEF'
>>> func(['ABC', 'DEF', 'GHI'], 'ABCDEF')
'ABC'

You can also do it this way, without a list comprehension. It stops as soon as the first match is found:

def func2(shortStrList, longStr):
    result = ""
    for string in shortStrList:
        if string in longStr:
            result += string
            break
    else:
        return "No matches found"

    return result

Or even like this, which is the simplest way:
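
A hedged sketch, not taken from the answer itself: arguably the simplest form of this lookup is a loop that returns on the first hit and falls through to a fallback value otherwise. The name `first_match_loop` and the `fallback` parameter are hypothetical:

```python
def first_match_loop(short_strs, long_str, fallback="No matches found"):
    for s in short_strs:
        if s in long_str:
            return s  # stop at the first match
    return fallback  # reached only when no item matched


print(first_match_loop(['ABC', 'DEF', 'GHI'], 'ABCDEF'))  # ABC
print(first_match_loop(['ABC', 'DEF', 'GHI'], 'XYZ'))     # No matches found
```

Returning a message string follows the convention used in this answer; returning `None` or raising an exception makes a failed lookup easier to tell apart from a real match.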