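How to find a word in a string

Time: 2013-12-17 11:18:19

Tags: python regex string

I'm searching through a string that contains a line starting with ssid, and I need the word that comes directly after it. For example, given "ssid home", home is the word I want returned. I've done this in a roundabout way that looks very messy. What is the proper way to do this, and is there perhaps a way to tidy up what I have?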

a = """
!
interface blah
a
ssid test
v
v
"""

b = a.split("\n")
matches = [x for x in b if "ssid" in x]
matches = [i.split() for i in matches]
print(matches[0][1])

4 Answers:

Answer 0: (score: 5)

a = """
!
interface blah
a
ssid test1
v
ssid test2
v
ssid test3
"""
import re
p = r'(?<=ssid )\S+'  # the run of non-whitespace characters right after "ssid "
match = re.findall(p, a)

This gives you: ['test1', 'test2', 'test3']
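Since the question says the line starts with ssid, a variant that anchors the pattern to the beginning of each line may be more robust than the lookbehind, which would also match inside a word such as "bssid ". The following is only a sketch: the helper name `find_ssids`, the sample data, and the choice to tolerate leading spaces or tabs are assumptions, not part of the original answer.

```python
import re

# Assumption: "ssid" may be preceded by spaces or tabs, but nothing else, on its line.
SSID_LINE = re.compile(r"^[ \t]*ssid[ \t]+(\S+)", re.MULTILINE)

def find_ssids(text):
    """Return the word following 'ssid' on every line that starts with it."""
    return SSID_LINE.findall(text)

# Made-up sample: the "bssid" line should be ignored.
sample = """
!
interface blah
ssid test1
 bssid other
ssid test2
"""
print(find_ssids(sample))  # ['test1', 'test2']
```

With `re.MULTILINE`, `^` matches at the start of every line rather than only at the start of the string.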

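Answer 1: (score: 3)

Split the string on your key ssid, then, after discarding the first partition, iterate over the remaining partitions, taking only the first word of each and discarding the rest.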

>>> a = """
!
interface blah
a
ssid test1
v
ssid test2
v
ssid test3
"""
>>> [e.split(None, 1)[0] for e in a.split("ssid")[1:]]
['test1', 'test2', 'test3']

A similar regex solution is

>>> import re
>>> re.findall(r"ssid\s+(\w+)", a)
['test1', 'test2', 'test3']
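Note that `\w+` and `\S+` are not interchangeable: `\w+` stops at the first character that is not a letter, digit, or underscore, while `\S+` runs up to the next whitespace. A small sketch with a made-up SSID containing a hyphen shows the difference; which one is right depends on what the real data looks like.

```python
import re

text = "ssid home-net\nssid test2\n"  # made-up sample data

# \w+ stops at the hyphen; \S+ takes everything up to the next whitespace.
word_chars = re.findall(r"ssid\s+(\w+)", text)
non_space = re.findall(r"ssid\s+(\S+)", text)

print(word_chars)  # ['home', 'test2']
print(non_space)   # ['home-net', 'test2']
```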

Answer 2: (score: 1)

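# `a` is the multi-line string from the question
flag, result = False, []
for item in a.split():
    if flag:
        # the previous token was "ssid", so this is the word we want
        result.append(item)
        flag = False
    if item == "ssid":
        flag = True
print(result)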
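The same "take the token that follows ssid" idea can also be written without a flag, by pairing each token with its successor. This is a sketch rather than the answer's own code; the function name `words_after` and its `key` parameter are hypothetical.

```python
def words_after(text, key="ssid"):
    """Return every whitespace-separated token that directly follows `key`."""
    tokens = text.split()
    # zip pairs each token with the one after it: (t0, t1), (t1, t2), ...
    return [nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == key]

sample = "!\ninterface blah\nssid test1\nv\nssid test2\n"  # made-up sample
print(words_after(sample))  # ['test1', 'test2']
```

Like the flag loop, this compares whole tokens, so a token such as "bssid" is not treated as a match.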

Let's do some timing comparisons :)

a = """
!
interface blah
a
ssid test1
v
ssid test2
v
ssid test3
"""
import re
p = r'(?<=ssid )\S+'
def ray(a):
    return re.findall(p, a)

def abhijit(a):
    return [e.split(None, 1)[0] for e in a.split("ssid")[1:]]

def thefourtheye(a):
    flag, result = False, []
    for item in a.split():
        if flag:
            result.append(item)
            flag = False
        if item == "ssid":
            flag = True
            continue
    return result

from timeit import timeit
print("Ray", timeit("ray(a)", "from __main__ import ray, a, p"))
print("Abhijit", timeit("abhijit(a)", "from __main__ import abhijit, a"))
print("thefourtheye", timeit("thefourtheye(a)", "from __main__ import thefourtheye, a"))

Output

Ray 2.4214360714
Abhijit 1.39024496078
thefourtheye 1.11726903915
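Absolute timings like these depend on the machine and the Python version, so it can help to check that the candidates return the same result before comparing them, and to take the best of several runs. A minimal sketch of that, where the function names and the `number` and `repeat` values are illustrative choices rather than anything from the answer:

```python
import re
from timeit import repeat

a = "!\ninterface blah\na\nssid test1\nv\nssid test2\nv\nssid test3\n"
pattern = re.compile(r"(?<=ssid )\S+")

def regex_lookup(text):
    return pattern.findall(text)

def split_lookup(text):
    return [e.split(None, 1)[0] for e in text.split("ssid")[1:]]

# A timing comparison only means something if both functions agree.
assert regex_lookup(a) == split_lookup(a)

for func in (regex_lookup, split_lookup):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(repeat(lambda: func(a), number=10000, repeat=3))
    print(func.__name__, best)
```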

Answer 3: (score: 1)

a = """
!
interface blah
a
ssid test
v
v
"""

for line in a.split("\n"):
   if line.startswith("ssid"):
      result = line.split()[1]
      break

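Using a for loop lets you break out as soon as the matching line is found, instead of checking all the remaining lines. Whether that is worthwhile depends on the expected length of the data.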
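The same early exit can be had with `next()` over a generator expression, which also gives a place to supply a default when no line matches. This is a sketch under a few assumptions: the name `first_ssid` is hypothetical, and unlike `startswith` it compares the first token exactly, so leading whitespace is tolerated and a line like "ssidfoo bar" is skipped.

```python
def first_ssid(text, default=None):
    """Return the word after 'ssid' on the first matching line, else `default`."""
    # Generator: lines are split lazily, so scanning stops at the first match.
    split_lines = (line.split() for line in text.splitlines())
    return next(
        (parts[1] for parts in split_lines if len(parts) > 1 and parts[0] == "ssid"),
        default,
    )

a = "\n!\ninterface blah\na\nssid test\nv\nv\n"
print(first_ssid(a))  # test
```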