Question

I have the following router output stored in a file:

-#- --length-- -----date/time------ path

 3     97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
 4         1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config
 5         1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg

I want to search the file for the substring 'Golden_Image' and get the complete path. So here, the required output would be this string:

taas/NN41_R11_Golden_Image

First attempt:

import re 
with open("outlog.out") as f:
    for line in f:
         if  "Golden_Image" in  line:
            print(line)

Output:

3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image

Second attempt:

import re
hand = open('outlog.out')
for line in hand:
    line = line.rstrip()
    x = re.findall('.*?Golden_Image.*?',line)
    if len(x) > 0:
         print x

Output:

['3 97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image']

Neither of these gives the required output. How can I fix this?

Answer 1

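This is actually quite fiddly if the path can contain spaces. You need to use the maxsplit argument of split to isolate the path field.

with open("outlog.out") as f:
    for line in f:
        # Seven fields precede the path, so split at most 7 times;
        # everything after that stays together as the path.
        fields = line.split(None, 7)
        if len(fields) == 8 and "Golden_Image" in fields[7]:
            print(fields[7].rstrip())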
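To see why maxsplit matters, here is a minimal sketch. It assumes every listing line has exactly seven whitespace-separated fields before the path (index, length, month, day, year, time, UTC offset). The helper name `path_field` and the sample line with a space in its path are made up for illustration.

```python
def path_field(line, fields_before_path=7):
    """Return the path column of a listing line, or None for other lines.

    Assumes the path is preceded by exactly `fields_before_path`
    whitespace-separated fields (7 is an assumption about this listing).
    """
    parts = line.split(None, fields_before_path)
    if len(parts) <= fields_before_path:
        return None  # header or blank line: no path column
    # With maxsplit, the last element keeps its trailing whitespace.
    return parts[-1].rstrip()


# Hypothetical line whose path contains a space.
sample = " 3     97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41 R11_Golden_Image\n"
print(path_field(sample))   # the path is kept whole
print(sample.split()[-1])   # a plain split keeps only the last word
```

A plain `split()` would cut such a path at the space, which is the case the maxsplit argument guards against.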

Answer 2

Split the line and check whether the string "Golden_Image" is present in any of the split parts.

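import re
with open("outlog.out") as f:
    for line in f:
        # Skip lines that do not mention the substring at all
        if "Golden_Image" not in line:
            continue
        # \S* on both sides extends the match to the whole whitespace-free token
        print(re.search(r'\S*Golden_Image\S*', line).group())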

Or

images = re.findall(r'\S*Golden_Image\S*', open("outlog.out").read())

Example:

>>> s = '''
-#- --length-- -----date/time------ path

 3     97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
 4         1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config
 5         1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg'''.splitlines()
>>> for line in s:
    for i in line.split():
        if  "Golden_Image" in i:
            print(i)


taas/NN41_R11_Golden_Image
>>>
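The regex variant can be wrapped in a small function so the search term is not hard-coded. This is only a sketch: the name `find_paths` is hypothetical, and it assumes paths contain no whitespace, since `\S*` stops at the first space.

```python
import re


def find_paths(text, needle):
    """Return every whitespace-free token in `text` that contains `needle`."""
    # re.escape keeps characters in `needle` from being read as regex syntax.
    pattern = r"\S*" + re.escape(needle) + r"\S*"
    return re.findall(pattern, text)


listing = """\
-#- --length-- -----date/time------ path

 3     97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
 4         1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config
 5         1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg
"""

print(find_paths(listing, "Golden_Image"))
```

For the sample listing this yields a one-element list holding the image path; a term that appears nowhere yields an empty list.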

Answer 3

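Reading the complete content at once and then searching it is not efficient. Instead, the file can be read line by line, and if a line meets the criteria, the path can be extracted with a regex, without any further splitting.

Use the following regex to get the path: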

\s+(?=\S*$).*

Link: https://regex101.com/r/zuH0Zv/1

Here is working code:

import re

# Match the whitespace that precedes the last whitespace-free token,
# plus that token itself (the path column).
regex = r"\s+(?=\S*$).*"
test_str = "3     97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image"
matches = re.search(regex, test_str)
print(matches.group().strip())
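The snippet above works on a single hard-coded string. A line-by-line version that stops at the first hit might look like the sketch below. The function name `first_path` is hypothetical, `io.StringIO` stands in for the real file, and the substring check runs against the whole line rather than just the path column.

```python
import io
import re

# Same pattern as above: whitespace followed by the last whitespace-free token.
PATH_RE = re.compile(r"\s+(?=\S*$).*")


def first_path(lines, needle):
    """Return the path from the first line containing `needle`, else None."""
    for line in lines:
        if needle in line:
            match = PATH_RE.search(line.rstrip())
            if match:
                return match.group().strip()
    return None


listing = """\
-#- --length-- -----date/time------ path

 3     97103164 Feb 7 2016 01:36:16 +05:30 taas/NN41_R11_Golden_Image
 4         1896 Sep 27 2019 14:22:08 +05:30 taas/NN41_R11_Golden_Config
 5         1876 Nov 27 2017 20:07:50 +05:30 taas/nfast_default.cfg
"""

# An open file object can be passed in place of the StringIO stand-in.
print(first_path(io.StringIO(listing), "Golden_Image"))
```

Returning from inside the loop means the rest of the file is never read once a match is found.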

Answer 4

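Following your code, if you only want to get the right output, you can do it more simply:

with open("outlog.out") as f:
    for line in f:
        if "Golden_Image" in line:
            # split() with no argument also drops the trailing newline
            print(line.split()[-1])

The output is:

taas/NN41_R11_Golden_Image

PS: If you need more complex operations, you may want to try the re module, as in @Avinash Raj's answer.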

Python: how to extract a string from a file - only once
