Extracting quoted text while reading a file

Time: 2015-01-24 16:22:15

Tags: python

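I am reading a file line by line and want to pull out specific pieces of text. I look for a keyword in the line and, once it is found, read the line character by character. In C/C++ I would just throw the string into a for loop and iterate over it.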

Here is my code so far.

i = 0

with open("test.txt") as f:
    for line in f:
        if "test" in line:
            for character in line:
                if character == "\"":
                    # TODO: append all characters to a string until the 2nd quote is seen
                    pass

Any ideas?

1 answer:

Answer 0: (score: 1)

Try this:

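in_string = False      # True while we are between an opening and a closing quote
current_string = ""    # characters collected since the opening quote
strings = []           # every completed quoted string found so far

with open("test.txt") as f:
    for line in f:
        if "test" in line:
            for character in line:
                if character == '"':
                    if in_string:
                        # Closing quote: store what was collected.
                        strings.append(current_string)
                    # Toggle the state and start a fresh buffer.
                    in_string = not in_string
                    current_string = ""
                elif in_string:
                    current_string += character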

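It goes through every character in the line. Each time it sees a `"`, it either starts collecting the characters that follow into a string, or, if it was already collecting, stops and appends the collected string to the list.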
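The same toggle logic can be wrapped in a function so it is easy to try without a file. This is only a sketch: the name `extract_quoted` and the sample lines are made up for illustration, and unlike the loop above it resets its state for every line, so an unterminated quote is simply dropped instead of carrying over to the next matching line.

```python
def extract_quoted(line):
    """Return the substrings of `line` enclosed in double quotes."""
    strings = []
    current = []       # characters collected since the opening quote
    in_string = False  # True while we are inside a pair of quotes
    for character in line:
        if character == '"':
            if in_string:
                # Closing quote: store the collected text.
                strings.append("".join(current))
            in_string = not in_string
            current = []
        elif in_string:
            current.append(character)
    return strings


# Hypothetical sample input standing in for the lines of test.txt.
sample_lines = [
    'test name="alpha" value="beta"\n',
    'no keyword here "ignored"\n',
    'test with an "unterminated quote\n',
]

results = []
for line in sample_lines:
    if "test" in line:
        results.extend(extract_quoted(line))

print(results)  # ['alpha', 'beta']
```

Collecting characters in a list and joining them once avoids building many intermediate strings, though for short lines the difference is negligible.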

Alternatively, use a regular expression:

import re
strings = []

with open("test.txt") as f:
    for line in f:
        if "test" in line:
            strings.extend(re.findall(r'"(.*?)"', line, re.DOTALL))
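
To see what the pattern matches, here is a small check on in-memory lines rather than a file. The sample text and the names `QUOTED` and `found` are assumptions made for illustration. The non-greedy `.*?` is what keeps each match inside one pair of quotes; a greedy `.*` would run from the first quote on the line to the last one.

```python
import re

# Non-greedy: capture as little as possible between two double quotes.
QUOTED = re.compile(r'"(.*?)"')

# Hypothetical sample input standing in for the lines of test.txt.
sample_lines = [
    'test a="x" b="y z"\n',
    'skipped "because the keyword is missing"\n',
]

found = []
for line in sample_lines:
    if "test" in line:
        found.extend(QUOTED.findall(line))

print(found)  # ['x', 'y z']

# For comparison, the greedy version swallows everything up to the last quote.
greedy = re.findall(r'"(.*)"', 'test a="x" b="y z"')
print(greedy)  # ['x" b="y z']
```

Neither version understands escaped quotes such as `\"` inside a string; if the file can contain those, the pattern would need to be extended.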