Question

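I have two files (a.txt and shell.txt).

a.txt has 59 lines; I extract their domains with a regular expression.

shell.txt has 5881 lines.

The domains in a.txt also appear in shell.txt. Whenever a domain from a.txt is found in a shell.txt line, I want to extract that whole shell.txt line.

Unfortunately my loop isn't working properly, so I'd like your help.

Thanks.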

import re

s1 = open('a.txt', 'r').read().splitlines()
s2 = open('shell.txt', 'r').read().splitlines()


for x in s1:

    c1 = re.findall("\/\/(.*)\/",x.split("|")[0])[0]

    for x2 in s2:

        c2 = re.findall("\/\/(.*)\/",x2.split("|")[2])

        if c1 == c2:

            print x2

Answer 1

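First, try not to run regular expressions inside a loop. Instead, call findall directly on the full contents of s1 and s2 (without splitlines()). The resulting c1 and c2 will then be lists.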

To find the intersection of the two lists, I would just use sets:

intersects = set(c1).intersection(set(c2))
for intersect in intersects:
    print intersect
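
As a small illustration of the set approach, here is a sketch with made-up domain lists standing in for the findall results (the sample values are hypothetical, not taken from the asker's files):

```python
# Hypothetical sample data standing in for the two findall results.
c1 = ["example.com", "foo.org", "bar.net"]
c2 = ["foo.org", "baz.io", "example.com"]

# intersection() accepts any iterable, so c2 need not be converted first.
intersects = set(c1).intersection(c2)

# Sets are unordered, so sort before printing for a stable result.
for intersect in sorted(intersects):
    print(intersect)
```

Note that this yields only the shared domains, not the shell.txt lines they came from.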

If you need help building the regular expression, I'll need to know more about the files and what you want to extract.

Edit:

For the regular expressions, this might work:

regex1 = r"^[^|]*\/\/([^|]*)\/"
c1 = re.findall(regex1, s1, re.M)
regex2 = r"^[^|]*(?:\|[^|]*){2}\/\/([^|]*)\/"
c2 = re.findall(regex2, s2, re.M)
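
Since the goal is the whole shell.txt line rather than just the domain, one possible way to combine the ideas is sketched below. In the question's loop, c1 is a string (findall(...)[0]) while c2 is a list, so c1 == c2 can never be true. The sketch assumes the files are pipe-separated, with the URL in field 0 of a.txt and field 2 of shell.txt (as the split("|") calls in the question suggest), and that the domain is the text between "//" and the next "/". The names DOMAIN_RE, domain_of and matching_lines are invented for this example, and the sample lines are hypothetical.

```python
import re

# Assumption: the domain is whatever sits between "//" and the next "/" or "|".
DOMAIN_RE = re.compile(r"//([^/|]+)")


def domain_of(line, field):
    """Return the domain found in the given pipe-separated field, or None."""
    parts = line.split("|")
    if field >= len(parts):
        return None
    match = DOMAIN_RE.search(parts[field])
    return match.group(1) if match else None


def matching_lines(a_lines, shell_lines):
    """Return the shell lines whose domain also appears in a_lines."""
    # Build the lookup set once, so each shell line costs one set lookup
    # instead of a pass over every line of a.txt.
    wanted = set()
    for line in a_lines:
        domain = domain_of(line, 0)
        if domain is not None:
            wanted.add(domain)
    return [line for line in shell_lines if domain_of(line, 2) in wanted]


# Hypothetical sample lines; the real input would come from
# open('a.txt').read().splitlines() and open('shell.txt').read().splitlines().
a_lines = ["http://example.com/|x|y", "http://foo.org/page|x|y"]
shell_lines = [
    "u|p|http://example.com/shell.php",
    "u|p|http://other.net/shell.php",
    "u|p|https://foo.org/a/b.php",
]
for line in matching_lines(a_lines, shell_lines):
    print(line)
```

If the real files use a different layout, only the field indexes and the pattern should need adjusting.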
