Question

I have two files: file1 and file2. file2 contains everything in file1, plus more. For example:

file1:
data1/111 
data2/222 
data3/333 

file2:
data1/111 \ewr\xcgf\wer 54645623456.xml
data23/42234 \asdqw\aqerf 23525.xml
data2/222 \asd\qwe 234234.xml
data66/2331 \a53\fdf355 12312333311.xml
data3/333 \from\where 123123.xml
data4/444 \xcv\sdf\ghf 98546.xml 
and MANY more...

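So I am trying to print, from file2, the lines that exist in both files. That means the printed output must include the extra data on each line, such as the path and the XML file name.

I tried: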

lines1 = open(path1).readlines()
lines2 = open(path2).readlines()

for i in lines1:
    for j in lines2:
        if i in j:
            print(j.rstrip())

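This prints all the lines in lines2, but what I want is this: search for the first line of lines1 in lines2, and if it is found anywhere in lines2, print that line from lines2. After that, it should do the same for the second line of lines1, and so on.

Can anyone help?

Thanks for your time.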

Answer 1

lines1 = open(path1).readlines()
lines2 = open(path2).readlines()

for l1 in lines1:
    # True only when l1 is identical to a whole line of lines2
    if l1 in lines2:
        print(l1.rstrip())

Or using a list comprehension:

lines1 = open(path1).readlines()
lines2 = open(path2).readlines()
print([line for line in lines1 if line in lines2])
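
Note that `line in lines2` only matches when a whole line of file1 is identical to a whole line of file2. In the sample data from the question, the file2 lines carry extra fields, so one possible adaptation is to compare on the first whitespace-separated field instead. A minimal sketch, assuming the first field is the key; `matching_lines` is a hypothetical helper name, not part of the answer above:

```python
def matching_lines(lines1, lines2):
    """Return the lines of lines2 whose first field appears as a line in lines1."""
    # Assumption: each file1 line holds a single key such as "data1/111".
    keys = {line.strip() for line in lines1 if line.strip()}
    result = []
    for line in lines2:
        fields = line.split(None, 1)  # split off the first whitespace-separated field
        if fields and fields[0] in keys:
            result.append(line.rstrip())
    return result


if __name__ == "__main__":
    # Inline sample data instead of real files, so the sketch runs on its own.
    file1 = ["data1/111 \n", "data2/222 \n", "data3/333 \n"]
    file2 = [
        "data1/111 \\ewr\\xcgf\\wer 54645623456.xml\n",
        "data23/42234 \\asdqw\\aqerf 23525.xml\n",
        "data2/222 \\asd\\qwe 234234.xml\n",
    ]
    for line in matching_lines(file1, file2):
        print(line)
```

Because the keys are kept in a set, each file2 line is checked in roughly constant time, and the output follows the order of file2 rather than file1.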

Answer 2

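The question isn't very clear, but if you know that both files have the same lines, with more data on some of them in file2, you can do the following for an O(n) solution: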

lines1 = open(path1).readlines()
lines2 = open(path2).readlines()

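# zip pairs the files line by line, so this assumes both files list the same entries in the same order
for line1, line2 in zip(lines1, lines2):
    if line1 != line2:
        print(line2.rstrip())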

Answer 3

I have a cross-check solution:

lines1 = open(path1).readlines()
lines2 = open(path2).readlines()

for i in lines1:
    for j in lines2:
        if j.startswith(i.rstrip()):
            print(j.rstrip())
            break

What this does: it takes one line from lines1 and searches for it in all the lines of lines2. The break prevents duplicates.
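
Two edge cases are worth guarding against here. First, a blank line in file1 strips down to an empty string, and every string starts with (and contains) the empty string. That may also be why the original nested loop printed every line, although this is a guess since the raw files are not shown. Second, a key such as `data2/222` is also a prefix of `data2/2223`. A minimal sketch of the same cross-check with both guards, where `cross_check` is a hypothetical function name and whitespace after the key is assumed to be the separator:

```python
def cross_check(lines1, lines2):
    """For each key in lines1, return the first line of lines2 that starts with it."""
    found = []
    for raw_key in lines1:
        key = raw_key.strip()
        if not key:
            # Skip blank lines: every string starts with "", so a blank key
            # would match the first line of lines2.
            continue
        for line in lines2:
            rest = line[len(key):]
            # Require end of line or whitespace after the key, so that
            # "data2/222" does not match "data2/2223 ...".
            if line.startswith(key) and (not rest or rest[0].isspace()):
                found.append(line.rstrip())
                break  # stop at the first match for this key
    return found
```

The results come back in file1 order, with at most one file2 line per key, which matches the behaviour of the loop above.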

How to compare two files and extract some data with Python
