Python RE "re.findall"

Date: 2014-01-13 20:17:30

Tags: python regex

Thanks in advance. My question is:

I have a piece of Python code in which I am trying to use os.walk, re and re.findall to find all the IP addresses in several files, for example:

file1:192.168.3.1
file1:192.168.3.2
file1:mary had a little lamb
file1:192.168.3.3
file1:192.168.3.11
file1:10.255.3.1

file10:192.168.3.1
file10:192.168.3.2
file10:192.168.3.3
file10:192.168.3.4
file10:192.168.3.11
file10:192.168.1.1
file10:10.255.3.1

file2:192.168.3.1
file2:192.168.3.2
file2:192.168.3.3
file2:192.168.3.4
file2:192.168.3.11
file2:192.168.1.1
file2:10.255.3.1

file3:192.168.3.1
file3:192.168.3.2
file3:192.168.3.3
file3:192.168.3.4
file3:192.168.3.11
file3:192.168.1.1
file3:10.255.3.1

and so on. My code block:

for subdir, dirs, files in os.walk('.'):
  for file in files:
    matches = re.findall(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", open(file, "r").read())
    if matches:
        print "Here is what is inside %s = %s" % (file,matches[0])

It only ever lists one particular IP, for example:

Here is what is inside file3 = 192.168.3.1
Here is what is inside file6 = 192.168.3.1
Here is what is inside file7 = 192.168.3.1
Here is what is inside file1 = 192.168.3.1
Here is what is inside file9 = 192.168.3.1
Here is what is inside file5 = 192.168.3.1
Here is what is inside file8 = 192.168.3.1
Here is what is inside file10 = 192.168.3.1
Here is what is inside file4 = 192.168.3.1

Thinking my regex was incorrect, I tested it with http://gskinner.com/RegExr/

The regex worked on the data I supplied to the site: it found all of the IP addresses. What am I doing wrong, and why does re.findall not accept the regex I tested?

3 answers:

Answer 0 (score: 6)

You are only printing one match:

if matches:
    print "Here is what is inside %s = %s" % (file,matches[0])

rather than all of them:

if matches:
    for match in matches:
        print "Here is what is inside %s = %s" % (file,match)
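A fuller sketch of the same idea, assuming Python 3 (the code in the question uses the Python 2 print statement). The names find_ips and IP_PATTERN are hypothetical. It also joins each file name to the directory reported by os.walk, which the question's code does not do and which matters as soon as files sit in subdirectories.

```python
import os
import re

# Same pattern as in the question; it matches IP-like strings and does not
# check that each octet is in the range 0-255.
IP_PATTERN = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def find_ips(root):
    """Return {path: [every IP-like string in that file]} for files under root."""
    found = {}
    for subdir, dirs, files in os.walk(root):
        for name in files:
            # os.walk yields bare file names, so build the full path first.
            path = os.path.join(subdir, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    matches = IP_PATTERN.findall(handle.read())
            except OSError:
                continue  # skip files that cannot be opened or read
            if matches:
                found[path] = matches  # keep the whole list, not matches[0]
    return found
```

Calling find_ips(".") and looping over each returned list prints every address per file instead of only the first.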

Answer 1 (score: 1)

You are only printing the first match and, at least for the part of the data set you have shown, the first entry is always 192.168.3.1.

Perhaps you want to print all of the matches? You can do that with:
print '\n'.join(matches) 
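To see the difference on the file1 sample from the question, here is a small sketch written for Python 3 (print as a function); the variable names are only illustrative.

```python
import re

# Contents of file1 from the question.
text = """192.168.3.1
192.168.3.2
mary had a little lamb
192.168.3.3
192.168.3.11
10.255.3.1
"""

# findall returns every non-overlapping match as a list of strings.
matches = re.findall(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", text)

print(matches[0])          # only the first match
print("\n".join(matches))  # every match, one per line
```

The regex already finds all five addresses; indexing with matches[0] is what discards the rest.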

Answer 2 (score: 0)

Could it be that you are only matching the first line? Try adding the /m flag (re.MULTILINE in Python) to the regex:

pattern = re.compile("whatever",re.MULTILINE)

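Also note that if the pattern you are matching contains groups, findall returns the groups rather than the whole match: a list of strings for a single group, or a list of tuples for more than one group.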
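A quick check of both points, as a sketch with illustrative variable names. re.MULTILINE only changes how ^ and $ behave, so for the question's pattern, which has no anchors, the flag should make no difference; groups, on the other hand, do change what findall returns.

```python
import re

text = "192.168.3.1\n10.255.3.1\n"

no_groups = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
one_group = r"(\d{1,3})\.\d{1,3}\.\d{1,3}\.\d{1,3}"
four_groups = r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})"

plain = re.findall(no_groups, text)                    # whole matches
multiline = re.findall(no_groups, text, re.MULTILINE)  # same result, no anchors involved
first_octets = re.findall(one_group, text)             # one group: list of strings
octets = re.findall(four_groups, text)                 # several groups: list of tuples

print(plain == multiline)
print(first_octets)
print(octets)
```

If the whole address is wanted while still using parentheses, non-capturing groups of the form (?:...) keep findall returning full matches.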