How to read all IP addresses in a text file and iterate over each IP address only once

Time: 2016-10-05 01:14:46

Tags: python python-3.x

I get an error message on the re.search line.

def search(pattern, string, flags=0):
    """Scan through string looking for a match to the pattern, returning
    a match object, or None if no match was found."""
    return _compile(pattern, flags).search(string)

Code:

def IP():
    file = open('mbox.txt' , 'r' )
    count = 0
    for line in file:
        address = re.search(r"\b\d{1,3}\. \d{1,3}\. d{1,3}\. \d{1,3}\b", file)
        for line in address:
            ip = address
            if line != allIPS:
                ip.add(ip)
                ip.add('\n')
                count = count +1
    return (count)

def main():
    #global statement for fhand
    print("This program does the following: ")
    print("The sum of lines in the file: %d " % ( lineCount()))
    print("The number of messages in the file: %d " % ( MsgCount()))
    print("All IP Addresses: %d\n " % ( IP()  ))

if __name__ == '__main__':
    main()

2 answers:

Answer 0: (score: 1)

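Your regular expression contains extra spaces, which prevent it from matching dotted IPv4 addresses, and there are problems with how you try to iterate over the lines. Try this: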

import re

def IP():
    # use a set to maintain unique addresses. no need to check if the
    # address is in the set because duplicates are automatically removed
    # during add.
    allIPS = set()

    # open with a context manager for automatic close. You don't need to
    # specify the mode because "r" is the default.
    with open('mbox.txt') as myfile:

        # now iterate the lines
        for line in myfile:

            # use findall to find all matches in the line and
            # update the set
            allIPS.update(
                re.findall(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b",
                    line))

    # it seems like all you care about is the number of unique addresses
    return len(allIPS)
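
To see why the extra spaces matter, here is a minimal sketch that runs the pattern from the question and the corrected pattern against the same text. The sample line and the variable names are made up for illustration; they are not taken from mbox.txt.

```python
import re

# Hypothetical sample line, loosely modeled on a mail header.
sample = "Received: from host (host [192.168.0.1]) by 10.0.0.5; 192.168.0.1"

# Pattern from the question: a literal space follows each dot, and the
# third group is missing its backslash, so d{1,3} matches the letter "d".
broken = r"\b\d{1,3}\. \d{1,3}\. d{1,3}\. \d{1,3}\b"

# Corrected pattern from the answer above.
fixed = r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

broken_matches = re.findall(broken, sample)
fixed_matches = re.findall(fixed, sample)

# A set keeps each address once, however often it appears.
unique_ips = set(fixed_matches)

print(broken_matches)   # []
print(fixed_matches)    # ['192.168.0.1', '10.0.0.5', '192.168.0.1']
print(len(unique_ips))  # 2
```

Note that the corrected pattern only checks the shape of the address. It would also accept something like 999.999.999.999, so a range check on each octet may be worth adding if the input is not trusted.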

Answer 1: (score: 0)

I found one bug: you need to pass line, not file, to re.search().
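
A small sketch of that fix, assuming an in-memory io.StringIO object as a stand-in for the opened mbox.txt (the contents are invented for illustration). Passing the file object to re.search raises a TypeError, while passing each line works:

```python
import io
import re

PATTERN = r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"

# Stand-in for open('mbox.txt'); the contents are made up.
fake_file = io.StringIO(
    "from 10.1.2.3\n"
    "no address here\n"
    "from 10.1.2.3 via 172.16.0.9\n"
)

# Passing the file object itself is what fails in the question:
# re.search expects a string (or bytes), not a file.
try:
    re.search(PATTERN, fake_file)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Passing each line (a str) works. re.search returns only the first
# match in the line, or None when there is no match.
first_matches = []
for line in fake_file:
    match = re.search(PATTERN, line)
    if match:
        first_matches.append(match.group())

print(error_name)     # TypeError
print(first_matches)  # ['10.1.2.3', '10.1.2.3']
```

Because re.search stops at the first match, the second address on the last line (172.16.0.9) is not picked up here. re.findall returns every match in the line, which is why it suits collecting all addresses.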