Question

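I need to count how many times each of several text patterns occurs in a log file and store the counts in a dictionary.

My problem is that my code is counting all the entries in the file toward each of the text patterns.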

What am I doing wrong?

The log file looks like this:

>Feb  1 00:00:02 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0 PHYSOUT=eth1 SRC=XXX.XXX.XXX.XXX DST=XXX.XXX.XXX.XXX LEN=40 TOS=0x00 PREC=0x00 TTL=110 ID=12973 PROTO=TCP SPT=220 DPT=6129 WINDOW=16384 RES=0x00 SYN URGP=0
>Feb  1 00:00:02 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0 PHYSOUT=eth1 SRC=XXX.XXX.XXX.XXX DST=XXX.XXX.XXX.XXX LEN=40 TOS=0x00 PREC=0x00 TTL=113 ID=27095 PROTO=TCP SPT=220 DPT=6129 WINDOW=16384 RES=0x00 SYN URGP=0

My code currently looks like this:

#!/usr/bin/python3

import sys
import os
import re
from collections import defaultdict

tipos={}
p= re.compile ('bridge kernel:.*:')
with open (sys.argv[1], 'r') as f:
    for line in f:
        match = p.search(line)
        if match:
            taux=(line.split(":") [3])
            tipos[taux]=1
print (tipos)

The code doesn't raise an error, but all the keys have the same value.

I've read about defaultdict and Counter, but I couldn't get them to work.

Please help me.

Answer 1

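As for your version of the code, you never increment the count for taux in tipos, so every value will be 1. And yes, a defaultdict would help, because it automatically creates missing dictionary entries using the type you pass in. The general defaultdict counting pattern looks like this: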

from collections import defaultdict

a = defaultdict(int)
a['asdf'] += 1
# a['asdf'] will now be 1, since it updates from 0

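Edit: to incorporate @Jean-François Fabre's comment, I'd like to point out that the collections module comes with a class designed specifically for counting anything hashable: Counter. From the looks of it, it relies on most of the same backend, so performance should be similar, but it comes with some nice little extras (such as the most_common(number_of_most_common_elements) method). It can be used just like a defaultdict, but without the dedicated (int) argument: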

from collections import Counter

a = Counter()
a['asdf'] += 1
# a['asdf'] will now be 1, since it updates from 0
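
To illustrate the extras mentioned above, here is a minimal sketch of Counter on a made-up list of labels (the sample data is hypothetical and not taken from the log in the question). It shows that a Counter can be built directly from an iterable and that most_common(n) returns the n most frequent items together with their counts.

```python
from collections import Counter

# Hypothetical sample data, only for illustration.
labels = [
    "INBOUND TCP", "INBOUND UDP", "INBOUND TCP",
    "OUTBOUND TCP", "INBOUND TCP", "INBOUND UDP",
]

# Counter counts every hashable item of an iterable in one call.
counts = Counter(labels)

# A missing key reports 0 instead of raising KeyError,
# and looking it up does not add it to the Counter.
missing = counts["INBOUND ICMP"]

# most_common(n) returns a list of (item, count) pairs, highest count first.
top_two = counts.most_common(2)
print(top_two)
```

Building the Counter from the iterable replaces the explicit loop with += 1, which is handy when the keys are already available as a list or generator.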

In general, the callable you pass in is what produces the default value for a missing key. This means you can also do the following:

from collections import defaultdict

a = defaultdict(int)
print(a['asdf'])  # will print 0
a = defaultdict(float)
print(a['asdf'])  # will print 0.0
a = defaultdict(list)
# will print []; particularly useful if you want a dict of lists,
# since you don't need to check whether your key already exists in the dict
print(a['asdf'])
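
The dict-of-lists case can be sketched as follows. The (type, port) pairs are hypothetical values chosen only to show the grouping; they are not parsed from the real log.

```python
from collections import defaultdict

# Hypothetical (type, destination port) pairs, only for illustration.
events = [("INBOUND TCP", 6129), ("INBOUND UDP", 53), ("INBOUND TCP", 445)]

ports_by_type = defaultdict(list)
for kind, port in events:
    # No "if kind not in ports_by_type" check is needed: a missing key
    # is created with an empty list the first time it is accessed.
    ports_by_type[kind].append(port)

print(dict(ports_by_type))
```

One thing to keep in mind: merely reading a missing key from a defaultdict also inserts it, so use the in operator or get() when you only want to test whether a key is present.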

As for your code, this means you want:

tipos=defaultdict(int)
p= re.compile ('bridge kernel:.*:')
with open (sys.argv[1], 'r') as f:
    for line in f:
        match = p.search(line)
        if match:
            taux=(line.split(":") [3])
            tipos[taux]+=1
print (tipos)
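
As a possible variation, and not what the original code does, the same count can be sketched with a Counter and a capture group in the regular expression. The function name count_types and the shortened sample lines are assumptions made for this illustration. Reading the label from the match also avoids the leading space that line.split(":")[3] leaves in the key.

```python
import re
from collections import Counter

# Same idea as the pattern in the question, but with a capture group
# so the label can be read from the match instead of split(":").
PATTERN = re.compile(r"bridge kernel: ([^:]+):")


def count_types(lines):
    """Count how often each label after 'bridge kernel:' occurs."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened, hypothetical log lines in the format shown in the question.
sample = [
    "Feb  1 00:00:02 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0",
    "Feb  1 00:00:02 bridge kernel: INBOUND TCP: IN=br0 PHYSIN=eth0 OUT=br0",
    "Feb  1 00:00:03 bridge kernel: INBOUND UDP: IN=br0 PHYSIN=eth0 OUT=br0",
    "Feb  1 00:00:04 bridge sshd: unrelated line",
]
result = count_types(sample)
print(result)
```

Because count_types accepts any iterable of lines, an open file object can be passed to it directly in place of the sample list.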

Answer 2

You want to use a defaultdict:

tipos = defaultdict(int)
p= re.compile ('bridge kernel:.*:')
with open (sys.argv[1], 'r') as f:
    for line in f:
        match = p.search(line)
        if match:
            taux=(line.split(":") [3])
            tipos[taux] += 1
print (tipos)

You imported it there, but you didn't use it.
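
For comparison, here is a minimal sketch of the same counting with a plain dict, which shows what defaultdict saves you. The labels list is hypothetical sample data standing in for the values extracted from the log.

```python
# Hypothetical labels, standing in for the values extracted from the log.
labels = ["INBOUND TCP", "INBOUND UDP", "INBOUND TCP"]

tipos = {}
for taux in labels:
    # tipos[taux] += 1 would raise KeyError the first time a label is seen;
    # get() supplies 0 as the starting value for labels not stored yet.
    tipos[taux] = tipos.get(taux, 0) + 1

print(tipos)
```

With defaultdict(int) the get() call becomes unnecessary, because the missing key is created with the value 0 automatically.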
