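I have TXT and CSV files containing attempted login usernames and other information. I want to count how many times certain usernames were tried and, more generally, how many times each word occurs in the file. For example: <hostname> = 12, ssh2 = 6, and so on.
A Python script would be perfect.
Sample (key information has been changed to "ip" and "stuff"):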
sshd|XXX.XX.XX.XXX|1587574870|{"matches": ["Apr 22 18:53:46 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:53:48 <hostname> sshd[****]: Failed password for invalid user pengjing from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:55:14 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:55:15 <hostname> sshd[****]: Failed password for invalid user git from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:56:42 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:56:44 <hostname> sshd[****]: Failed password for invalid user test from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:58:14 <hostname> sshd[****]: Failed password for root from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:59:44 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:59:46 <hostname> sshd[****]: Failed password for invalid user za from XXX.XX.XX.XXX port **** ssh2", "Apr 22 19:01:09 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 19:01:10 <hostname> sshd[****]: Failed password for invalid user yw from XXX.XX.XX.XXX port **** ssh2"], "failures": 18, "mlfid": " <hostname> sshd[****]: ", "user": "root", "ip4": "XXX.XX.XX.XXX"}
Answer 0: (score: 0)
Here is how you can use the str.count() method:
s = """sshd|XXX.XX.XX.XXX|1587574870|{"matches": ["Apr 22 18:53:46 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:53:48 <hostname> sshd[****]: Failed password for invalid user pengjing from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:55:14 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:55:15 <hostname> sshd[****]: Failed password for invalid user git from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:56:42 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:56:44 <hostname> sshd[****]: Failed password for invalid user test from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:58:14 <hostname> sshd[****]: Failed password for root from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:59:44 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:59:46 <hostname> sshd[****]: Failed password for invalid user za from XXX.XX.XX.XXX port **** ssh2", "Apr 22 19:01:09 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 19:01:10 <hostname> sshd[****]: Failed password for invalid user yw from XXX.XX.XX.XXX port **** ssh2"], "failures": 18, "mlfid": " <hostname> sshd[****]: ", "user": "root", "ip4": "XXX.XX.XX.XXX"}"""
print(s.count('ssh2'))
print(s.count('<hostname>'))
Output:
6
12
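One caveat with str.count(): it counts substring matches, so a short keyword such as 'ssh' is also counted inside 'sshd' and 'ssh2'. If whole words are wanted, a regular expression with lookarounds is one option. The following is only a minimal sketch; the shortened `line` string and the `count_word` helper are made up for illustration and are not part of the original data or answer.

```python
import re

# Hypothetical, shortened sample in the same style as the log above.
line = ('Apr 22 18:53:48 <hostname> sshd[****]: Failed password for invalid '
        'user git from XXX.XX.XX.XXX port **** ssh2 tty=ssh')

def count_word(text, word):
    """Count occurrences of `word` that are not part of a longer word."""
    # re.escape keeps characters such as '<', '*' or '.' literal.
    pattern = r'(?<!\w)' + re.escape(word) + r'(?!\w)'
    return len(re.findall(pattern, text))

print(line.count('ssh'))        # substring matches: sshd, ssh2 and tty=ssh
print(count_word(line, 'ssh'))  # only the standalone 'ssh' in tty=ssh
```

For keywords that never appear inside longer words, such as '<hostname>', both approaches give the same number.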
Update:
from collections import Counter
from re import findall
with open('file.txt', 'r') as f:
    # \S+ captures exactly one username; a greedy .* would run on to the last match on the same line.
    print(Counter(findall(r'(?<=Failed password for invalid user )\S+(?= from XXX\.XX\.XX\.XXX port \*\*\*\* ssh2)', f.read())))
Output:
Counter({'pengjing': 1,
'git': 1,
'test': 1,
'za': 1,
'yw': 1})
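Note that this pattern only sees attempts marked "invalid user", so the failed attempt for root does not appear in the result, and it relies on the masked IP and port being literally the same in every message. A possible variation, sketched under the assumption that every line of the file has the `service|ip|timestamp|json` layout of the sample with a valid JSON payload, is to decode the payload and make the "invalid user " prefix optional. The names `record`, `USER_RE` and `count_users` are hypothetical, and the record is a shortened version of the sample.

```python
import json
import re
from collections import Counter

# Hypothetical, shortened record in the same layout as the sample line.
record = ('sshd|XXX.XX.XX.XXX|1587574870|{"matches": ['
          '"Apr 22 18:53:48 <hostname> sshd[****]: Failed password for invalid user git from XXX.XX.XX.XXX port **** ssh2", '
          '"Apr 22 18:55:15 <hostname> sshd[****]: Failed password for invalid user git from XXX.XX.XX.XXX port **** ssh2", '
          '"Apr 22 18:58:14 <hostname> sshd[****]: Failed password for root from XXX.XX.XX.XXX port **** ssh2"], '
          '"failures": 3, "user": "root", "ip4": "XXX.XX.XX.XXX"}')

# "invalid user " is optional, so existing accounts such as root are counted too.
USER_RE = re.compile(r'Failed password for (?:invalid user )?(\S+) from ')

def count_users(lines):
    """Count attempted usernames over an iterable of record lines."""
    counts = Counter()
    for line in lines:
        # maxsplit=3 keeps any '|' inside the JSON payload intact.
        parts = line.rstrip('\n').split('|', 3)
        if len(parts) < 4:
            continue  # skip blank or differently shaped lines
        payload = json.loads(parts[3])
        for message in payload.get('matches', []):
            match = USER_RE.search(message)
            if match:
                counts[match.group(1)] += 1
    return counts

print(count_users([record]))
```

To run it on a file, pass the open file object, for example `with open('file.txt') as f: print(count_users(f))`. A line whose payload is not valid JSON would raise an error in this sketch.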
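Answer 1: (score: 0)
Add this logic to your code; it works once the file has been read. Replace the str variable with the one you already have. The text also has to be cleaned up by removing unneeded characters such as double quotes, square brackets and commas. You can add more replacements as needed.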
with open('input_file.txt', 'r') as file:
    str = file.read()  # note: this name shadows the built-in str type
# str = """sshd|XXX.XX.XX.XXX|1587574870|{"matches": ["Apr 22 18:53:46 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:53:48 <hostname> sshd[****]: Failed password for invalid user pengjing from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:55:14 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:55:15 <hostname> sshd[****]: Failed password for invalid user git from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:56:42 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:56:44 <hostname> sshd[****]: Failed password for invalid user test from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:58:14 <hostname> sshd[****]: Failed password for root from XXX.XX.XX.XXX port **** ssh2", "Apr 22 18:59:44 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 18:59:46 <hostname> sshd[****]: Failed password for invalid user za from XXX.XX.XX.XXX port **** ssh2", "Apr 22 19:01:09 <hostname> sshd[****]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=XXX.XX.XX.XXX", "Apr 22 19:01:10 <hostname> sshd[****]: Failed password for invalid user yw from XXX.XX.XX.XXX port **** ssh2"], "failures": 18, "mlfid": " <hostname> sshd[****]: ", "user": "root", "ip4": "XXX.XX.XX.XXX"} """
word_dict = {}
for k in str.split(" ") : word_dict[k.replace('"','').replace("]","").replace(",","")] = 0
print(word_dict)
# {'sshd|XXX.XX.XX.XXX|1587574870|{matches:': 0, '[Apr': 0, '22': 0, '18:53:46': 0, '<hostname>': 0, 'sshd[****:': 0, 'pam_unix(sshd:auth):': 0, 'authentication': 0, 'failure;': 0, 'logname=': 0, 'uid=0': 0, 'euid=0': 0, 'tty=ssh': 0, 'ruser=': 0, 'rhost=XXX.XX.XX.XXX': 0, 'Apr': 0, '18:53:48': 0, 'Failed': 0, 'password': 0, 'for': 0, 'invalid': 0, 'user': 0, 'pengjing': 0, 'from': 0, 'XXX.XX.XX.XXX': 0, 'port': 0, '****': 0, 'ssh2': 0, '18:55:14': 0, '18:55:15': 0, 'git': 0, '18:56:42': 0, '18:56:44': 0, 'test': 0, '18:58:14': 0, 'root': 0, '18:59:44': 0, '18:59:46': 0, 'za': 0, '19:01:09': 0, '19:01:10': 0, 'yw': 0, 'failures:': 0, '18': 0, 'mlfid:': 0, '': 0, 'user:': 0, 'ip4:': 0, 'XXX.XX.XX.XXX}': 0}
for i in word_dict.keys():
    counter = 0
    for j in str.split(" "):
        # Count every token that contains the word i as a substring.
        if i in j:
            counter += 1
    word_dict[i] = counter
print(word_dict["ssh2"])
# 6
print(word_dict["<hostname>"])
# 12
for k, v in word_dict.items():
    print("Word : ", k, " Occurrences : ", v)
# Word : sshd|XXX.XX.XX.XXX|1587574870|{matches: Occurrences : 0
# Word : [Apr Occurrences : 0
# Word : 22 Occurrences : 11
# Word : 18:53:46 Occurrences : 1
# Word : <hostname> Occurrences : 12
# Word : sshd[****: Occurrences : 0
# Word : pam_unix(sshd:auth): Occurrences : 5
# Word : authentication Occurrences : 5
# .
# .
# .
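The loop above counts tokens that contain a word, so a key such as 'user' also matches 'ruser=', and it rescans the whole text once per word. If exact matches are preferred, collections.Counter can tally every cleaned token in a single pass. This is only a sketch of that alternative, not the approach used in this answer; the shortened `text` sample and the set of stripped characters are assumptions for illustration.

```python
from collections import Counter

# Hypothetical, shortened sample text in the style of the log above.
text = ('"Apr 22 18:53:46 <hostname> sshd[****]: authentication failure; ruser= rhost=XXX", '
        '"Apr 22 18:53:48 <hostname> sshd[****]: Failed password for invalid user git"')

# Remove surrounding quotes, brackets, braces and commas from each token,
# then count exact tokens in one pass; empty tokens are dropped.
tokens = (t.strip('",[]{}') for t in text.split())
word_counts = Counter(t for t in tokens if t)

print(word_counts['<hostname>'])  # exact matches of <hostname>
print(word_counts['user'])        # does not include 'ruser='
```

Because strip() only removes characters at the ends of a token, 'sshd[****]:' stays intact here, whereas the replace() calls above turn it into 'sshd[****:'.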