Question

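I am new to programming. I use PowerShell to filter records from a remote server's Windows Security event log and write them to a text file. I then use a Python script to count how many times each user name appears in that text. When I run the script against the original text file, Python prints an empty dictionary: {}. However, if I copy the contents of the text file, paste them into a new text file, and run the same script against that, it returns the correct counts: {'name1': 2, 'name2': 13, 'name3': 1, 'name4': 1, 'name5': 2, 'name6': 2}. The two text files look identical, with the characters in the same positions. What could the issue be?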

PowerShell

Get-WinEvent -LogName "Security" -ComputerName server01 | Where-Object {$_.ID -eq 4663} | where Message -CNotLike "*name1*" | where Message -CNotLike "*name2*" | Format-List -Property * | Out-File "C:\apowershell\winsec\events.txt"

Python

fhand = open('events2.txt')
counts = dict()
for line in fhand:
    if line.startswith('            Account Name:'):
        words = line.split()
        words.remove('Account')
        words.remove('Name:')
        for word in words:
            if word not in counts:
                counts[word] = 1
            else:
                counts[word] += 1
print(counts)

Log record

Message: An attempt was made to access an object.

      Subject:
        Security ID:        S-1-5-21-495698755-754321212-623647154-4521
        Account Name:       name1
        Account Domain:     companydomain
        Logon ID:       0x8CB9C5024

      Object:
        Object Server:      Security
        Object Type:        File
        Object Name:        e:\share\file.txt
        Handle ID:      0x439c
        Resource Attributes:    S:PAI

      Process Information:
        Process ID:     0x2de8
        Process Name:       C:\Windows\System32\memshell.exe

      Access Request Information:
        Accesses:       Execute/Traverse

        Access Mask:        0x20

Answer 1

The answer is in your problem statement. You are reading a file created on MS Windows with a Python program that is (probably) running on a non-Windows system.

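The problem is that the character encoding of the original file does not match what the Python program expects. Specifically, the original file is encoded as UCS-2 (or UTF-16). If you are running your Python code on a UNIX-like OS, it probably expects UTF-8, but that depends on your locale; look at the output of the locale command. Search for "python utf-16 decode" for ideas on how to deal with this. Personally, though, I would try to find a way to convert the content to UTF-8 on the Windows system rather than trying to get the Python program to handle UTF-16.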
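If you do want to handle it on the Python side, here is a minimal sketch, assuming the original file starts with a UTF-16 byte order mark (BOM) and the pasted copy is UTF-8. The helper names sniff_encoding and count_account_names and the demo file names are made up for illustration. Unlike the script in the question, it counts everything after "Account Name:" as a single name instead of counting each word separately.

```python
import codecs
import os
import tempfile
from collections import Counter


def sniff_encoding(path, default="utf-8"):
    """Guess a text encoding from the file's byte order mark (BOM).

    Hypothetical helper: handles only UTF-8 and UTF-16 BOMs (not UTF-32)
    and falls back to `default` when no BOM is present.
    """
    with open(path, "rb") as raw:
        head = raw.read(4)
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM and picks the byte order itself.
        return "utf-16"
    return default


def count_account_names(path):
    """Count the values found on 'Account Name:' lines."""
    counts = Counter()
    with open(path, encoding=sniff_encoding(path)) as fhand:
        for line in fhand:
            stripped = line.strip()
            if stripped.startswith("Account Name:"):
                # Everything after the label is treated as the account name.
                name = stripped[len("Account Name:"):].strip()
                if name:
                    counts[name] += 1
    return dict(counts)


# Self-contained demo: the same text saved as UTF-16 and as UTF-8.
sample = (
    "Subject:\n"
    "    Account Name:       name1\n"
    "    Account Name:       name2\n"
    "    Account Name:       name1\n"
)
with tempfile.TemporaryDirectory() as tmp:
    utf16_path = os.path.join(tmp, "events_utf16.txt")
    utf8_path = os.path.join(tmp, "events_utf8.txt")
    with open(utf16_path, "w", encoding="utf-16") as out:
        out.write(sample)
    with open(utf8_path, "w", encoding="utf-8") as out:
        out.write(sample)
    utf16_counts = count_account_names(utf16_path)
    utf8_counts = count_account_names(utf8_path)

print(utf16_counts)  # {'name1': 2, 'name2': 1}
print(utf8_counts)   # {'name1': 2, 'name2': 1}
```

To convert on the Windows side instead: in Windows PowerShell 5.1, Out-File writes UTF-16LE by default, and adding -Encoding utf8 to the Out-File call makes it write UTF-8.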
