python: UnicodeDecodeError: 'utf-8' codec can't decode byte

Time: 2018-08-06 20:23:49

Tags: python-3.x python-unicode

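I am trying to read the latest jstack file and count occurrences of "RUNNABLE", "BLOCKED", and "TIMED_WAITING". This worked before, but after a few runs and some changes to the word list it stopped working, and I started seeing the error below in the output. I tried opening the file with encoding='utf-8' but got the same error. Opening it with encoding='ISO-8859-1' works, but the counts are wrong.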

import os

def wordcount(filename, listwords):
    try:
        # Attempts with an explicit encoding:
        # file = open(filename, encoding='ISO-8859-1')
        # file = open(filename, encoding='utf-8')
        file = open(filename, "r")
        read = file.readlines()
        file.close()

        for word in listwords:
            # lower = word.lower()
            count = 0
            for sentence in read:
                # Compare each whitespace-separated token, upper-cased
                line = sentence.split()
                for each in line:
                    line2 = each.upper()
                    # line2 = line2.strip("java.lang.Thread.State: ")
                    if word == line2:
                        count += 1

            print(word, ":", count)
    except FileNotFoundError:  # open() raises this when the file is missing
        print("Thread dump is not there")

path = '/Users/YEscobar/Desktop/jstack'
filePath = [os.path.join(path, fname) for fname in os.listdir(path)]
lastFile = sorted(filePath, key=os.path.getctime)[-1]


wordcount (lastFile,["RUNNABLE","BLOCKED", "TIMED_WAITING"])
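One thing worth ruling out is that `lastFile` is not the log that was grepped: `os.listdir` returns every entry in the directory, and on Unix `os.path.getctime` is the time of the last metadata change, not the creation time. The following is a minimal sketch of a stricter selection. The helper name `latest_jstack_log` and the pattern `jstack.*.log` are assumptions, the latter inferred from the file name used in the grep commands.

```python
import glob
import os


def latest_jstack_log(directory, pattern="jstack.*.log"):
    """Return the most recently modified file matching pattern, or None.

    Hypothetical helper: the default pattern is an assumption based on
    the name jstack.20180802-202002.log.
    """
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        return None
    # getmtime is the last content modification; on Unix, getctime is the
    # last metadata change rather than the creation time.
    return max(candidates, key=os.path.getmtime)
```

Printing the returned path before counting makes it easy to confirm that the script and grep are looking at the same file.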

Console output

/Users/YEscobar/.virtualenvs/python_workstation1/bin/python /Users/YEscobar/Library/Preferences/PyCharmCE2018.2/scratches/test6.py
Traceback (most recent call last):
 File "/Users/YEscobar/Library/Preferences/PyCharmCE2018.2/scratches/test6.py", line 32, in <module>
   wordcount (lastFile,["RUNNABLE","BLOCKED","TIMED_WAITING"])
 File "/Users/YEscobar/Library/Preferences/PyCharmCE2018.2/scratches/test6.py", line 9, in wordcount
   read = file.readlines()
 File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/codecs.py", line 321, in decode
   (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdb in position 20: invalid continuation byte
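The traceback says byte 0xdb at position 20 is not valid UTF-8, so the file that was opened contains non-UTF-8 bytes near its start. ISO-8859-1 maps every byte to a character, so decoding with it never fails, but that alone does not show the text is being interpreted correctly. Below is a minimal sketch for inspecting the raw bytes and for reading the file without raising. Both function names are hypothetical, and whether the file really is a plain-text thread dump is not known from the post.

```python
def show_bytes_around(path, position, width=16):
    """Return the raw bytes surrounding a byte offset, for inspection."""
    with open(path, "rb") as handle:
        raw = handle.read()
    start = max(0, position - width)
    return raw[start:position + width]


def read_text_lenient(path, encoding="utf-8"):
    """Decode a file, replacing undecodable bytes with U+FFFD."""
    with open(path, "rb") as handle:
        raw = handle.read()
    return raw.decode(encoding, errors="replace")
```

Calling `show_bytes_around(lastFile, 20)` would show what sits at the offset named in the error. If the surrounding bytes do not look like thread-dump text, the file is probably not the expected log.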

Console output with the encoding='ISO-8859-1' line uncommented

RUNNABLE : 2
BLOCKED : 0
TIMED_WAITING : 3

Grep on the console

grep -o RUNNABLE jstack.20180802-202002.log | wc -l
      14
grep -o BLOCKED jstack.20180802-202002.log | wc -l
      0
grep -o TIMED_WAITING jstack.20180802-202002.log | wc -l
      24
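The two counts are not measuring the same thing. `grep -o WORD | wc -l` counts every case-sensitive occurrence of the substring, while the script counts whitespace-separated tokens that equal the word after upper-casing. A minimal sketch of a substring count that is closer to the grep behaviour follows. The function name `count_occurrences` and the sample text are illustrative assumptions, not taken from the real dump.

```python
def count_occurrences(text, words):
    """Count non-overlapping, case-sensitive substring matches per word."""
    return {word: text.count(word) for word in words}


# Made-up sample resembling thread-dump lines (illustrative only).
sample = (
    "   java.lang.Thread.State: RUNNABLE\n"
    "   java.lang.Thread.State: TIMED_WAITING (parking)\n"
    "   java.lang.Thread.State: TIMED_WAITING (sleeping)\n"
)
counts = count_occurrences(sample, ["RUNNABLE", "BLOCKED", "TIMED_WAITING"])
for word, count in counts.items():
    print(word, ":", count)
```

If this still disagrees with grep on the real file, the difference most likely lies in which file is read or how it is decoded, not in the counting loop.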

0 answers:

No answers