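I am trying to read a CSV file with dask (and also pandas), but I get the error below. I tried changing the encoding, but nothing seemed to work. However, when I use "Save As" in Excel and save the file as CSV UTF-8, the code starts to work. I tried the same approach with pandas and got the same error. I also tried explicitly specifying the encoding as utf-16, but got an error asking me to use utf-16-le or utf-16-be. When I use either of those, I still get an error.
Is there something wrong with the CSV file I am using?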
import chardet
import dask.dataframe as dd

# Mar_N_W holds the path to the CSV file
with open(Mar_N_W, 'rb') as f:
    result = chardet.detect(f.read())  # guess the encoding from the raw bytes
Mar_NW = dd.read_csv(Mar_N_W, encoding=result['encoding'], sep=None)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\parsers.py in _next_iter_line(self, row_num)
2693
2694 try:
-> 2695 return next(self.data)
2696 except csv.Error as e:
2697 if self.warn_bad_lines or self.error_bad_lines:
~\AppData\Local\Continuum\anaconda3\lib\codecs.py in decode(self, input, final)
320 # decode input (taking the buffer into account)
321 data = self.buffer + input
--> 322 (result, consumed) = self._buffer_decode(data, self.errors, final)
323 # keep undecoded input until the next call
324 self.buffer = data[consumed:]
~\AppData\Local\Continuum\anaconda3\lib\encodings\utf_16.py in _buffer_decode(self, input, errors, final)
67 raise UnicodeError("UTF-16 stream does not start with BOM")
68 return (output, consumed)
---> 69 return self.decoder(input, self.errors, final)
70
71 def reset(self):
UnicodeDecodeError: 'utf-16-le' codec can't decode byte 0x0a in position 0: truncated data
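Since re-saving the file as CSV UTF-8 in Excel makes the code work, one possible workaround is to do the same conversion in Python before handing the file to dask. The following is only a minimal sketch, not a confirmed fix for this file: `sniff_bom` and `transcode_to_utf8` are hypothetical helper names, and it assumes the file starts with a byte-order mark (BOM), which I have not verified for this particular CSV.

```python
import codecs

# Known BOMs mapped to a codec that consumes the BOM while decoding.
# UTF-32 is checked first because the UTF-32-LE BOM begins with the
# same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(path):
    """Return a codec name if the file starts with a known BOM, else None."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def transcode_to_utf8(src, dst, encoding):
    """Rewrite src (decoded with `encoding`) as UTF-8 at dst, line by line."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src, "r", encoding=encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
```

If `sniff_bom(Mar_N_W)` returns a codec name, the file could be converted with `transcode_to_utf8` and the UTF-8 copy passed to `dd.read_csv` with `encoding='utf-8'`. If it returns `None`, the file has no BOM and the encoding would have to be determined some other way.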