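When I access this AWS CSV, I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 19: invalid start byte
I am currently ignoring the error, because the code below still seems to print every row. Also, how can I convert the printed rows into a pandas DataFrame? Thanks.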
import requests
from contextlib import closing
import csv

url = "http://google.com/MLI/data/devicefailure2.csv"
with closing(requests.get(url, stream=True)) as r:
    f = (line.decode('utf-8') for line in r.iter_lines())
    reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in reader:
        print(row)
Answer 0: (score: 1)
Try reading the CSV directly with pandas.
Passing an explicit encoding, as in pd.read_csv('file', encoding="ISO-8859-1"),
should solve the problem.
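
A minimal sketch of both steps, assuming the file really is in a single-byte encoding such as ISO-8859-1. The sample bytes, column names, and helper names (`parse_csv_bytes`, `to_dataframe`) are made up for illustration and are not taken from the actual file. It shows that a 0xff byte fails under UTF-8 but decodes under ISO-8859-1, and how the parsed rows could be handed to pandas.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded bytes (e.g. r.content).
# 0xff is not a valid UTF-8 start byte, but it is 'ÿ' in ISO-8859-1.
SAMPLE = b"device,failure\nS1F0\xff,0\nS1F1,1\n"


def parse_csv_bytes(raw, encoding="ISO-8859-1"):
    """Decode raw CSV bytes and return the rows as lists of strings."""
    text = raw.decode(encoding)
    # newline="" leaves line-ending handling to the csv module.
    buffer = io.StringIO(text, newline="")
    return list(csv.reader(buffer, delimiter=",", quotechar='"'))


def to_dataframe(rows):
    """Treat the first row as the header and build a DataFrame from the rest."""
    import pandas as pd  # imported here so the decoding step runs without pandas

    return pd.DataFrame(rows[1:], columns=rows[0])


rows = parse_csv_bytes(SAMPLE)
```

With pandas installed, `to_dataframe(rows)` turns the parsed rows into a DataFrame whose columns come from the first row. ISO-8859-1 maps every byte to a character, so decoding with it never raises. That also means it cannot confirm it was the right encoding, so the decoded values are worth inspecting.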