File "C:\Users\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1748, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas\_libs\parsers.c:10862)
File "pandas\_libs\parsers.pyx", line 912, in pandas._libs.parsers.TextReader._read_low_memory (pandas\_libs\parsers.c:11138)
File "pandas\_libs\parsers.pyx", line 989, in pandas._libs.parsers.TextReader._read_rows (pandas\_libs\parsers.c:12175)
File "pandas\_libs\parsers.pyx", line 1117, in pandas._libs.parsers.TextReader._convert_column_data (pandas\_libs\parsers.c:14136)
File "pandas\_libs\parsers.pyx", line 1169, in pandas._libs.parsers.TextReader._convert_tokens (pandas\_libs\parsers.c:14972)
File "pandas\_libs\parsers.pyx", line 1273, in pandas._libs.parsers.TextReader._convert_with_dtype (pandas\_libs\parsers.c:17119)
File "pandas\_libs\parsers.pyx", line 1289, in pandas._libs.parsers.TextReader._string_convert (pandas\_libs\parsers.c:17347)
File "pandas\_libs\parsers.pyx", line 1524, in pandas._libs.parsers._string_box_utf8 (pandas\_libs\parsers.c:23041)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe3 in position 43: invalid continuation byte
My code reads a simple CSV file, but I keep getting the error shown in the traceback above.
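Answer 0 (score: 1)
Your parser is trying to decode the data as UTF-8, but your file appears to be in a different encoding (or it may contain just a single invalid character). Try telling the parser to read plain ASCII, possibly with a code page (I don't know Python, so I can't help with the details).
It looks like you need to use the encoding parameter.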
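To make the diagnosis above concrete, here is a minimal sketch of why byte 0xe3 fails under UTF-8 but decodes fine under a single-byte Windows encoding. The sample text "São Paulo" and the helper name decode_or_none are assumptions chosen for illustration; the original file's contents are not known.

```python
# Hypothetical sample: "ã" is stored as the single byte 0xe3 in cp1252.
raw = "São Paulo".encode("cp1252")  # b'S\xe3o Paulo'


def decode_or_none(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# In UTF-8, 0xe3 starts a three-byte sequence, so the plain "o" after it
# is an invalid continuation byte and decoding fails.
utf8_result = decode_or_none(raw, "utf-8")

# In cp1252 every character here is one byte, so decoding succeeds.
cp1252_result = decode_or_none(raw, "cp1252")
```

If the file really is in such an encoding, passing the same name to read_csv through its encoding parameter should let pandas decode it.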
Answer 1 (score: 0)
Sorry I'm late to this. Change your code to the following and see if it works.
import pandas
df = pandas.read_csv("trial.csv", encoding="ISO-8859-1")
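Answer 2 (score: 0)
import pandas
# "rb" is a file mode for open(); read_csv's second positional argument is the separator.
with open("trial.csv", "rb") as f:
    df = pandas.read_csv(f)
If none of the suggestions above work, opening the file in binary mode ("rb") may succeed. Note that pandas still has to decode the bytes, so an encoding argument may be needed here as well.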
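Answer 3 (score: 0)
import pandas as pd
store = pd.read_csv('Super_Store.csv', encoding='windows-1252')
We just need to tell Python the file's actual encoding. After some trial and error, I found that it is windows-1252.
This is probably because the file was saved on a Windows machine at some point, where this is the default character encoding.
For details, see:
HTML Windows-1252 (ANSI) Reference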
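The trial and error described in this answer can be automated. The following is only a sketch: the function name first_working_encoding, the candidate list, and the temporary demo file are all assumptions for illustration, not part of the original answer.

```python
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file, else None."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes; try the next one
        return enc
    return None


# Demo on a hypothetical file written in cp1252 (stand-in for the real CSV).
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write("city,sales\nSão Paulo,10\n".encode("cp1252"))
    demo_path = tmp.name

detected = first_working_encoding(demo_path)
os.remove(demo_path)
```

The detected name can then be passed on, for example pd.read_csv(path, encoding=detected). A successful decode only shows that the bytes are valid in that encoding, not that it is the right one: latin-1 accepts every byte value, so it is listed last as a fallback.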