I can't read a CSV file in Jupyter Notebook. Here is the GitHub link to the CSV file:
https://github.com/roshanthokchom/new-assignment/blob/master/spam.csv
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB
import urllib
pd.read_csv('spam.csv',encoding='latin-1')
ParserError: Error tokenizing data. C error: Expected 2 fields in line 13, saw 4
Answer 0 (score: -1)
@Roshan, here is a way to solve your problem:
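import pandas as pd
import csv

# latin-1 matches the encoding used in the question
with open('spam.csv', newline='', encoding='latin-1') as f:
    csvread = csv.reader(f)
    raw_data = list(csvread)

data = []
for i in raw_data:
    # csv.reader has already split the line on commas;
    # i[0] is the text before the first comma, split here on tabs
    i = i[0].split("\t")
    data.append(i)

final_data = pd.DataFrame(data)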
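You can specify the encoding, but the fields in this file contain commas themselves, so when pandas reads it normally it splits the data on ",". That is why you are getting the error.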
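If the file really is tab-delimited, as the split("\t") above suggests, it may be simpler to let pandas split on tabs directly so the commas inside the messages are left alone. This is only a sketch under that assumption: the column names "label" and "message" and the helper read_tab_separated are made up for illustration, and the sample rows stand in for the real spam.csv.

```python
import csv
import io

import pandas as pd


def read_tab_separated(source, encoding="latin-1"):
    """Read a headerless, tab-delimited file into a two-column DataFrame."""
    return pd.read_csv(
        source,
        sep="\t",                    # split on tabs, so commas stay inside the text
        header=None,                 # assumption: the file has no header row
        names=["label", "message"],  # hypothetical column names
        encoding=encoding,
        quoting=csv.QUOTE_NONE,      # treat quote characters as ordinary text
    )


# Made-up sample rows standing in for spam.csv
sample = io.StringIO(
    "ham\tOk lar, joking with you\n"
    "spam\tFree entry, text WIN to claim, now\n"
)
df = read_tab_separated(sample)
print(df.shape)
```

For the real file you would call read_tab_separated('spam.csv'). If the file turns out to be comma-delimited after all, this will put each whole line into one column, so check df.head() before going further.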