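When trying to plot outliers in a different color on a scatter plot, I get the following error:
TypeError: Cannot cast array data from dtype('U1') to dtype('float64') according to the rule 'safe'
My code: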
import statsmodels.api as sm
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
CRP = pd.read_csv('CarsProp.csv')
priceMean = CRP.price.mean()
priceStd = CRP.price.std()
CRP['isOutlierPrice'] = np.nan
testColumn1 = abs(CRP.price - priceMean) > 2*priceStd
for i, value in enumerate(testColumn1):
    if value == True:
        CRP['isOutlierPrice'][i] = 1
mileageMean = CRP.mileage.mean()
mileageStd = CRP.mileage.mean()
CRP['isOutlierMileage'] = np.nan
testColumn2 = abs(CRP.mileage - mileageMean) > 2*priceStd
for i, value in enumerate(testColumn2):
    if value == True:
        CRP['isOutlierMileage'][i] = 1
outlierPmsJoint = ((CRP['isOutlierPrice'] == 1) | (CRP['isOutlierMileage'] == 1))
colorChoiceDict = {True: (1.0, 0.55, 0.0, 1.0),
                   False: (0.11, 0.65, 0.72, 0.1)}
colorCol = [colorChoiceDict[val] for val in outlierPmsJoint]
PriceFloat = [float(val) for val in CRP.price]
MileageFloat = [float(val) for val in CRP.mileage]
plt.figure()
plt.scatter(PriceFloat, MileageFloat, c = colorCol, linewidth='0')
plt.set_title('Price vs. Mileage with outliers')
Does anyone know what the problem is and how to fix it? Thanks.
Answer 0 (score: 0)
The problem was in the scatter plot line. It should be:
plt.scatter(PriceFloat, MileageFloat, c = colorCol, linewidth=0)
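I had put the linewidth argument in quotation marks (linewidth='0' instead of linewidth=0), which is why no plot was produced. The error message is quite misleading, though: it never mentions linewidth. Next time I debug something like this, I will rely less on the error message and check the argument types directly.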
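For anyone wondering where dtype('U1') in the message comes from, here is a minimal sketch using only NumPy. It assumes (this is not confirmed in the thread) that matplotlib converts the linewidth value to a float array internally using "safe" casting; the exact behavior may differ between matplotlib versions. The variable names are made up for illustration.

```python
import numpy as np

# The one-character string '0' becomes a NumPy unicode array of width 1,
# which is the dtype('U1') named in the error message.
as_string = np.asarray('0')

# Under the "safe" casting rule, a string dtype may not be cast to float64.
string_ok = np.can_cast(as_string.dtype, np.float64, casting='safe')

# The integer 0 has an integer dtype, which can be cast safely to float64.
as_number = np.asarray(0)
number_ok = np.can_cast(as_number.dtype, np.float64, casting='safe')

print(as_string.dtype, string_ok)
print(as_number.dtype, number_ok)
```

So the 'U1' in the traceback is a hint that a one-character string was passed somewhere a number was expected, which here was linewidth='0'.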