I am trying to figure out why I am getting this error. I have searched a lot but could not find anything that resolves it.
Running the code below gives an error:
import numpy as np
import warnings
from collections import Counter
import pandas as pd

def k_nearest_neighbors(data, predict, k=3):
    if len(data) >= k:
        warnings.warn('K is set to a value less than total voting groups!')
    distances = []
    for group in data:
        for features in data[group]:
            euclidean_distance = np.linalg.norm(np.array(features) - np.array(predict))
            distances.append([euclidean_distance, group])
    votes = [i[1] for i in sorted(distances)[:k]]
    vote_result = Counter(votes).most_common(1)[0][0]
    return vote_result

df = pd.read_csv("data.txt")
df.replace('?', -99999, inplace=True)
df.drop(['id'], 1, inplace=True)
full_data = df.astype(float).values.tolist()
print(full_data)
Traceback (most recent call last):
  File "E:\Jazab\Machine Learning\Lec18(Testing K Neatest Nerighbors Classifier)\Lec18(Testing K Neatest Nerighbors Classifier)\Lec18_Testing_K_Neatest_Nerighbors_Classifier_.py", line 25, in <module>
    full_data = df.astype(float).values.tolist()
  File "C:\Python27\lib\site-packages\pandas\util\_decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 3299, in astype
    **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3224, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3091, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 471, in astype
    **kwargs)
  File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 521, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "C:\Python27\lib\site-packages\pandas\core\dtypes\cast.py", line 636, in astype_nansafe
    return arr.astype(dtype)
ValueError: invalid literal for float(): 3) <-----Reappears in Group 8 as:
Press any key to continue . . .
If I remove the line that raises the error, the program runs fine.
What should I do?
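Answer 0 (score: 1)
There is bad data in the file (the value "3)"), so to_numeric has to be used together with apply, because every column needs to be processed. Non-numeric values are converted to NaN, and fillna then replaces them with a scalar such as 0: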
full_data = df.apply(pd.to_numeric, errors='coerce').fillna(0).values.tolist()
Sample:
df = pd.DataFrame({'A':[1,2,7], 'B':['3)',4,5]})
print (df)
   A   B
0  1  3)
1  2   4
2  7   5
full_data = df.apply(pd.to_numeric, errors='coerce').fillna(0).values.tolist()
print (full_data)
[[1.0, 0.0], [2.0, 4.0], [7.0, 5.0]]
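Answer 1 (score: 0)
It looks like an entry in your CSV file contains "3)", and pandas complains because the ")" prevents it from converting that value to a float.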
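
To see which entries are actually causing the problem before replacing them, something along these lines might help. It is only a sketch: the small DataFrame is a made-up stand-in for the contents of data.txt, and the names converted, bad_mask and bad_cells are illustrative, not from the original post.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents (assumption, not the real data.txt).
df = pd.DataFrame({'A': [1, 2, 7], 'B': ['3)', 4, 5]})

# Coerce every column; cells that cannot be parsed as numbers become NaN.
converted = df.apply(pd.to_numeric, errors='coerce')

# A cell is "bad" if it held a value originally but is NaN after coercion.
bad_mask = converted.isna() & df.notna()

# Flatten to (row, column) pairs and keep only the flagged ones.
stacked = bad_mask.stack()
bad_cells = [(row, col, df.at[row, col]) for row, col in stacked[stacked].index]

# Each entry is (row label, column name, original unparseable value).
print(bad_cells)
```

Once the offending cells are known, you can decide whether to fix them in the file, drop those rows, or fill them with a scalar as in the answer above.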