I am trying to read my car sales data and load it into a numpy array, but it does not work. The original post included an image of the data.
import numpy as np
import pandas as pd
for i in range(2,34):
    data = pd.read_csv('Book2.csv')[i].values
    data.shape
    print(data)
Error message:
Traceback (most recent call last):
File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2525, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 2
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\Files\python\neutral_network\2.py", line 5, in <module>
data = pd.read_csv('Book2.csv')[i].values
File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
return self._getitem_column(key)
File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
values = self._data.get(item)
File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals.py", line 3843, in get
loc = self.items.get_loc(item)
File "C:\Users\ThinkPad\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 2
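Answer 0: (score: 0)
The error you are getting is caused by the index i on line 5: indexing a DataFrame with [i] looks up a column labelled i, and your file has no column labelled 2.
A better way to convert the whole CSV to a numpy ndarray is the following.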
data = pd.read_csv('Book2.csv')
# as_matrix() was removed in pandas 1.0; on newer versions use data.to_numpy()
numpyMatrix = data.as_matrix()
You can also use data.values to convert to a numpy ndarray, but if the columns have mixed types the element type will be object.
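To illustrate the point about element types, here is a minimal sketch. The small DataFrame is a made-up stand-in for Book2.csv (its column names and values are assumptions), and to_numpy() assumes pandas 0.24 or newer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Book2.csv: one text column and two numeric ones.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "price": [17.30, 18.00, 30.33],
    "units": [18182, 15030, 11312],
})

# Mixed text and numbers: the array falls back to dtype object.
mixed = df.to_numpy()

# Numeric columns only: the array gets a proper numeric dtype.
numeric = df[["price", "units"]].to_numpy()

print(mixed.dtype, numeric.dtype)
```

Selecting only the numeric columns before converting is usually what you want if the array is going into numerical code.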
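Answer 1: (score: 0)
As Prakash said, the problem is the index variable i on line 5. read_csv returns a pandas DataFrame, and indexing a DataFrame with [i] looks up a column labelled i rather than the column at position i, so pandas does not know what to do with the value 2.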
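A small sketch of the difference between label lookup and positional lookup may help. The DataFrame here is invented for illustration (only the column names are borrowed from the question), and to_numpy() assumes pandas 0.24 or newer.

```python
import pandas as pd

# Hypothetical data; the real file has more columns and rows.
df = pd.DataFrame({"Date": [1, 2], "h6mi": [17.30, 18.00], "h6shv": [18182, 15030]})

raised = False
try:
    df[2]  # looks for a column *named* 2, which does not exist
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# iloc selects by position: all rows, column at position 2.
third_column = df.iloc[:, 2].to_numpy()
print(third_column)
```

If you really do want to walk over columns by position, iloc is the tool for that, although converting the whole frame at once is usually simpler.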
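There are two other basic problems. First, you re-read the file and reassign data on every pass through the loop, so even if the code worked as intended you would end up with at most one column of data. Second, read_csv does not interpret your data correctly. The problem is the comma inside the second field, which pandas initially treats as a delimiter, so you have to tell it to ignore commas inside quotes. I found that the following works (on a subset of your data):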
In [35]: data2=pd.read_csv("Book2.csv", skipinitialspace=True, quotechar='"')
In [36]: data2
Out[36]:
Date H6sv h6mi h6shv
0 1 26, 368 17.30 18182
1 2 24, 402 18.00 15030
2 3 24, 451 30.33 11312
3 4 26, 528 60.52 9730
Then drop the column you do not want:
In [55]: data2.drop(columns="Date")
Out[55]:
H6sv h6mi h6shv
0 26, 368 17.30 18182
1 24, 402 18.00 15030
2 24, 451 30.33 11312
3 26, 528 60.52 9730
Yes, it took me 55 tries to get what I wanted...
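Putting the pieces together, here is a rough sketch of the whole path from file to numpy array. The inline CSV text is a guess at what the raw file looks like, based only on the output above, and treating "26, 368" as the number 26368 is an assumption about what that column means. to_numpy() assumes pandas 0.24 or newer.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the layout shown above; the real Book2.csv may differ.
csv_text = '''Date, H6sv, h6mi, h6shv
1, "26, 368", 17.30, 18182
2, "24, 402", 18.00, 15030
3, "24, 451", 30.33, 11312
4, "26, 528", 60.52, 9730
'''

# Read the file once, outside any loop.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True, quotechar='"')

# Assumption: "26, 368" is the number 26368 written with a thousands separator.
df["H6sv"] = df["H6sv"].str.replace(", ", "", regex=False).astype(int)

# Drop the Date column and convert the remaining columns to one float array.
data = df.drop(columns="Date").to_numpy(dtype=float)
print(data.shape)
```

With the real file you would pass 'Book2.csv' to read_csv instead of the StringIO object.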