Has set_index changed significantly in the latest pandas release (0.8)? I can't get it to work as expected.
My first attempt was to set the index on 'id':
ipdb> merged2['id']
16 130809
25 130687
32 130686
9 41736
22 131913
7 130691
33 129993
13 130680
28 134295
29 130708
ipdb> merged2.set_index('id')
*** KeyError: 0
ipdb> [type(i) for i in merged2['id']]
[<type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>]
The current index is int:
ipdb> merged2.index
Int64Index([16, 25, 32, 9, 22, 7, 33, 13, 28, 29])
ipdb> [type(i) for i in merged2.index]
[<type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>]
As a workaround, I tried building a new index:
ndx=range(len(merged2))
[type(i) for i in ndx]
[<type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>]
ipdb> merged2.set_index(ndx)
*** KeyError: 'no item named 0'
Finally, mapping the 'id' column to int worked:
merged2['id'] = map(lambda x: int(x), merged2['id'])
merged2.set_index('id')
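The conversion above can also be written as a single vectorized cast. This is only a minimal sketch, assuming a recent pandas release rather than 0.8; the small `merged2` frame and its `value` column are hypothetical stand-ins built from a few of the ids shown earlier, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for merged2: a float-valued 'id' column
# on top of an integer index, as in the session above.
merged2 = pd.DataFrame(
    {"id": [130809.0, 130687.0, 130686.0, 41736.0],
     "value": [1.0, 2.0, 3.0, 4.0]},  # 'value' is an invented column
    index=[16, 25, 32, 9],
)

# Cast the whole column at once instead of mapping int() over each element.
merged2["id"] = merged2["id"].astype("int64")

# set_index returns a new DataFrame, so keep the result.
indexed = merged2.set_index("id")
print(indexed.index.tolist())
```

Note that `astype` to an integer type fails if the column contains NaN, so this only applies when every id is present.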
Any ideas about what I'm doing wrong?
Answer 0 (score: 1)
It seems to work for me on 0.8.1dev. Could you post the stack trace and/or what merged2 looks like? Are you sure you're still using pandas 0.8?
In [50]: import numpy as np; import pandas as pd; from pandas import DataFrame
In [51]: idx = pd.Index([16, 25, 32, 9, 22, 7, 33, 13, 28, 29])
In [52]: idx
Out[52]: Int64Index([16, 25, 32, 9, 22, 7, 33, 13, 28, 29])
In [53]: df = DataFrame(np.random.randn(len(idx), 3), idx, ['id', 1, 2])
In [54]: df
Out[54]:
          id         1         2
16  0.351188  2.082303 -0.143037
25  0.633243 -1.731306  0.749934
32 -0.337893 -0.264249 -0.549856
9  -0.728056  0.786955  1.103877
22  1.131559 -0.255439 -0.397913
7  -1.384519  0.397626 -0.421481
33  1.356455  2.863659 -2.060498
13 -0.355786 -0.051383 -0.609486
28 -0.056607  0.767800  1.433946
29 -0.288202 -0.437992  0.843746
In [55]: df.set_index('id')
Out[55]:
                  1         2
id
 0.351188  2.082303 -0.143037
 0.633243 -1.731306  0.749934
-0.337893 -0.264249 -0.549856
-0.728056  0.786955  1.103877
 1.131559 -0.255439 -0.397913
-1.384519  0.397626 -0.421481
 1.356455  2.863659 -2.060498
-0.355786 -0.051383 -0.609486
-0.056607  0.767800  1.433946
-0.288202 -0.437992  0.843746
In [56]: pd.__version__
Out[56]: '0.8.1.dev-e2633d4'
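As for the `set_index(ndx)` attempt in the question: as far as I can tell, a plain list passed to `set_index` is read as a list of column labels, which would explain the "no item named 0" error. Below is a rough sketch of that behavior and of one way to get a positional index instead, assuming a recent pandas release (not verified on 0.8); the frame `df` and its `x` column are hypothetical.

```python
import pandas as pd

# Hypothetical frame with an integer index and a float 'id' column.
df = pd.DataFrame(
    {"id": [130809.0, 130687.0, 130686.0], "x": [1, 2, 3]},
    index=[16, 25, 32],
)

# A list of ints is looked up as column labels; no column is named 0, 1 or 2,
# so this raises KeyError.
try:
    df.set_index(list(range(len(df))))
    raised = False
except KeyError:
    raised = True
print(raised)

# To replace the index with positions 0..n-1, drop the old index instead.
by_position = df.reset_index(drop=True)
print(by_position.index.tolist())
```

Assigning directly with `df.index = range(len(df))` should have the same effect, but it modifies `df` in place.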