I have a pandas Series of NumPy arrays:
import pandas as pd
import numpy as np
s = pd.Series({10: np.array([[0.72260683, 0.27739317, 0. ],
[0.7187053 , 0.2812947 , 0. ],
[0.71435467, 0.28564533, 1. ],
[0.3268072 , 0.6731928 , 0. ],
[0.31941951, 0.68058049, 1. ],
[0.31260015, 0.68739985, 0. ]]),
20: np.array([[0.7022099 , 0.2977901 , 0. ],
[0.6983866 , 0.3016134 , 0. ],
[0.69411673, 0.30588327, 1. ],
[0.33857735, 0.66142265, 0. ],
[0.33244109, 0.66755891, 1. ],
[0.32675582, 0.67324418, 0. ]]),
38: np.array([[0.68811957, 0.31188043, 0. ],
[0.68425783, 0.31574217, 0. ],
[0.67994496, 0.32005504, 1. ],
[0.34872593, 0.65127407, 0. ],
[0.34276171, 0.65723829, 1. ],
[0.33722803, 0.66277197, 0. ]])}
)
and an index array np.array([1, 4, 1])
that indicates which row should be selected from each successive array. The expected output would look like this:
pd.Series({10: np.array([[0.7187053 , 0.2812947 , 0. ]]),
20: np.array([[0.33244109, 0.66755891, 1. ]]),
38: np.array([[0.68425783, 0.31574217, 0. ]])}
)
How can I do this? And what would change if I wanted to take the third element of each resulting array, to get the following Series?
pd.Series({10: 0, 20: 1, 38: 0})
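Answer 0 (score: 3)
If possible (that is, if all the 2D arrays have the same shape), convert the Series of 2D arrays into a single 3D array and index it: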
a = np.array([1, 4, 1])
# Stack into one 3D array of shape (3, 6, 3), then take element 2 of row a[i] from array i
b = np.array(s.tolist())[np.arange(len(s)), a, 2]
print(b)
[0. 1. 0.]
c = pd.Series(b, index=s.index)
print(c)
10 0.0
20 1.0
38 0.0
dtype: float64
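To make the indexing above easier to follow, here is a minimal sketch on made-up data. The names `s_demo`, `a_demo`, `stacked`, `picked`, and `looped` are illustrative and do not come from the question. It shows that passing `np.arange(len(s))` and the index array together pairs them element by element, so position `i` of the result is taken from array `i`, row `a[i]`, column 2.

```python
import numpy as np
import pandas as pd

# Illustrative data (not the values from the question): three 4x3 arrays.
s_demo = pd.Series({10: np.arange(12).reshape(4, 3),
                    20: np.arange(12, 24).reshape(4, 3),
                    38: np.arange(24, 36).reshape(4, 3)})
a_demo = np.array([1, 3, 1])

# Stacking works only because every array has the same shape.
stacked = np.array(s_demo.tolist())          # shape (3, 4, 3)

# Integer-array indexing pairs the two index arrays element by element.
picked = stacked[np.arange(len(s_demo)), a_demo, 2]

# Equivalent explicit loop: element i comes from stacked[i, a_demo[i], 2].
looped = np.array([stacked[i, a_demo[i], 2] for i in range(len(s_demo))])

print(picked)   # [ 5 23 29]
```

The loop version is slower on large inputs but may be easier to read when checking that the vectorized form selects what you expect.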
If you want to select whole rows by the index array:
b1 = np.array(s.tolist())[np.arange(len(s)), a]
print(b1)
[[0.7187053 0.2812947 0. ]
[0.33244109 0.66755891 1. ]
[0.68425783 0.31574217 0. ]]
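The 3D conversion requires every array to have the same shape. If that does not hold, or if you want the result back as a Series of 1x3 arrays as in the question's expected output, a plain comprehension is one possible fallback. This is a sketch under those assumptions, using made-up data; the names `s_ragged`, `a_ragged`, `rows`, and `third` are illustrative.

```python
import numpy as np
import pandas as pd

# Illustrative data: arrays with different numbers of rows, so they cannot
# be stacked into one regular 3D array.
s_ragged = pd.Series({10: np.arange(6).reshape(2, 3),
                      20: np.arange(6, 21).reshape(5, 3),
                      38: np.arange(21, 30).reshape(3, 3)})
a_ragged = np.array([1, 4, 1])

# Pair each (key, array) with its row index; slicing with i:i + 1 keeps
# the selected row as a 2D array of shape (1, 3).
rows = pd.Series({k: arr[i:i + 1]
                  for (k, arr), i in zip(s_ragged.items(), a_ragged)})

# Third element of each selected row, as a flat Series.
third = pd.Series([arr[i, 2] for arr, i in zip(s_ragged, a_ragged)],
                  index=s_ragged.index)

print(third.tolist())   # [5, 20, 26]
```

This runs a Python-level loop over the Series, so for many equally shaped arrays the 3D indexing shown in the answer is likely to be faster.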