I have a pandas Series in which each element is a list of indices:
series_example = pd.Series([[1, 3, 2], [1, 2]])
I also have an array whose values are associated with each index:
arr_example = np.array([3., 0.5, 0.25, 0.1])
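I want to create a new Series containing, for each row, the cumulative sums of the array elements selected by the indices in that row of the input Series. For the example above, the output Series would be: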
0 [0.5, 0.6, 0.85]
1 [0.5, 0.75]
dtype: object
A non-vectorized way to do this is:
def non_vector_transform(series, array):
    # Build one list of running totals per row of the input Series
    rows = []
    for element_list in series:
        acum = 0
        row = []
        for element in element_list:
            # Add the array value stored at this index to the running total
            acum += array[element]
            row.append(acum)
        rows.append(row)
    # Use the input's own length and index, not the global series_example
    return pd.Series(rows, index=series.index, dtype=object)
I would like to do this in a vectorized way. Can any vectorization wizard help me out here?
Answer 0 (score: 2)
import numpy as np
import pandas as pd
series_example = pd.Series([[1, 3, 2], [1, 2]])
arr_example = np.array([3., 0.5, 0.25, 0.1])
result = series_example.apply(lambda x: np.cumsum(arr_example[x]))
print(result)
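To see what the lambda does for a single row, here is a small sketch that splits it into its two steps. The names `row`, `picked` and `running` are only for illustration and do not appear in the code above.

```python
import numpy as np

arr_example = np.array([3., 0.5, 0.25, 0.1])
row = [1, 3, 2]

# Indexing with a list of integers picks those positions, in the given order
picked = arr_example[row]

# np.cumsum then turns the picked values into running totals
running = np.cumsum(picked)
print(picked, running)
```

`apply` simply repeats these two steps once per row, so the loop over rows still runs in Python.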
Or, if you prefer a for loop:
import numpy as np
import pandas as pd
series_example = pd.Series([[1, 3, 2], [1, 2]])
arr_example = np.array([3., 0.5, 0.25, 0.1])
# Copy only if you do not want to overwrite the original series
result = series_example.copy()
# Series.iteritems() was removed in pandas 2.0; items() is the replacement
for i, x in result.items():
    result[i] = np.cumsum(arr_example[x])
print(result)
Output:
0 [0.5, 0.6, 0.85]
1 [0.5, 0.75]
dtype: object
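Both versions above still call `np.cumsum` once per row. If you want the arithmetic itself done in a single NumPy pass, one possible approach is to flatten all index lists, take one cumulative sum over the flat array, and subtract the total accumulated before each row starts. The sketch below is untested for performance, and the function name `flat_cumsum_transform` is made up for this example. Flattening a Series of lists still needs a Python-level pass over the rows, so any speedup depends on your data.

```python
import numpy as np
import pandas as pd


def flat_cumsum_transform(series, array):
    """Per-row cumulative sums of array[indices], computed on one flat array."""
    lengths = np.fromiter((len(row) for row in series),
                          dtype=np.intp, count=len(series))
    if lengths.sum() == 0:
        # No indices at all: every row (if any) maps to an empty list
        return pd.Series([[] for _ in lengths], index=series.index, dtype=object)

    # All index lists joined end to end
    flat_idx = np.concatenate([np.asarray(row, dtype=np.intp) for row in series])
    running = np.cumsum(array[flat_idx])

    # Position in the flat array where each row starts
    starts = np.cumsum(lengths) - lengths
    # Running total accumulated before each row starts (0 for the first row)
    before = np.concatenate(([0], running))[starts]
    per_row = running - np.repeat(before, lengths)

    # Cut the flat result back into one piece per row
    pieces = np.split(per_row, starts[1:])
    return pd.Series([p.tolist() for p in pieces],
                     index=series.index, dtype=object)


series_example = pd.Series([[1, 3, 2], [1, 2]])
arr_example = np.array([3., 0.5, 0.25, 0.1])
result = flat_cumsum_transform(series_example, arr_example)
print(result)
```

Because each row's values are obtained by subtracting a prefix total, they can differ from the per-row `np.cumsum` results by floating-point rounding, so compare with `np.allclose` rather than `==`.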