Question

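I need to use the position within a time series index inside a lambda function, because the transformation the lambda performs depends on that position. This is similar to the question "Can I use index information inside the map function?", but with a pandas DataFrame indexed by DateTime.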

The equation I want the lambda function to compute is:

(position of the value in the time series index) x (1 / length of the time series) + value

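The purpose of this function is to add a linear trend to the time series. The output I expect is an increase of +1 at the end of the time series relative to the first time step.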
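As a rough illustration of the formula above, here is a minimal pure-Python sketch on a toy series. The list `values` and the names `trended` and `last_offset` are hypothetical and only serve the example.

```python
# Hypothetical toy values standing in for one column of the time series.
values = [10.0, 10.0, 10.0, 10.0]
n = len(values)

# position * (1 / length) + value, for each position in the series
trended = [pos * (1 / n) + val for pos, val in enumerate(values)]

# Offset of the last time step relative to the first one.
last_offset = trended[-1] - trended[0]
```

With the formula exactly as written, the last step gains (n - 1) / n rather than exactly 1, because positions run from 0 to n - 1. If an increase of exactly +1 is required, dividing by (n - 1) instead of n would be one way to get it.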

My idea so far has been to use a combination of enumerate and the get_loc function, for example:

dates = pd.date_range(start='2018-10-01', end='2019-09-30', freq='D')
df = pd.DataFrame(np.random.randint(0,100,size=(365, 4)), columns=list('ABCD'), index=dates)

a = df['A']
test = map(lambda (idx, val): df.index.get_loc(idx) * (1/len(df.index)) + val, enumerate(a))

I get the following error message:

File "<ipython-input-6-8fb927ed0ecd>", line 8
test = map(lambda (idx, val): df.index.get_loc(idx) * (1/len(df.index)) + val, enumerate(a))
                  ^
SyntaxError: invalid syntax

Answer 1

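IIUC, you can first compute, for each row, its position in the time series index x (1 / length of the time series), and then add the values in df: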

import pandas as pd
import numpy as np

dates = pd.date_range(start='2018-10-01', periods=365)

df = pd.DataFrame(np.random.randint(0,100,size=(365, 4)),
                  columns=list('ABCD'), index=dates)
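# The index of df holds datetimes, so build the integer positions
# 0..len(df)-1 separately and scale them by 1/len(df)
x = np.arange(len(df)) * 1/len(df)
# x has shape (365,), which does not broadcast against the (365, 4)
# values, so repeat it once per column and transpose to shape (365, 4)
res = np.array([x]*4).T + df.values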
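The SyntaxError in the question most likely comes from running Python 2 style code under Python 3, which no longer allows tuple parameters such as `lambda (idx, val): ...`. The sketch below shows one possible Python 3 form of the lambda, plus a variant of the approach above that keeps the DatetimeIndex on the result. The names `pair`, `n` and `trend` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start='2018-10-01', periods=365)
df = pd.DataFrame(np.random.randint(0, 100, size=(365, 4)),
                  columns=list('ABCD'), index=dates)

n = len(df)

# Python 3 lambdas cannot unpack a tuple parameter, so index the pair.
# enumerate already yields the integer position, so get_loc is not needed.
a = df['A']
test = list(map(lambda pair: pair[0] * (1 / n) + pair[1], enumerate(a)))

# Vectorised variant: a Series sharing df's index, added row-wise to
# every column, so the result is still a DataFrame indexed by date.
trend = pd.Series(np.arange(n) / n, index=df.index)
res = df.add(trend, axis=0)
```

Here `res['A']` should match `test` up to floating point rounding, and `res` keeps the columns and dates of `df`, whereas adding to `df.values` returns a plain NumPy array.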

How to use a time series index in a lambda function