I have a dataset where each ID has a date-time and a value as columns. I'm doing some calculations on it, but I'm running into trouble with a recursive function.
Here's what the dataset looks like:
Date-Time    Volume  ID  Load
10/22/2019   3862    10
10/23/2019   3800    10
10/24/2019   3700    10
10/25/2019   5000    10  Yes
10/26/2019   4900    10
10/27/2019   4800    10
10/22/2019   3862    11
10/23/2019   3800    11
10/24/2019   3700    11
10/25/2019   5000    11  Yes
10/26/2019   4900    11
10/27/2019   4800    11
I loop over the IDs in another function and call it from there.
Here's what I tried:
import pandas as pd

curr_load = 0

def Load_number(row):
    # Bump the running counter whenever a row is flagged as a load
    global curr_load
    if row['Load'] == 'Yes':
        curr_load = curr_load + 1
    return curr_load

def calculations(data):
    data['Load_number'] = data.apply(Load_number, axis=1)
    return data

ids = data['ID'].unique()
newdata = pd.DataFrame()
for id in ids:
    curr_load = 0  # reset the counter for each ID
    subset = data.loc[data['ID'] == id]
    newdata = pd.concat([newdata, calculations(subset)])
The required output is:
Date-Time    Volume  ID   Load  Load_number
10/22/2019   3862    100        0
10/23/2019   3800    100        0
10/24/2019   3700    100        0
10/25/2019   5000    100  Yes   1
10/26/2019   4900    100        1
10/27/2019   4800    100        1
10/28/2019   4700    100        1
10/22/2019   3862    111        0
10/23/2019   3800    111        0
10/24/2019   3700    111        0
10/25/2019   5000    111  Yes   1
10/26/2019   4900    111        1
10/27/2019   5800    111  Yes   2
10/28/2019   5500    111        2
10/29/2019   50000   111        2
And for the dates:
Date-Time    Volume  ID  Load  LoadDate
10/22/2019   3862    10  None  0
10/23/2019   3800    10  None  0
10/24/2019   3700    10  None  0
10/25/2019   5000    10  Yes   10/25/2019
10/26/2019   4900    10  None  10/25/2019
10/27/2019   4800    10  None  10/25/2019
10/22/2019   3862    11  None  0
10/23/2019   3800    11  None  0
10/24/2019   3700    11  None  0
10/25/2019   5000    11  Yes   10/25/2019
10/26/2019   4900    11  None  10/25/2019
10/27/2019   4800    11  None  10/25/2019
Answer 0 (score: 2)
This should do it:
# 1 where a load happened, 0 elsewhere; then a running total within each ID
df['Load Number'] = np.where(df.Load == 'Yes', 1, 0)
df['Load Number'] = df.groupby('ID')['Load Number'].cumsum()
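To spell out the mechanism: np.where turns the Load column into a 0/1 flag, and the grouped cumsum then counts how many 'Yes' rows have been seen so far within each ID. The same thing can be written as a one-liner if you prefer; a sketch, assuming the frame is named df as above:

# Per-ID running count of 'Yes' rows, without the intermediate 0/1 step
df['Load Number'] = (df.Load == 'Yes').astype(int).groupby(df['ID']).cumsum()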
(Edit) For the second question, you can use a similar approach:
# Take the date on 'Yes' rows, forward-fill it within each ID,
# and use 0 for rows before the first load
df['LoadDate'] = np.where(df.Load == 'Yes', df['Date-Time'], np.nan)
df['LoadDate'] = df.groupby('ID')['LoadDate'].ffill().fillna(0)
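One caveat: both cumsum and ffill assume the rows are already in chronological order within each ID. If that isn't guaranteed, sort first; a minimal sketch, assuming Date-Time parses with pandas' default format inference:

import pandas as pd

# Sort chronologically within each ID before counting/filling
df['Date-Time'] = pd.to_datetime(df['Date-Time'])
df = df.sort_values(['ID', 'Date-Time']).reset_index(drop=True)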
Output:
    Date-Time   Volume  ID  Load  Load Number  LoadDate
0   10/22/2019  3862    10  None  0            0
1   10/23/2019  3800    10  None  0            0
2   10/24/2019  3700    10  None  0            0
3   10/25/2019  5000    10  Yes   1            10/25/2019
4   10/26/2019  4900    10  None  1            10/25/2019
5   10/27/2019  4800    10  None  1            10/25/2019
6   10/22/2019  3862    11  None  0            0
7   10/23/2019  3800    11  None  0            0
8   10/24/2019  3700    11  None  0            0
9   10/25/2019  5000    11  Yes   1            10/25/2019
10  10/26/2019  4900    11  None  1            10/25/2019
11  10/27/2019  4800    11  None  1            10/25/2019
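For completeness, a self-contained sketch that rebuilds the first sample frame from the question (IDs 10 and 11) and runs both snippets end to end; the data literals are copied from the question, only the variable name df is assumed:

import numpy as np
import pandas as pd

# Rebuild the sample data from the question
df = pd.DataFrame({
    'Date-Time': ['10/22/2019', '10/23/2019', '10/24/2019',
                  '10/25/2019', '10/26/2019', '10/27/2019'] * 2,
    'Volume': [3862, 3800, 3700, 5000, 4900, 4800] * 2,
    'ID': [10] * 6 + [11] * 6,
    'Load': [None, None, None, 'Yes', None, None] * 2,
})

# Running count of 'Yes' loads per ID
df['Load Number'] = np.where(df.Load == 'Yes', 1, 0)
df['Load Number'] = df.groupby('ID')['Load Number'].cumsum()

# Date of the most recent load per ID; 0 before the first load
df['LoadDate'] = np.where(df.Load == 'Yes', df['Date-Time'], np.nan)
df['LoadDate'] = df.groupby('ID')['LoadDate'].ffill().fillna(0)

print(df)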