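I have a dataframe with paired columns (for example, CS with CS_Capacity, RD with RD_Capacity, and so on). One column holds the months, and each pair has its own separate starting value.
I want to compare CS with CS_Capacity for January. If CS_Capacity is greater than CS, I want to increase CS so that CS = CS_Capacity, and then decrease CSval by the same amount. Then I want to move on to the next month and do the same thing, until CSval = 0.
Sample data: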
import pandas as pd

data = [['2020-01-31', 3, 6, 7, 11], ['2020-02-29', 13, 11, 8, 13], ['2020-03-31', 22, 19, 8, 5], ['2020-04-30', 2, 3, 6, 4], ['2020-05-31', 19, 6, 4, 5],
        ['2020-06-30', 2, 14, 6, 8], ['2020-07-31', 5, 4, 3, 6], ['2020-08-31', 5, 11, 7, 19], ['2020-09-30', 2, 1, 4, 5], ['2020-10-31', 29, 16, 14, 10],
        ['2020-11-30', 2, 4, 6, 7], ['2020-12-31', 25, 20, 5, 3]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['StartDate', 'RD', 'RD_Capacity', 'CS', 'CS_Capacity'])
# Parse only the date column as datetime; the numeric columns stay integers
df['StartDate'] = pd.to_datetime(df['StartDate'])
df['CS_Capacity'] = df['CS_Capacity'].astype('int')
df['CS'] = df['CS'].astype('int')
df['RD_Capacity'] = df['RD_Capacity'].astype('int')
df['RD'] = df['RD'].astype('int')
CSval = 40
RDval = 22
pairs = [('CS', 'CS_Capacity', CSval), ('RD', 'RD_Capacity', RDval)]
I'm struggling to iterate through the pairs here to do the incrementing and decrementing:
for val1, val2, val3 in pairs:
    df['New Target' + val1] = df[val1]  # create new column to store new target, set equal to val1
    df['Value' + val1 + 'Original'] = val3  # starting value incremented row by row
    for index, row in df.iterrows():  # loop through rows
        if (df[val1][index] < df[val2][index]):  # for first row, if val1 is less than val2
            delta = df[val2] - df[val1]  # create a delta so you know how much to decrement val3
            df['New Target' + val1] = df[val2]  # set new target equal to val2
            val3 = val3 - delta  # decrement val3
            df['Delta' + val1] = delta
            df['Value' + val1] = val3  # decremented value
        if (df[val1][index] > df[val2][index]):  # for first row, if val2 is less than val1
            delta = df[val1] - df[val2]  # create a delta so you know how much to increment val3
            df['New Target' + val1] = df[val2]  # set new target equal to val2
            val3 = val3 + delta  # increment val3
            df['Delta' + val1] = delta
            df['Value' + val1] = val3  # incremented value
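While this code runs, I've run into some problems:
If CSval starts at 40 and the delta in January is 3 (-), then df['Value' + val1] should equal 37. Then df['Value' + val1 + 'Original'] in the next row should start at 37. Since the delta in the next row is 2 (+), df['Value' + val1] should equal 39.
Where am I going wrong? I expected val3 to be updated inside the if statements, so this should work. What am I missing?
Thanks, everyone!
Answer 0 (score: 1)
This seems much simpler than you're making it. Does this solution do what you want?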
df['CSval'] = -(df['CS_Capacity'] - df['CS']).cumsum() + CSval
mask = (df['CSval']>=0) & (df['CS_Capacity'] > df['CS'])
df.loc[mask, 'CS'] = df.loc[mask, 'CS_Capacity']
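To see why a single `cumsum` line can stand in for the row loop, it may help to look at the intermediate values. The sketch below rebuilds only the two CS columns from the sample data and prints the monthly gap and the running balance. The names `gap` and `balance` are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

# CS and CS_Capacity columns copied from the sample data in the question
cs = pd.Series([7, 8, 8, 6, 4, 6, 3, 7, 4, 14, 6, 5])
cs_capacity = pd.Series([11, 13, 5, 4, 5, 8, 6, 19, 5, 10, 7, 3])
CSval = 40

# Monthly gap: positive when capacity exceeds CS (the value is spent),
# negative when CS exceeds capacity (the value is handed back)
gap = cs_capacity - cs

# Running balance after each month: starting value minus all gaps so far
balance = CSval - gap.cumsum()

print(gap.tolist())      # [4, 5, -3, -2, 1, 2, 3, 12, 1, -4, 1, -2]
print(balance.tolist())  # [36, 31, 34, 36, 35, 33, 30, 18, 17, 21, 20, 22]
```

Each entry of `balance` depends only on the starting value and the gaps up to that month, which is why no explicit row-by-row state is needed.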
At the end of the iteration, CSval has a value of 22, and the dataframe looks like this:
     StartDate  RD  RD_Capacity  CS  CS_Capacity  CSval
 0  2020-01-31   3            6  11           11   36.0
 1  2020-02-29  13           11  13           13   31.0
 2  2020-03-31  22           19   8            5   34.0
 3  2020-04-30   2            3   6            4   36.0
 4  2020-05-31  19            6   5            5   35.0
 5  2020-06-30   2           14   8            8   33.0
 6  2020-07-31   5            4   6            6   30.0
 7  2020-08-31   5           11  19           19   18.0
 8  2020-09-30   2            1   5            5   17.0
 9  2020-10-31  29           16  14           10   21.0
10  2020-11-30   2            4   7            7   20.0
11  2020-12-31  25           20   5            3   22.0
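The question also defines a `pairs` list, so the same three lines could be applied to each pair in turn. The following is a minimal sketch under that assumption, not a definitive implementation: the helper name `apply_budget` and the `<name>val` column naming are made up here for illustration. It also sidesteps the pattern in the question's loop, where `df[val2] - df[val1]` is a whole column rather than one row's value, so every pass through `iterrows` overwrote entire columns instead of a single cell.

```python
import pandas as pd

data = [['2020-01-31', 3, 6, 7, 11], ['2020-02-29', 13, 11, 8, 13],
        ['2020-03-31', 22, 19, 8, 5], ['2020-04-30', 2, 3, 6, 4],
        ['2020-05-31', 19, 6, 4, 5], ['2020-06-30', 2, 14, 6, 8],
        ['2020-07-31', 5, 4, 3, 6], ['2020-08-31', 5, 11, 7, 19],
        ['2020-09-30', 2, 1, 4, 5], ['2020-10-31', 29, 16, 14, 10],
        ['2020-11-30', 2, 4, 6, 7], ['2020-12-31', 25, 20, 5, 3]]
df = pd.DataFrame(data, columns=['StartDate', 'RD', 'RD_Capacity', 'CS', 'CS_Capacity'])
df['StartDate'] = pd.to_datetime(df['StartDate'])


def apply_budget(df, col, cap_col, start):
    """Hypothetical helper: add a running '<col>val' balance column and
    raise `col` to `cap_col` wherever the balance is still non-negative."""
    balance_col = col + 'val'
    # Running balance: starting value minus the cumulative (capacity - current) gap
    df[balance_col] = start - (df[cap_col] - df[col]).cumsum()
    # Only touch rows where there is room to grow and the balance has not gone negative
    mask = (df[balance_col] >= 0) & (df[cap_col] > df[col])
    df.loc[mask, col] = df.loc[mask, cap_col]
    return df


pairs = [('CS', 'CS_Capacity', 40), ('RD', 'RD_Capacity', 22)]
for col, cap_col, start in pairs:
    apply_budget(df, col, cap_col, start)

print(df)
```

For the CS pair this gives the same CS and CSval columns as the table above. The RD columns are updated as well, which the table does not show because it comes from running the CS lines only.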
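This solution still computes CSval even when the value drops below 0, but it only changes the CS values while CSval is non-negative. If you like, you can clean up the dataframe afterwards by doing the following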