Comparing columns and incrementing values while iterating over rows in Python

Time: 2020-02-26 20:26:29

Tags: python pandas comparison increment itertools

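I have a data frame with paired columns (for example, CS goes with CS_Capacity, RD goes with RD_Capacity, and so on). I have one column containing the months, and each pair has its own separate starting value.

I want to compare CS with CS_Capacity for January; if CS_Capacity is greater than CS, I want to increment CS so that CS = CS_Capacity and then decrement CSval by the same amount. Then I want to move on to the next month and do the same thing, until CSval = 0.

Sample data: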

import pandas as pd

data = [['2020-01-31', 3, 6, 7, 11], ['2020-02-29', 13, 11, 8, 13], ['2020-03-31', 22, 19, 8, 5], ['2020-04-30', 2, 3, 6, 4], ['2020-05-31', 19, 6, 4, 5],
        ['2020-06-30', 2, 14, 6, 8], ['2020-07-31', 5, 4, 3, 6], ['2020-08-31', 5, 11, 7, 19], ['2020-09-30', 2, 1, 4, 5], ['2020-10-31', 29, 16, 14, 10],
        ['2020-11-30', 2, 4, 6, 7], ['2020-12-31', 25, 20, 5, 3]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['StartDate', 'RD', 'RD_Capacity', 'CS', 'CS_Capacity'])
# Parse only the date column; a dtype passed to the constructor applies to every column
df['StartDate'] = pd.to_datetime(df['StartDate'])
# Keep the numeric columns as integers
int_cols = ['RD', 'RD_Capacity', 'CS', 'CS_Capacity']
df[int_cols] = df[int_cols].astype('int')

CSval = 40
RDval = 22

pairs = [('CS', 'CS_Capacity', CSval), ('RD', 'RD_Capacity', RDval)]

I am struggling to work out how to loop through the rows here to do the incrementing and decrementing:


for val1, val2, val3 in pairs:
    df['New Target' + val1] = df[val1] #create new column to store new target- set equal to val1
    df['Value' + val1 + 'Original'] = val3 # starting value incremented row by row
    for index, row in df.iterrows(): # loop through rows
        if (df[val1][index] < df[val2][index]): # for first row, if val1 is less than val2
            delta = df[val2] - df[val1] # create a delta so you know how much to decrement val3
            df['New Target' + val1]= df[val2] # set new target equal to val2
            val3 = val3 - delta # decrement val3
            df['Delta' + val1] = delta
            df['Value' + val1] = val3 # decremented value
        if (df[val1][index] > df[val2][index]): # for first row, if val2 is less than val1
            delta = df[val1] - df[val2] # create a delta so you know how much to increment val3
            df['New Target' + val1]= df[val2] # set new target equal to val2
            val3 = val3 + delta # increment val3                        
            df['Delta' + val1] = delta
            df['Value' + val1] = val3 # incremented value

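Although this code runs, I am running into a few problems:

  1. df['Value' + val1] is not being incremented or decremented properly.
  2. I want the new val3 to carry forward to the next row, so I can track how each row cumulatively increments or decrements it.

So, if CSval starts at 40 and the delta in January is 3 (-), then df['Value' + val1] should equal 37. Then df['Value' + val1 + 'Original'] on the next row should start at 37. Since the delta on the next row is 2 (+), df['Value' + val1] should then equal 39.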
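The carry-forward described above can be sketched with a plain running total. This is only an illustration under stated assumptions: the deltas of 3 (-) and 2 (+) match the first two RD rows of the sample data, so those rows are used here together with the starting value of 40 from the example. The column names ValueRDOriginal and ValueRD follow the naming pattern in the loop above.

```python
import pandas as pd

# First two rows of the RD pair from the sample data (assumed to be the
# source of the deltas 3 and 2 in the worked example).
df = pd.DataFrame({"RD": [3, 13], "RD_Capacity": [6, 11]})
running = 40  # starting value used in the worked example

before, after = [], []
for value, capacity in zip(df["RD"], df["RD_Capacity"]):
    before.append(running)          # value this row starts from
    running -= capacity - value     # scalar delta for this row only
    after.append(running)           # value carried into the next row

df["ValueRDOriginal"] = before
df["ValueRD"] = after
print(df)
```

The key point is that the delta is a single number per row, and the running value is updated once per row and then reused by the next row.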

Current output: [image]

Desired output: [image]

Where am I going wrong? I expected val3 to be updated inside the if statements, so this should work. What am I missing?

Thanks, everyone!

1 answer:

Answer 0 (score: 1)

This seems much simpler than you are making it. Does this solution do what you need?

df['CSval'] = -(df['CS_Capacity'] - df['CS']).cumsum() + CSval
mask = (df['CSval']>=0) & (df['CS_Capacity'] > df['CS'])
df.loc[mask, 'CS'] = df.loc[mask, 'CS_Capacity']
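For reference, here is a self-contained sketch of the same idea applied to both pairs. The helper name apply_budget and the generated column names (CSval, RDval) are hypothetical and chosen only for illustration. It assumes, as above, that the running value is the starting value minus the cumulative sum of (capacity - value).

```python
import pandas as pd

data = [['2020-01-31', 3, 6, 7, 11], ['2020-02-29', 13, 11, 8, 13],
        ['2020-03-31', 22, 19, 8, 5], ['2020-04-30', 2, 3, 6, 4],
        ['2020-05-31', 19, 6, 4, 5], ['2020-06-30', 2, 14, 6, 8],
        ['2020-07-31', 5, 4, 3, 6], ['2020-08-31', 5, 11, 7, 19],
        ['2020-09-30', 2, 1, 4, 5], ['2020-10-31', 29, 16, 14, 10],
        ['2020-11-30', 2, 4, 6, 7], ['2020-12-31', 25, 20, 5, 3]]
df = pd.DataFrame(data, columns=['StartDate', 'RD', 'RD_Capacity', 'CS', 'CS_Capacity'])
df['StartDate'] = pd.to_datetime(df['StartDate'])


def apply_budget(frame, col, cap_col, start):
    """Hypothetical helper: running value plus capped update for one pair."""
    out = frame.copy()
    val_col = col + 'val'
    # Running value after each row: start minus the cumulative (capacity - value).
    out[val_col] = start - (out[cap_col] - out[col]).cumsum()
    # Raise the value to capacity only where the running value is still >= 0.
    mask = (out[val_col] >= 0) & (out[cap_col] > out[col])
    out.loc[mask, col] = out.loc[mask, cap_col]
    return out


pairs = [('CS', 'CS_Capacity', 40), ('RD', 'RD_Capacity', 22)]
result = df
for col, cap_col, start in pairs:
    result = apply_budget(result, col, cap_col, start)

print(result)
```

Returning a copy keeps the original df unchanged, which makes it easier to compare the before and after values.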

At the end, the last value in the CSval column is 22, and the data frame looks like this:

    StartDate  RD  RD_Capacity  CS  CS_Capacity  CSval
0  2020-01-31   3            6  11           11   36.0
1  2020-02-29  13           11  13           13   31.0
2  2020-03-31  22           19   8            5   34.0
3  2020-04-30   2            3   6            4   36.0
4  2020-05-31  19            6   5            5   35.0
5  2020-06-30   2           14   8            8   33.0
6  2020-07-31   5            4   6            6   30.0
7  2020-08-31   5           11  19           19   18.0
8  2020-09-30   2            1   5            5   17.0
9  2020-10-31  29           16  14           10   21.0
10 2020-11-30   2            4   7            7   20.0
11 2020-12-31  25           20   5            3   22.0

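This solution still computes CSval even after the value drops below 0, but it only changes the CS value while CSval is non-negative (the mask tests CSval >= 0). If you like, you can clean up the data frame afterwards by doing the following: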