Question

I have two data frames, one per date. For example, df1 is:

id       date      time      sum
abc   15/03/2020  01:00:00    15
abc   15/03/2020  02:00:00    25
abc   15/03/2020  04:00:00    10
abc   15/03/2020  04:30:00    5
abc   15/03/2020  05:00:00    20
xyz   15/03/2020  12:00:00    3
xyz   15/03/2020  03:00:00    20
xyz   15/03/2020  04:00:00    20
xyz   15/03/2020  05:00:00    50

and df2 is:

id        date      sum_last   high_last  low_last
abc    14/03/2020    10           10          5
xyz    14/03/2020     5            9          7

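I want to create a Flag column in df1 by comparing values in the sum column: if a row's sum is greater than the previous row's sum, the flag is 1, otherwise it is 0. The first row of each id (sum 15 for abc) should not end up compared against NaN; instead it is compared with sum_last from df2, which holds the same id for the earlier date (14/03/2020). The high and low columns follow from the flag. If the flag is 1, high takes the current row's sum and low takes the previous row's value. For example, 15 > 10 (10 is sum_last for abc), so the flag is 1, high is 15, and low is the previous value, i.e. sum_last, which is 10. If the flag is 0, high and low are carried over from the previous row. For example, for xyz, sum_last on 14/03/2020 is 5 and the first sum on 15/03/2020 is 3; since 3 < 5 the flag is 0, and high and low stay the same as in the previous row, i.e. 9 and 7 (high_last and low_last). So the output will be: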

id       date      time      sum   Flag   high  low
abc   15/03/2020  01:00:00    15    1      15    10  # flag is 1 because 10 (sum_last) < 15 (sum); high is now 15 and low is 10 (the previous value, i.e. sum_last)
abc   15/03/2020  02:00:00    25    1      25    15  # flag is 1, so high changes and so does low
abc   15/03/2020  04:00:00    10    0      25    15  # flag is 0, so high and low remain unchanged
abc   15/03/2020  04:30:00    5     0      25    15  # flag is 0, so no change in high and low
abc   15/03/2020  05:00:00    20    1      20    10  # flag is 1, so high changes and so does low
xyz   15/03/2020  12:00:00    3     0      9     7   # id changes; flag is 0 because 5 > 3, so high stays 9 and low stays 7
xyz   15/03/2020  03:00:00    20    1      20    3   # flag is 1: high = sum, low = previous sum
xyz   15/03/2020  04:00:00    20    0      20    3   # flag is 0: high and low are the same as in the previous row
xyz   15/03/2020  05:00:00    50    1      50    20  # flag is 1: high = 50 and low = 20

This is the code I am using for the Flag column:

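import pandas as pd

# map each '<col>_last' name in df2 to the matching column name in df1
cols = ['sum']
new = [x + '_last' for x in cols]
d = dict(zip(new, cols))
print(d)

# set id as the index on both frames
df1 = df1.set_index('id')
df2 = df2.set_index('id')

# shift within each id; the first NaN per id is replaced by the value from df2
df = df1.groupby('id')[cols].shift().fillna(df2.rename(columns=d)[cols])
print(df)
# flag is 1 where sum is greater than the previous value
df1 = pd.concat([df1, df1[cols].gt(df[cols]).astype(int).add_prefix('flag_')], axis=1)
print(df1)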

This gives me the Flag column, but I can't work out how to build the high and low columns. Can someone help me here? Thanks in advance.
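
One possible way to get high and low, sketched under the rules as stated above (flag 1: high = current sum, low = previous sum; flag 0: carry both forward within the id, starting from high_last and low_last in df2). The frames are rebuilt from the sample data so the snippet runs on its own; the names `last`, `prev` and `rising` are only illustrative. Note that a literal reading of the rule gives low = 5 for the abc 05:00:00 row (the previous sum), whereas the expected table shows 10, so that row may be a typo or follow a rule that is not spelled out here.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "id": ["abc"] * 5 + ["xyz"] * 4,
    "date": ["15/03/2020"] * 9,
    "time": ["01:00:00", "02:00:00", "04:00:00", "04:30:00", "05:00:00",
             "12:00:00", "03:00:00", "04:00:00", "05:00:00"],
    "sum": [15, 25, 10, 5, 20, 3, 20, 20, 50],
})
df2 = pd.DataFrame({
    "id": ["abc", "xyz"],
    "date": ["14/03/2020"] * 2,
    "sum_last": [10, 5],
    "high_last": [10, 9],
    "low_last": [5, 7],
})

# Previous-day values, looked up by id.
last = df2.set_index("id")

# Previous sum within each id; the first row of each id falls back to sum_last.
prev = df1.groupby("id")["sum"].shift()
prev = prev.fillna(df1["id"].map(last["sum_last"]))

df1["Flag"] = (df1["sum"] > prev).astype(int)
rising = df1["Flag"].eq(1)

# Where the flag is 1, high/low are set; elsewhere they are left as NaN
# and then carried forward within the same id.
high = df1["sum"].where(rising).groupby(df1["id"]).ffill()
low = prev.where(rising).groupby(df1["id"]).ffill()

# Rows before the first flag == 1 of an id take the values from df2.
df1["high"] = high.fillna(df1["id"].map(last["high_last"])).astype(int)
df1["low"] = low.fillna(df1["id"].map(last["low_last"])).astype(int)

print(df1)
```

This assumes every id in df1 also appears in df2 and that the rows are already in the intended order within each id; if not, the final `astype(int)` would fail on the remaining NaN values, or the shift would compare the wrong rows.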

Join and compare values of one df with the first value of another in pandas

0 answers: