如果两列满足一个条件,则返回其他列的计算值。熊猫 / 窗户

时间:2021-06-30 13:22:14

标签: python pandas dataframe

Refer Image

引用黄色突出显示的单元格:

如果 K = LDE,则在 J 列(LDE 行上方)中查找 FDE,在结果列返回(来自 LDE 的 D 减去来自 FDE 的 A)(即 223-307 = -84)

引用绿色突出显示的单元格:152-385 = -233 等等。

如何解决?

数据:

['03-01-2011', 523, 698, 284, 33, 416, 675, 300, 690, 314, '', '', 'FDM', ''] ['27-01-2011', 353, 1, 50, 547, 514, 957, 804, 490, 108, '', 'LDE', '', ''] ['28-01-2011', 307, 837, 656, 755, 792, 568, 119, 439, 943, 'FDE', '', '', ''] ['31-01-2011', 327, 409, 155, 358, 120, 401, 385, 965, 888, '', '', '', 'LDM'] ['01-02-2011', 686, 313, 714, 12, 140, 112, 589, 908, 605, '', '', 'FDM', ''] ['24-02-2011', 161, 846, 816, 223, 387, 566, 435, 567, 36, '', 'LDE', '', ''] ['25-02-2011', 889, 652, 190, 324, 947, 778, 575, 604, 314, 'FDE', '', '', ''] ['28-02-2011', 704, 33, 232, 630, 344, 796, 331, 409, 597, '', '', '', 'LDM'] ['01-03-2011', 592, 148, 974, 540, 848, 393, 505, 699, 315, '', '', 'FDM', ''] ['31-03-2011', 938, 768, 325, 756, 971, 644, 546, 238, 376, '', 'LDE', '', 'LDM'] ['01-04-2011', 385, 298, 654, 655, 2, 112, 960, 306, 477, 'FDE', '', 'FDM', ''] ['28-04-2011', 704, 516, 785, 152, 355, 348, 106, 611, 426, '', 'LDE', '', ''] ['29-04-2011', 753, 719, 776, 826, 756, 370, 660, 536, 903, 'FDE', '', '', 'LDM'] ['02-05-2011', 222, 28, 102, 363, 952, 860, 48, 976, 478, '', '', 'FDM', ''] ['26-05-2011', 361, 588, 866, 884, 809, 662, 801, 843, 668, '', 'LDE', '', '']

2 个答案:

答案 0 :(得分:0)

我找到了一个非常棘手的解决方案。

import numpy as np

input = torch.randn(3, 10)
result = np.zeros(input.size())
np_argmax = torch.argmax(input, dim=0, keepdim=True).numpy()
result = np.put_along_axis(result, np_argmax, 1, axis=0)
# go back to torch:
result = torch.from_numpy(result)

结果

enter image description here


起始数据

import pandas as pd

# define groups between two LDE
df['Group'] = (df['K'] == 'LDE').cumsum().shift(1, fill_value=0)

# custom function to perform your subtraction
def f(x):
    if x.loc[x['J'] == 'FDE', 'A'].size == 0:
        return None
    else:
        return x.loc[x['K'] == 'LDE', 'D'].iloc[0] - x.loc[x['J'] == 'FDE', 'A'].iloc[0]

# get list of numerical results
results = df.groupby('Group').apply(f).tolist()

# input the list into the specified LDE rows
df.loc[df['K'] == 'LDE', 'Results'] = results

答案 1 :(得分:0)

  • 丑陋但有效...
  • 在列 J
  • 中查找包含 FDE 的行
  • 对于每一行,找到你想要的其他行
  • 最后计算并返回它
df = pd.DataFrame([['03-01-2011', 523, 698, 284, 33, 416, 675, 300, 690, 314, '', '', 'FDM', ''], 
['27-01-2011', 353, 1, 50, 547, 514, 957, 804, 490, 108, '', 'LDE', '', ''] ,
['28-01-2011', 307, 837, 656, 755, 792, 568, 119, 439, 943, 'FDE', '', '', ''], 
['31-01-2011', 327, 409, 155, 358, 120, 401, 385, 965, 888, '', '', '', 'LDM'] ,
['01-02-2011', 686, 313, 714, 12, 140, 112, 589, 908, 605, '', '', 'FDM', ''] ,
['24-02-2011', 161, 846, 816, 223, 387, 566, 435, 567, 36, '', 'LDE', '', ''] ,
['25-02-2011', 889, 652, 190, 324, 947, 778, 575, 604, 314, 'FDE', '', '', ''] ,
['28-02-2011', 704, 33, 232, 630, 344, 796, 331, 409, 597, '', '', '', 'LDM'] ,
['01-03-2011', 592, 148, 974, 540, 848, 393, 505, 699, 315, '', '', 'FDM', ''] ,
['31-03-2011', 938, 768, 325, 756, 971, 644, 546, 238, 376, '', 'LDE', '', 'LDM'], 
['01-04-2011', 385, 298, 654, 655, 2, 112, 960, 306, 477, 'FDE', '', 'FDM', ''] ,
['28-04-2011', 704, 516, 785, 152, 355, 348, 106, 611, 426, '', 'LDE', '', ''] ,
['29-04-2011', 753, 719, 776, 826, 756, 370, 660, 536, 903, 'FDE', '', '', 'LDM'], 
['02-05-2011', 222, 28, 102, 363, 952, 860, 48, 976, 478, '', '', 'FDM', ''] ,
['26-05-2011', 361, 588, 866, 884, 809, 662, 801, 843, 668, '', 'LDE', '', '']], columns=["Date"]+list("ABCDEFGHIJKLM"))

def findandcalc(lde):
    # find last row from begining of DF, to place LDE was found that contains "FDE"
    fde = df.iloc[0:lde.name].loc[lambda d: d["J"].eq("FDE")].tail(1)
    # if row was found do calc and return it
    return np.nan if len(fde)==0 else lde["D"] - fde["A"].values[0]

df.loc[df["K"].eq("LDE"), "Result"] = df.loc[df["K"].eq("LDE")].apply(findandcalc, axis=1)

df

矢量化解决方案

  • 更干净....
  • 过滤到在所需列中有LDEFDE 的行
  • 测试FDE是否在前一行,然后执行简单计算
  • 将本系列加入到数据框以获得最终结果
rs = df.loc[df["K"].eq("LDE") | df["J"].eq("FDE")].assign(
    Result=lambda d: np.where(
        d["K"].eq("LDE") & d["J"].shift().eq("FDE"), d["D"] - d["A"].shift(), np.nan
    )
)["Result"]

df.join(rs)
<头>
日期 A B C D E F G H J K L M 结果
03-01-2011 523 698 284 33 416 675 300 690 314 FDM nan
27-01-2011 353 1 50 547 514 957 804 490 108 LDE nan
28-01-2011 307 837 656 755 792 568 119 439 943 FDE nan
31-01-2011 327 409 155 358 120 401 385 965 888 LDM nan
01-02-2011 686 313 714 12 140 112 589 908 605 FDM nan
24-02-2011 161 846 816 223 387 566 435 567 36 LDE -84
25-02-2011 889 652 190 324 947 778 575 604 314 FDE nan
28-02-2011 704 33 232 630 344 796 331 409 597 LDM nan
01-03-2011 592 148 974 540 848 393 505 699 315 FDM nan
31-03-2011 938 768 325 756 971 644 546 238 376 LDE LDM -133
01-04-2011 385 298 654 655 2 112 960 306 477 FDE FDM nan
28-04-2011 704 516 785 152 355 348 106 611 426 LDE -233
29-04-2011 753 719 776 826 756 370 660 536 903 FDE LDM nan
02-05-2011 222 28 102 363 952 860 48 976 478 FDM nan
26-05-2011 361 588 866 884 809 662 801 843 668 LDE 131