Question

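I have a data frame A with a column A['Income'], and another data frame B with the columns B['Income'] and B['category']. I need to compare A['Income'] against B['Income'] and create A['category'] so that, when A['Income'] <= B['Income'], A['category'] takes the corresponding B['category'] value. If A['Income'] < 1000, then A['category'] = 0.1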

A:
Income
1000
1234
3007
4569
7065
1456
2980
8990
900
489

B:
Income   category
1000      1.1
2500      1.2
4000      1.3
5500      1.4
7000      2.1
8500      2.2

Desired output: 
A:
    Income   category
    1000      1.1
    1234      1.1
    3007      1.2
    4569      1.4
    7065      2.2
    1456      1.1
    2980      1.3
    6450      2.1    
    900       0.1
    489       0.1

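Below is the approach I am trying, but I can't work out the logic for assigning the corresponding value to the new column. It is similar to a dictionary mapping, except that there is no exact equality, so ranges need to be defined.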

for e in A.Income:
    print(e)
    l=[]
    for j,k in zip(B.Income, B.category):
        if e<=j:
            l.append(k)
        else:
            pass
    p.append(B[B['Income']==l[0]].category.values)

brack=list(chain.from_iterable(p))
A['category']=brack

Answer 1

Try using merge_asof

df=pd.merge_asof(A.sort_values('Income'),B,on='Income').fillna(0.1)
   Income  category
0     489       0.1
1     900       0.1
2    1000       1.1
3    1234       1.1
4    1456       1.1
5    2980       1.2
6    3007       1.2
7    4569       1.3
8    7065       2.1
9    8990       2.2
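
For anyone who wants to run this, here is a minimal self-contained sketch of the same call. The frames are rebuilt from the sample data in the question, which is an assumption about how A and B were actually created. With the default direction='backward', merge_asof pairs each A income with the last B row whose Income is less than or equal to it. Incomes below 1000 find no match and come back as NaN, which fillna(0.1) then fills.

```python
import pandas as pd

# Sample frames rebuilt from the question (assumed construction).
A = pd.DataFrame({"Income": [1000, 1234, 3007, 4569, 7065, 1456, 2980, 8990, 900, 489]})
B = pd.DataFrame({
    "Income": [1000, 2500, 4000, 5500, 7000, 8500],
    "category": [1.1, 1.2, 1.3, 1.4, 2.1, 2.2],
})

# merge_asof needs both frames sorted on the key; B is already sorted.
# Default direction='backward': take the last B row with B.Income <= A.Income.
df = pd.merge_asof(A.sort_values("Income"), B, on="Income").fillna(0.1)
print(df)
```

Note that this is a "largest bracket not above the income" match, so 1234 gets 1.1 rather than 1.2.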

Update to match the output

s=pd.merge_asof(A.reset_index().sort_values('Income'),B,on='Income',direction='forward').\
   dropna().set_index('index').sort_index()
s.loc[s.Income<1000,'category']=0.1
s
       Income  category
index                  
0        1000       1.1
1        1234       1.2
2        3007       1.3
3        4569       1.4
4        7065       2.2
5        1456       1.2
6        2980       1.3
8         900       0.1
9         489       0.1
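
Note that the dropna() above removes index 7 (Income 8990), because no B income is greater than or equal to it. If that row should be kept, one possible alternative (not part of the answer above, just a sketch using numpy.searchsorted) looks up the first B bracket at or above each income and leaves NaN where none exists. The frames are again rebuilt from the question's sample data, and B is assumed to be sorted by Income.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question (assumed construction).
A = pd.DataFrame({"Income": [1000, 1234, 3007, 4569, 7065, 1456, 2980, 8990, 900, 489]})
B = pd.DataFrame({
    "Income": [1000, 2500, 4000, 5500, 7000, 8500],
    "category": [1.1, 1.2, 1.3, 1.4, 2.1, 2.2],
})

# Position of the first B income that is >= each A income.
# side="left" keeps an exact match (1000 vs 1000) in that same bracket.
pos = np.searchsorted(B["Income"].to_numpy(), A["Income"].to_numpy(), side="left")

# Append NaN so incomes above the largest bracket (pos == len(B)) map to NaN
# instead of raising an IndexError.
lookup = np.append(B["category"].to_numpy(), np.nan)
A["category"] = lookup[pos]

# Rule from the question: incomes below 1000 get 0.1.
A.loc[A["Income"] < 1000, "category"] = 0.1
print(A)
```

This keeps A in its original row order, so no reset_index / sort_index round trip is needed.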
