I am trying to assign a value to a DataFrame column based on which pair of values in another DataFrame a given value falls between:
import pandas as pd

# Lookup table: each row maps the range From..To to a Value
intervals = pd.DataFrame(columns=['From', 'To', 'Value'], data=[[0, 100, 'A'], [100, 200, 'B'], [200, 500, 'C']])
print('intervals\n', intervals, '\n')
# Points whose placeholder Value 'X' should be replaced
points = pd.DataFrame(columns=['Point', 'Value'], data=[[45, 'X'], [125, 'X'], [145, 'X'], [345, 'X']])
print('points\n', points, '\n')
# Expected outcome
DesiredResult = pd.DataFrame(columns=['Point', 'Value'], data=[[45, 'A'], [125, 'B'], [145, 'B'], [345, 'C']])
print('DesiredResult\n', DesiredResult, '\n')
Many thanks.
Answer 0 (score: 2)
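Let's use map. First, build a Series indexed by a pd.IntervalIndex, created with its from_arrays method (the intervals are closed on the right by default, so a point of exactly 100 falls in A):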
intervals = intervals.set_index(pd.IntervalIndex.from_arrays(intervals['From'],
                                                             intervals['To']))['Value']
points['Value'] = points['Point'].map(intervals)
Output:
   Point Value
0     45     A
1    125     B
2    145     B
3    345     C
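As a side note on the boundaries, here is a minimal sketch of the same map lookup with closed='left', so that each range is read as [From, To). The extra points 100 and 600 and the name lookup are my own additions for illustration, not part of the question; a point outside every interval should simply come back as NaN.

```python
import pandas as pd

intervals = pd.DataFrame(
    columns=['From', 'To', 'Value'],
    data=[[0, 100, 'A'], [100, 200, 'B'], [200, 500, 'C']],
)
# Illustrative points: 100 sits on a boundary, 600 is outside every interval.
points = pd.DataFrame({'Point': [45, 100, 345, 600]})

# closed='left' makes each interval [From, To), so 100 belongs to B, not A.
idx = pd.IntervalIndex.from_arrays(intervals['From'], intervals['To'], closed='left')
lookup = pd.Series(intervals['Value'].to_numpy(), index=idx)

# map() looks each point up in the interval index; unmatched points become NaN.
points['Value'] = points['Point'].map(lookup)
print(points)
```

This only works when the intervals do not overlap, since a point must match at most one interval.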
Answer 1 (score: 0)
Another approach:
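# Here intervals is the original DataFrame from the question
def calculate_value(x):
    # Select the Value of the row whose half-open range [From, To) contains x
    return intervals.loc[(x >= intervals['From']) & (x < intervals['To']), 'Value'].squeeze()

desired_result = points.copy()
desired_result['Value'] = desired_result['Point'].apply(calculate_value)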
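If the row-by-row apply gets slow on a large points table, one possible vectorized variant is pd.cut. This is only a sketch and it swaps in a different technique from the answer above: it assumes the intervals are sorted and contiguous (each To equals the next From), which holds for the sample data but may not for yours. The names edges and result are mine.

```python
import pandas as pd

intervals = pd.DataFrame(
    columns=['From', 'To', 'Value'],
    data=[[0, 100, 'A'], [100, 200, 'B'], [200, 500, 'C']],
)
points = pd.DataFrame({'Point': [45, 125, 145, 345]})

# Assumption: intervals are sorted and contiguous, so the bin edges are
# every From value followed by the final To value.
edges = intervals['From'].tolist() + [intervals['To'].iloc[-1]]

result = points.copy()
# right=False gives half-open bins [From, To), matching x >= From and x < To.
result['Value'] = pd.cut(
    result['Point'],
    bins=edges,
    labels=intervals['Value'].tolist(),
    right=False,
).astype(object)
print(result)
```

Points outside the outermost edges come back as NaN rather than raising.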