I want to compare two numeric DataFrames, [x1, y1] and [x2, y2], that have different x values, using the values in ['x1'].
import pandas as pd
first = {'x1':[0,3,5],'y1':[0,3,6]}
df1 = pd.DataFrame(first,columns=['x1','y1'])
print (df1)
   x1  y1
0   0   0
1   3   3
2   5   6
second = {'x2':[0,2,4,6],'y2':[0,2,4,6]}
df2 = pd.DataFrame(second,columns=['x2','y2'])
print (df2)
   x2  y2
0   0   0
1   2   2
2   4   4
3   6   6
I need to interpolate over x2 at the x1 values to find the corresponding y2. Before comparing y1 and y2, I need to compute:
    x2  y2
0    0   0
1    2   2
?    3   ?
2    4   4
?    5   ?
3    6   6
Then compare y1 and y2 to find:
    x2  y2  y1  y1>y2?
0    0   0   0
1    2   2
?    3   3   3  False
2    4   4
?    5   5   6  True
3    6   6
Answer 0 (score: 2)
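Combine both x columns into a one-column DataFrame with concat (Series.append, which older pandas versions offered for this, was removed in pandas 2.0), remove duplicates with Series.drop_duplicates, and sort with Series.sort_values:
# Union of all x values from both frames, deduplicated and sorted
df = (pd.concat([df2['x2'], df1['x1']], ignore_index=True)
        .drop_duplicates()
        .sort_values()
        .to_frame('x2'))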
print (df)
   x2
0   0
1   2
5   3
2   4
6   5
3   6
Then left-join y2 with DataFrame.merge and call Series.interpolate to fill the gaps, add the new column y1 with Series.map, and finally add the comparison column:
df = df.merge(df2, how='left')
df['y2'] = df['y2'].interpolate()
df['y1'] = df['x2'].map(df1.set_index('x1')['y1'])
df['y1>y2'] = df['y1'] > df['y2']
print (df)
   x2   y2   y1  y1>y2
0   0  0.0  0.0  False
1   2  2.0  NaN  False
2   3  3.0  3.0  False
3   4  4.0  NaN  False
4   5  5.0  6.0   True
5   6  6.0  NaN  False
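One caveat: Series.interpolate defaults to method='linear', which ignores the x values and treats the rows as equally spaced. It gives the right result here only because 3 and 5 fall exactly halfway between their neighbours. If only the rows at the x1 positions are needed, a possible alternative (not part of the approach above) is numpy.interp, which interpolates against the actual x2 values. This is a minimal sketch that assumes x2 is sorted in increasing order; the name result is just an illustrative choice.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'x1': [0, 3, 5], 'y1': [0, 3, 6]})
df2 = pd.DataFrame({'x2': [0, 2, 4, 6], 'y2': [0, 2, 4, 6]})

# Evaluate the piecewise-linear curve (x2, y2) at each x1.
# Assumption: df2['x2'] is increasing, as np.interp requires.
result = df1.copy()
result['y2'] = np.interp(df1['x1'], df2['x2'], df2['y2'])

# Compare only where both y1 and the interpolated y2 exist.
result['y1>y2'] = result['y1'] > result['y2']
print(result)
```

This keeps one row per x1 value, so there are no NaN entries in y1 to be reported as False in the comparison column.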