Comparing two numeric pandas DataFrames (x, y) after interpolating onto common x values

Time: 2020-02-23 10:28:00

Tags: python pandas dataframe

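I want to compare two numeric DataFrames, [x1, y1] and [x2, y2], that have different x values, doing the comparison at the x values in ['x1']: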

import pandas as pd
first = {'x1':[0,3,5],'y1':[0,3,6]}
df1 = pd.DataFrame(first,columns=['x1','y1'])
print (df1)
   x1  y1
0   0   0
1   3   3
2   5   6
second = {'x2':[0,2,4,6],'y2':[0,2,4,6]}
df2 = pd.DataFrame(second,columns=['x2','y2'])
print (df2)
   x2  y2
0   0   0
1   2   2
2   4   4
3   6   6

The x1 values need to be interpolated into x2 to find the corresponding y2. Before comparing y1 and y2, I need to compute:

   x2  y2
0   0   0
1   2   2
?   3   ?
2   4   4
?   5   ?
3   6   6

Then compare y1 and y2 to find:

   x2  y2  y1  y1>y2?
0   0   0   0
1   2   2
?   3   3   3  False
2   4   4
?   5   5   6  True
3   6   6

1 Answer:

Answer 0: (score: 2)

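Combine the two x columns into a one-column DataFrame, remove duplicates with Series.drop_duplicates, and sort with Series.sort_values. The original answer used Series.append for the first step; that method was removed in pandas 2.0, so pd.concat is used here instead:

# Stack x2 and x1 into one Series, then de-duplicate and sort
df = (pd.concat([df2['x2'], df1['x1']], ignore_index=True)
        .drop_duplicates()
        .sort_values()
        .to_frame('x2'))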
print (df)
   x2
0   0
1   2
5   3
2   4
6   5
3   6
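
As an alternative for this step, the sorted union of the x values can also be built with NumPy. This is a minimal sketch, not part of the original answer; the name `xs` is only illustrative, and unlike the chain above it produces a fresh 0..n-1 index rather than keeping the original row labels.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'x1': [0, 3, 5], 'y1': [0, 3, 6]})
df2 = pd.DataFrame({'x2': [0, 2, 4, 6], 'y2': [0, 2, 4, 6]})

# np.union1d returns the sorted, de-duplicated union of both x columns
xs = pd.DataFrame({'x2': np.union1d(df2['x2'], df1['x1'])})
print(xs['x2'].tolist())
```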

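Then left-join df2 with DataFrame.merge to bring in y2, fill the gaps with Series.interpolate, add the new column y1 with Series.map, and finally add the comparison column: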

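# Left-join on the shared 'x2' column; x values that came only from df1 get y2 = NaN
df = df.merge(df2, how='left')
# Fill the missing y2 values (default linear method, which treats rows as evenly spaced)
df['y2'] = df['y2'].interpolate()
# Look up y1 by x value; x values absent from df1 get NaN
df['y1'] = df['x2'].map(df1.set_index('x1')['y1'])
# Comparisons against NaN evaluate to False
df['y1>y2'] = df['y1'] > df['y2']
print(df)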
   x2   y2   y1  y1>y2
0   0  0.0  0.0  False
1   2  2.0  NaN  False
2   3  3.0  3.0  False
3   4  4.0  NaN  False
4   5  5.0  6.0   True
5   6  6.0  NaN  False
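
One caveat: Series.interpolate with its default method='linear' ignores the actual x values and treats rows as equally spaced. It gives the right numbers here only because 3 and 5 sit exactly halfway between their neighbours in x2. If the comparison is only needed at the x1 points, a possible alternative is to interpolate against the real x values with np.interp. This is a sketch that is not part of the original answer; it assumes x2 is sorted in increasing order, and the name `out` is only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'x1': [0, 3, 5], 'y1': [0, 3, 6]})
df2 = pd.DataFrame({'x2': [0, 2, 4, 6], 'y2': [0, 2, 4, 6]})

# np.interp expects the known x values (df2['x2']) to be increasing
out = df1.copy()
# Linearly interpolate y2 at each x1, using the true spacing of x2
out['y2'] = np.interp(df1['x1'], df2['x2'], df2['y2'])
out['y1>y2'] = out['y1'] > out['y2']
print(out)
```

This keeps one row per x1 value, so there are no NaN rows in y1 to be reported as False.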