Difference between relations

Date: 2017-04-18 14:21:12

Tags: python pandas

I have a DataFrame in pandas, relation_between_countries:

    country_from   country_to  points
1   Albania        Austria     10
2   Denmark        Austria     5   
3   Austria        Albania     2 
4   Greece         Norway      4   
5   Norway         Greece      5   

I am trying to work out the difference in points between the two directions of each relation, like this:

country_from_or_to   country_to_or_from  difference
Albania              Austria             8
Denmark              Austria             
Greece               Norway              -1

Do you have any idea how to do this?

1 answer:

Answer 0: (score: 5)

Use DataFrameGroupBy.diff:

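import numpy as np

# work on a copy of the frame from the question
df = relation_between_countries.copy()
cols = ['country_from', 'country_to']
# sort the two country names within each row, so A->B and B->A share one key
# (apply(sorted, axis=1) returns a Series of lists in recent pandas versions)
df[cols] = np.sort(df[cols].to_numpy(), axis=1)
# within each pair: points of the current row minus points of the next row
df['difference'] = df.groupby(cols)['points'].diff(-1)
print(df)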
  country_from country_to  points  difference
1      Albania    Austria      10         8.0
2      Austria    Denmark       5         NaN
3      Albania    Austria       2         NaN
4       Greece     Norway       4        -1.0
5       Greece     Norway       5         NaN
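
To try the steps above end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question and then keeps only the first row of each country pair, which is closer to the one-row-per-pair layout the question asks for. The `summary` name and the `drop_duplicates` step are additions for illustration and are not part of the original answer. Because the names are sorted within each row, the Denmark/Austria pair comes out as Austria/Denmark.

```python
import numpy as np
import pandas as pd

# Frame from the question, with the same 1-based index
relation_between_countries = pd.DataFrame(
    {
        "country_from": ["Albania", "Denmark", "Austria", "Greece", "Norway"],
        "country_to": ["Austria", "Austria", "Albania", "Norway", "Greece"],
        "points": [10, 5, 2, 4, 5],
    },
    index=[1, 2, 3, 4, 5],
)

df = relation_between_countries.copy()
cols = ["country_from", "country_to"]

# Sort the two names within each row so both directions share one group key
df[cols] = np.sort(df[cols].to_numpy(), axis=1)

# Current row's points minus the next row's points, per country pair
df["difference"] = df.groupby(cols)["points"].diff(-1)

# Illustrative extra step: keep the first row of each pair, drop raw points
summary = df.drop_duplicates(cols).drop(columns="points")
print(summary)
```

Pairs that appear in only one direction, such as Denmark and Austria, keep NaN in `difference` because there is no second row to subtract.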

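You can also replace the NaN values with empty strings, but the column then holds mixed values (numbers together with strings), so some functions may return unexpected output: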

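import numpy as np

cols = ['country_from', 'country_to']
# sort the two country names within each row
df[cols] = np.sort(df[cols].to_numpy(), axis=1)
# empty strings instead of NaN; the column is no longer purely numeric
df['difference'] = df.groupby(cols)['points'].diff(-1).fillna('')
print(df)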
  country_from country_to  points difference
1      Albania    Austria      10          8
2      Austria    Denmark       5           
3      Albania    Austria       2           
4       Greece     Norway       4         -1
5       Greece     Norway       5
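
If the blanks are only needed for display, one way to sidestep the mixed-value caveat is to leave NaN in the column and blank it out when rendering. The sketch below is an alternative that is not part of the original answer: it assumes the pair columns are already sorted as above, and uses the `na_rep` parameter of `DataFrame.to_string`. The `total` and `text` names are illustrative.

```python
import numpy as np
import pandas as pd

# Frame with the country names already sorted within each row
df = pd.DataFrame(
    {
        "country_from": ["Albania", "Austria", "Albania", "Greece", "Greece"],
        "country_to": ["Austria", "Denmark", "Austria", "Norway", "Norway"],
        "points": [10, 5, 2, 4, 5],
    },
    index=[1, 2, 3, 4, 5],
)
cols = ["country_from", "country_to"]

# Keep NaN in the data so the column stays float64
df["difference"] = df.groupby(cols)["points"].diff(-1)

# Numeric operations still work; NaN values are skipped by sum()
total = df["difference"].sum()

# Blank out missing values only in the rendered text
text = df.to_string(na_rep="")
print(text)
```

The stored column stays numeric, so later aggregation or arithmetic behaves as usual, while the printed table shows empty cells where no difference exists.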