Extracting data from pandas based on a condition

Time: 2018-04-15 20:28:44

Tags: python pandas loops indexing

I have a data frame with A = [1,2,3,5,9,8,11,13] and B = [2,1,6,19,16,15,14,12]. I want to check whether there is any place where the criss-cross elements of A and B are equal.

For example: here A[0]==B[1] and B[0]==A[1], so this is a criss-cross element.

import pandas as pd
df=pd.DataFrame({'A':[1,2,3],'B':[2,1,6]})
if df.loc[0,"A"] == df.loc[1,"B"] and df.loc[1,"A"] == df.loc[0,"B"]:
    print("the values which are equal")
else:
    print("the values which are not equal")

1 answer:

Answer 0 (score: 0)

Comparing consecutive rows

To check, for every row i of the data frame, whether A[i]==B[i+1] & A[i+1]==B[i], you can compare the columns in a vectorized way, but shifted against each other:

import numpy as np
import pandas as pd

A = np.array([1,2,3,5,14,16,16,13]) # I modified the input data from the question for the second example
B = [2,1,6,19,16,15,14,12]
df = pd.DataFrame({'A':A,'B':B})
# compare A[i] with B[i+1] and A[i+1] with B[i] for every pair of consecutive rows
eq_diag = (df['A'][:-1].values==df['B'][1:].values) & (df['A'][1:].values==df['B'][:-1].values)
# boolean array of length len(df)-1; eq_diag[i] is True if the
# diagonal between rows i and i+1 is equal
# Output for eq_diag
# [ True False False False False False False]

Then you can print the rows where this diagonal comparison is equal:

print(df[:-1][eq_diag]) # the [:-1] is needed so that the dimensions match
# Out
#    A  B
# 0  1  2
# (this returns the index i, not i+1, where the diagonal is equal)
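
As an alternative sketch (not part of the original answer), the same consecutive-row check can probably be written with pandas' Series.shift, which keeps the result aligned with the full index so no [:-1] slicing is needed. The names next_a, next_b and eq_diag_shift are made up here for illustration, and the data is the modified input used above.

```python
import pandas as pd

# Same (modified) data as in the answer
df = pd.DataFrame({'A': [1, 2, 3, 5, 14, 16, 16, 13],
                   'B': [2, 1, 6, 19, 16, 15, 14, 12]})

# shift(-1) moves every value up by one row, so position i holds the value of row i+1.
# The last position becomes NaN, and NaN never compares equal to anything.
next_a = df['A'].shift(-1)
next_b = df['B'].shift(-1)

# True at index i when A[i]==B[i+1] and B[i]==A[i+1]
eq_diag_shift = (df['A'] == next_b) & (df['B'] == next_a)

print(df[eq_diag_shift])
```

Because the mask has the same length as df, it can also be stored as a new column if that is more convenient.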

Comparing all row combinations in the data frame

If, instead of comparing row i only with row i+1, all possible combinations of rows should be compared, the itertools module can be used:

import itertools
# all index pairs (i, j) with i < j
combinations = np.array(list(itertools.combinations(range(len(df)),2)))
print(combinations.T)
# Out: all the row combinations
# [[0 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 4 4 4 5 5 6]
#  [1 2 3 4 5 6 7 2 3 4 5 6 7 3 4 5 6 7 4 5 6 7 5 6 7 6 7 7]]
eq_diag = ((df['A'].values[combinations[:,0]]==df['B'].values[combinations[:,1]]) & 
           (df['A'].values[combinations[:,1]]==df['B'].values[combinations[:,0]]))

Then the elements that are equal can be printed:

for i,j in combinations[eq_diag]:
    print('The criss cross element of rows {} and {} is equal:\n{}'.format(i,j,df.values[[i,j]]))
# Out
# The criss cross element of rows 0 and 1 is equal:
# [[1 2]
#  [2 1]]
# The criss cross element of rows 4 and 6 is equal:
# [[14 16]
#  [16 14]]
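
For what it's worth, the all-pairs check can also be sketched with NumPy broadcasting instead of itertools. This is not from the original answer; the names match, cross and pairs are chosen here for illustration, and the approach builds an n x n boolean matrix, so it is only reasonable for moderately sized frames.

```python
import numpy as np

# Same (modified) data as in the answer
A = np.array([1, 2, 3, 5, 14, 16, 16, 13])
B = np.array([2, 1, 6, 19, 16, 15, 14, 12])

# match[i, j] is True when A[i] == B[j]
match = A[:, None] == B[None, :]

# match.T[i, j] is A[j] == B[i], so cross[i, j] is the criss-cross condition for rows i and j
cross = match & match.T

# keep each pair only once (i < j) and drop the diagonal i == j
pairs = np.argwhere(np.triu(cross, k=1))

print(pairs)
```

Each row of pairs is one (i, j) index pair, so it can be looped over in the same way as combinations[eq_diag] above.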