Check whether the difference between any pair of columns named in a list is greater than 3 in a pandas df

Time: 2019-08-06 11:39:05

Tags: python pandas dataframe

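Say I have a list containing all the column names of a pandas df. How can I check whether the difference between any pair of columns is greater than 3?

Pseudocode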

IF difference between df['T01'] and df['T02'] > 3 or difference between df['T03'] and df['T04'] > 3 or difference between df['T05'] and df['T06'] > 3 and so on... THEN
DO SOMETHING

Code

df_column_names = ['T01', 'T02', 'T03', 'T04', 'T05', 'T06', 'T07', 'T08', 'T09', 'T10', 'T11', 'T12', 'T13', 'T14', 'T15', 'T16', 'T17', 'T18', 'T19', 'T20', 'T21', 'T22', 'T23', 'T24', 'T25', 'T26', 'T27', 'T28', 'T29', 'T30', 'T31', 'T32']

df
| T01 | T02    | T03 | T04   | ... |
|-----|--------|-----|-------|-----|
| 0.1 | 0.5685 | 1.4 | 0.333 | ... |

1 answer:

Answer 0 (score: 2)

If the columns form consecutive pairs, select each half of the pairs by position and subtract:

df1 = df.iloc[:, ::2] - df.iloc[:, 1::2].values

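A more general solution uses DataFrameGroupBy.diff. Note that diff subtracts the first column of each pair from the second, so the signs are reversed relative to the iloc version and the result keeps the second column's name: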

import numpy as np

# label consecutive columns 0, 0, 1, 1, ... so that each pair forms one group
c = np.arange(len(df.columns)) // 2
df1 = df.groupby(c, axis=1).diff(axis=1).dropna(axis=1, how='all')
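As far as I know, recent pandas releases (2.1 and later) deprecate the axis=1 argument to groupby. A minimal sketch of one possible workaround is to transpose, group the rows, and transpose back. The variable name `groups` and the small all-numeric frame are assumptions made for illustration, and transposing only works cleanly when every selected column is numeric.

```python
import numpy as np
import pandas as pd

# Illustrative data: only the numeric pair columns from the sample
df = pd.DataFrame({
    'T01': [5, 3, 6, 9, 2, 4],
    'T02': [1, 3, 5, 7, 1, 0],
    'T03': [7, 8, 9, 4, 2, 3],
    'T04': [4, 5, 4, 5, 5, 4],
})

# 0, 0, 1, 1, ...: one group label per original column
groups = np.arange(len(df.columns)) // 2

# After transposing, the original columns are rows, so a plain row-wise
# groupby/diff gives (second column - first column) within each pair.
df1 = df.T.groupby(groups).diff().T.dropna(axis=1, how='all')
print(df1)
```

The result has one column per pair, named after the second column of the pair (T02, T04), with the sign convention of diff.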

Edit:

If you need to select the columns by the names in the list:

df1 = df[df_column_names].iloc[:, ::2] - df[df_column_names].iloc[:, 1::2].values


# the same selection combined with the groupby approach
df = df[df_column_names]
c = np.arange(len(df.columns)) // 2
df1 = df.groupby(c, axis=1).diff(axis=1).dropna(axis=1, how='all')
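If the pairing should be driven by the names themselves rather than by column position, one option is to split the list into "left" and "right" halves of each pair and label the result explicitly. This is only a sketch: the names `left`, `right`, `diff` and the "T01-T02" style labels are illustrative choices, and the second data row is made up (the first comes from the table in the question).

```python
import pandas as pd

df = pd.DataFrame({
    'T01': [0.1, 2.0],      # second row is hypothetical
    'T02': [0.5685, 6.5],
    'T03': [1.4, 3.0],
    'T04': [0.333, 2.5],
})
df_column_names = ['T01', 'T02', 'T03', 'T04']

left = df_column_names[::2]    # 'T01', 'T03', ...
right = df_column_names[1::2]  # 'T02', 'T04', ...

# Subtract as plain arrays so pandas does not try to align on column names
diff = pd.DataFrame(
    df[left].to_numpy() - df[right].to_numpy(),
    index=df.index,
    columns=[f'{a}-{b}' for a, b in zip(left, right)],
)
print(diff)
```

Labeling each column with both names makes it easier to see which pair triggered a threshold later on.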

Sample:

import pandas as pd

df = pd.DataFrame({
         'A':list('abcdef'),
         'T04':[4,5,4,5,5,4],
         'T03':[7,8,9,4,2,3],
         'T02':[1,3,5,7,1,0],
         'T01':[5,3,6,9,2,4],
         'F':list('aaabbb')
})

df_column_names = ['T01', 'T02', 'T03', 'T04']

df1 = df[df_column_names].iloc[:, ::2] - df[df_column_names].iloc[:, 1::2].values
print(df1)
   T01  T03
0    4    3
1    0    3
2    1    5
3    2   -1
4    1   -3
5    4   -1

mask = df1 > 3
print(mask)
     T01    T03
0   True  False
1  False  False
2  False   True
3  False  False
4  False  False
5   True  False
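To get from the boolean mask to the "IF any pair differs by more than 3 THEN do something" step in the question, the mask can be reduced with any. The sketch below assumes "difference" means absolute difference (drop .abs() for the signed version); the names `rows_with_large_gap` and `any_large_gap` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'T01': [5, 3, 6, 9, 2, 4],
    'T02': [1, 3, 5, 7, 1, 0],
    'T03': [7, 8, 9, 4, 2, 3],
    'T04': [4, 5, 4, 5, 5, 4],
})
df_column_names = ['T01', 'T02', 'T03', 'T04']

pairs = df[df_column_names]
diff = pairs.iloc[:, ::2] - pairs.iloc[:, 1::2].to_numpy()

# Assumption: compare the absolute difference against the threshold
mask = diff.abs() > 3

rows_with_large_gap = mask.any(axis=1)        # one flag per row
any_large_gap = bool(mask.to_numpy().any())   # one flag for the whole frame

if any_large_gap:
    # "DO SOMETHING": here, just show the offending rows
    print(df[rows_with_large_gap])
```

Use the per-row Series when the action depends on which rows exceed the threshold, and the single boolean when one check for the whole DataFrame is enough.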