Question

For the following DataFrame, I want to drop the columns c, d, e, f, g:

    a   b   c   d   e   f   g   h   i   j
0   0   1   2   3   4   5   6   7   8   9
1   10  11  12  13  14  15  16  17  18  19

So I use the following code:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(20).reshape(2, 10), columns=list('abcdefghij'))
df.drop(['c', 'd', 'e', 'f', 'g'], axis=1)

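The problem is that my real DataFrame may have far more columns, and I may need to drop many consecutive ones. Is there a quick way to select the columns to drop, something like 'c':'g'?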

Answer 1

Use DataFrame.loc to select the consecutive column names by label slice:

df = df.drop(df.loc[:, 'c':'g'].columns, axis=1)
print (df)
    a   b   h   i   j
0   0   1   7   8   9
1  10  11  17  18  19
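
The same idea can be wrapped in a small helper. This is only a sketch: the name drop_label_range is hypothetical, and it assumes the column labels are unique and that both endpoints exist in the DataFrame. Note that a .loc label slice includes both endpoints.

```python
import numpy as np
import pandas as pd


def drop_label_range(df, start, stop):
    """Return a new DataFrame without the columns from start to stop (inclusive).

    Hypothetical helper; assumes unique column labels and that both
    start and stop are present in df.columns.
    """
    # A .loc label slice includes both endpoints, unlike positional slicing.
    to_drop = df.loc[:, start:stop].columns
    return df.drop(columns=to_drop)


df = pd.DataFrame(np.arange(20).reshape(2, 10), columns=list('abcdefghij'))
result = drop_label_range(df, 'c', 'g')
print(result.columns.tolist())  # ['a', 'b', 'h', 'i', 'j']
```

The original df is left unchanged, because drop returns a new object by default.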

Or build a boolean mask with Index.isin and keep the columns that are not in it:

c = df.loc[:, 'c':'g'].columns
df = df.loc[:, ~df.columns.isin(c)]
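
To make the mask step easier to follow, here is a minimal sketch on a fresh copy of the question's DataFrame that prints the intermediate boolean array. The variable names mask and kept are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(2, 10), columns=list('abcdefghij'))

# Labels covered by the slice 'c':'g' (both endpoints included).
c = df.loc[:, 'c':'g'].columns

# True for every column that should be dropped.
mask = df.columns.isin(c)
print(mask.tolist())
# [False, False, True, True, True, True, True, False, False, False]

# Invert the mask to keep everything else, in the original column order.
kept = df.loc[:, ~mask]
print(kept.columns.tolist())  # ['a', 'b', 'h', 'i', 'j']
```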

If there are several groups of consecutive columns, combine their labels with Index.union, then remove them with Index.isin, Index.difference, or Index.drop:

c1 = df.loc[:, 'c':'g'].columns
c2 = df.loc[:, 'i':'j'].columns

df = df.loc[:, ~df.columns.isin(c1.union(c2))]
print (df)
    a   b   h
0   0   1   7
1  10  11  17

df = pd.DataFrame(np.arange(20).reshape(2, 10), columns=list('wbcdefghij'))
print (df)
    w   b   c   d   e   f   g   h   i   j
0   0   1   2   3   4   5   6   7   8   9
1  10  11  12  13  14  15  16  17  18  19

c1 = df.loc[:, 'c':'g'].columns
c2 = df.loc[:, 'i':'j'].columns

# column order may change, because Index.difference sorts the result
df1 = df[df.columns.difference(c1.union(c2))]
print (df1)
    b   h   w
0   1   7   0
1  11  17  10

# column order is preserved
df2 = df[df.columns.drop(c1.union(c2))]
print (df2)
    w   b   h
0   0   1   7
1  10  11  17
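
For an arbitrary number of ranges, the order-preserving variant could be generalized roughly as follows. This is a sketch, not part of the original answer: drop_label_ranges is a hypothetical helper, and it assumes unique column labels with every range endpoint present. It uses the Index.isin mask so that overlapping ranges are harmless and the column order is kept.

```python
import numpy as np
import pandas as pd


def drop_label_ranges(df, ranges):
    """Return df without the columns covered by any (start, stop) label range.

    Hypothetical helper; each range is inclusive on both ends, and the
    remaining columns keep their original order.
    """
    labels = []
    for start, stop in ranges:
        # Collect the labels of each inclusive slice.
        labels.extend(df.loc[:, start:stop].columns)
    # Keep only the columns that were not collected above.
    return df.loc[:, ~df.columns.isin(labels)]


df = pd.DataFrame(np.arange(20).reshape(2, 10), columns=list('wbcdefghij'))
result = drop_label_ranges(df, [('c', 'g'), ('i', 'j')])
print(result.columns.tolist())  # ['w', 'b', 'h']
```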
