Question

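I have a large pandas DataFrame (> 100 columns). I need to drop various sets of columns, and I am hoping there is a way to do it with the familiar slice-style syntax: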

df.drop(df.columns['slices'],axis=1)

I have built selections such as:

a = df.columns[3:23]
b = df.columns[-6:]

a and b represent the sets of columns I want to drop.

The following

list(df)[3:23]+list(df)[-6:]

yields the correct selection, but I can't get it to work with drop:

df.drop(df.columns[list(df)[3:23]+list(df)[-6:]],axis=1)

ValueError: operands could not be broadcast together with shapes (20,) (6,)

I have looked around but could not find an answer.

Selecting last n columns and excluding last n columns in dataframe

(The following relates to the error I am getting):

python numpy ValueError: operands could not be broadcast together with shapes

This one feels like a similar problem, except that the "slices" there are not separate: Deleting multiple columns based on column names in Pandas

Cheers

Answer 1

You can use np.r_ to seamlessly combine multiple ranges/slices:

import numpy as np
import pandas as pd
from string import ascii_uppercase

df = pd.DataFrame(columns=list(ascii_uppercase))

idx = np.r_[3:10, -5:0]

print(idx)

[ 3  4  5  6  7  8  9 -5 -4 -3 -2 -1]

You can then use idx to index your columns and feed the result to pd.DataFrame.drop:

df.drop(df.columns[idx], axis=1, inplace=True)

print(df.columns)

Index(['A', 'B', 'C', 'K', 'L', 'M', 'N',
       'O', 'P', 'Q', 'R', 'S', 'T', 'U'], dtype='object')
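
Applied to the ranges from the question, the same idea might look like the sketch below. The 120-column frame and the c0..c119 names are made up for illustration, since the asker's data is not shown; the slice for the last six columns is written with an explicit stop of 0, as in the example above.

```python
import numpy as np
import pandas as pd

# Hypothetical 120-column frame (c0..c119) standing in for the asker's data.
df = pd.DataFrame(np.zeros((2, 120)), columns=[f"c{i}" for i in range(120)])

# Positions 3..22 plus the last six positions, mirroring a and b in the question.
idx = np.r_[3:23, -6:0]

# Index the column labels by position, then drop those labels.
trimmed = df.drop(columns=df.columns[idx])
print(trimmed.shape)  # (2, 94)
```

Here 20 + 6 = 26 of the 120 columns are removed, leaving 94.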

Answer 2

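IIUC:

# Convert both column selections to plain Python lists of labels
a = df.columns[3:23].values.tolist()
b = df.columns[-6:].values.tolist()

# Append b's labels to a, then drop them all in one call
a.extend(b)
df.drop(a, axis=1, inplace=True)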
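
A non-mutating variant of the same approach, as a rough sketch: adding two Python lists concatenates them, whereas + between two Index objects is treated as element-wise arithmetic, which is presumably where the broadcasting error in the question comes from. The 120-column frame is a made-up stand-in.

```python
import numpy as np
import pandas as pd

# Hypothetical 120-column frame (c0..c119); the asker's real data is not shown.
df = pd.DataFrame(np.zeros((2, 120)), columns=[f"c{i}" for i in range(120)])

# list + list concatenates, so this yields 20 + 6 = 26 column labels.
cols_to_drop = list(df.columns[3:23]) + list(df.columns[-6:])

# drop returns a new frame; df itself is left untouched.
trimmed = df.drop(columns=cols_to_drop)
print(len(cols_to_drop), trimmed.shape)
```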

Answer 3

This returns the DataFrame with the columns removed:

df.drop(list(df)[2:5], axis=1)
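
A small check of what that call does, assuming a made-up frame with columns A to G: list(df) gives the column labels, so the slice [2:5] picks C, D and E, and because inplace is not set the original frame keeps all of its columns.

```python
import pandas as pd

# Hypothetical empty frame with seven columns, A through G.
df = pd.DataFrame(columns=list("ABCDEFG"))

# list(df) is the list of column labels; positions 2..4 are C, D, E.
result = df.drop(list(df)[2:5], axis=1)

print(list(result.columns))  # ['A', 'B', 'F', 'G']
print(list(df.columns))      # unchanged: still A through G
```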

Answer 4

You can use the following simple solution:

cols = [3,7,10,12,14,16,18,20,22]
df.drop(df.columns[cols],axis=1,inplace=True)

Result:

    0   1   2   4   5   6   8   9    11  13      15     17      19       21
0   3   12  10  3   2   1   7   512  64  1024.0  -1.0   -1.0    -1.0    -1.0
1   5   12  10  3   2   1   7   16   2   32.0    32.0   1024.0  -1.0    -1.0
2   5   12  10  3   2   1   7   512  2   32.0    32.0   32.0    -1.0    -1.0
3   5   12  10  3   2   1   7   16   1   32.0    64.0   1024.0  -1.0    -1.0

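As you can see, the columns at the given positions have all been dropped.

If your columns are named A, B, C and so on, you can use column names instead of the int values. In that case pass the list of names to df.drop directly, since df.columns[...] expects positions. For example, cols could be: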

cols = ['A','B','C','F']
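
A minimal sketch of the name-based variant, assuming a made-up one-row frame with columns A to H. The list of names goes straight into drop rather than through df.columns[...].

```python
import pandas as pd

# Hypothetical one-row frame with columns A through H.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7, 8]], columns=list("ABCDEFGH"))

cols = ['A', 'B', 'C', 'F']

# Labels are passed directly; no positional indexing is involved.
df.drop(cols, axis=1, inplace=True)
print(list(df.columns))  # ['D', 'E', 'G', 'H']
```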

Answer 5

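I ran into a similar problem before and struggled with it, but solved it by "subtracting" one df from another. I am not sure whether this will work for you, but it worked for me: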

df = df[~df.index.isin(a.index)]
df = df[~df.index.isin(b.index)]
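
Note that the two lines above filter rows by index label and assume a and b are DataFrames. For the column selections in the question, a column-wise analogue of the same isin idea could be sketched as follows; the 120-column frame is a made-up stand-in, and a and b are the Index slices from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical 120-column frame (c0..c119) standing in for the asker's data.
df = pd.DataFrame(np.zeros((2, 120)), columns=[f"c{i}" for i in range(120)])

a = df.columns[3:23]
b = df.columns[-6:]

# Boolean mask over the columns: True for every label not in a or b.
keep = ~df.columns.isin(a.append(b))

# Select all rows and only the columns to keep.
trimmed = df.loc[:, keep]
print(trimmed.shape)  # (2, 94)
```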

Drop multiple pandas columns by index
