Alternative ways to drop constant columns and columns with certain names in Python

Time: 2018-03-19 13:17:12

Tags: python pandas

I use the following code to drop constant columns and columns whose headers match specific names.

Is there a more Pythonic way to do this?

import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000,
                           n_features=6,
                           n_informative=3,
                           n_classes=2,
                           random_state=0,
                           shuffle=False)

# Create a DataFrame; 'Feature 3' is a constant column
df = pd.DataFrame({'car': X[:, 0],
                   'ball': X[:, 1],
                   'Feature 3': 5,
                   'Feature 4': X[:, 3],
                   'Feature 5': X[:, 4],
                   'Feature 6': X[:, 5],
                   'Class': y})
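# Flag constant columns (zero standard deviation)
is_constant = df.std().eq(0).reindex(df.columns, fill_value=True)
# Flag columns whose name contains "ball" or "car"
has_name = is_constant.index.str.contains("ball|car")
# Named to_drop rather than `all`, which would shadow the built-in
to_drop = is_constant | has_name

df_auto = df.loc[:, ~to_drop].copy()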

1 answer:

Answer 0 (score: 0)

I see no obvious issues with your current logic. "Pythonic" is subjective and I offer a different solution below.

This is an alternative numpy + .iloc based method which you may prefer:

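# Positions of constant columns (zero standard deviation)
n1 = np.where(np.std(df.values, axis=0) == 0)[0]
# Positions of columns whose name contains "ball" or "car"
n2 = np.where(df.columns.str.contains('ball|car'))[0]

# Keep every column position that is not in n1 or n2
df_auto = df.iloc[:, np.delete(np.arange(len(df.columns)), np.hstack((n1, n2)))].copy()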
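
For comparison, here is a minimal pandas-only sketch of the same idea that collects column labels and passes them to `DataFrame.drop`. It is not from the question or the answer above. The small frame is hypothetical stand-in data rather than the `make_classification` output, and `constant_cols` and `named_cols` are illustrative names. It assumes that "constant" means a single distinct value, which it checks with `nunique()` instead of a zero standard deviation.

```python
import pandas as pd

# Hypothetical stand-in data, not the make_classification output from the question
df = pd.DataFrame({
    'car': [0.1, 0.2, 0.3],
    'ball': [1.0, 0.5, 0.2],
    'Feature 3': 5,                 # scalar is broadcast, giving a constant column
    'Feature 4': [3.0, 1.0, 2.0],
    'Class': [0, 1, 0],
})

# Columns with at most one distinct value are treated as constant
constant_cols = df.columns[df.nunique() <= 1]
# Columns whose name contains "ball" or "car"
named_cols = df.columns[df.columns.str.contains('ball|car')]

# Drop the union of both label sets; the remaining columns keep their order
df_auto = df.drop(columns=constant_cols.union(named_cols))
print(list(df_auto.columns))  # ['Feature 4', 'Class']
```

Because `nunique()` ignores NaN by default, a column that is entirely NaN would also be counted as constant here. Whether that is what you want depends on your data.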