Question

I am using the following code to remove constant columns and columns with specific headers.

Is there a more Pythonic way?

import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000,
                           n_features=6,
                           n_informative=3,
                           n_classes=2,
                           random_state=0,
                           shuffle=False)

# Create a DataFrame
df = pd.DataFrame({'car': X[:, 0],
                   'ball': X[:, 1],
                   'Feature 3': 5,
                   'Feature 4': X[:, 3],
                   'Feature 5': X[:, 4],
                   'Feature 6': X[:, 5],
                   'Class': y})
one = df.std().eq(0).reindex(df.columns, fill_value=True)
two = one.index.str.contains("ball|car")
all = one | two

df_auto = df.loc[:, ~all].copy()

Answer 1

I see no obvious issues with your current logic. "Pythonic" is subjective and I offer a different solution below.

This is an alternative numpy + .iloc based method which you may prefer:

n1 = np.where(np.std(df.values, axis=0) == 0)[0]
n2 = np.where(df.columns.str.contains('ball|car'))[0]

df_auto = df.iloc[:, np.delete(range(len(df.columns)), np.hstack((n1, n2)))].copy()
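
For comparison, here is a minimal pandas-only sketch of the same idea. It is not taken from the question or the answer above: it assumes that "constant" can be read as "at most one distinct value" and uses DataFrame.nunique instead of the standard deviation, which also covers non-numeric columns. The small frame is a hypothetical stand-in for the df built in the question.

```python
import pandas as pd

# Hypothetical toy frame standing in for the question's df
df = pd.DataFrame({
    'car': [0.1, 0.2, 0.3],
    'ball': [1.0, 2.0, 3.0],
    'Feature 3': 5,                 # constant column
    'Feature 4': [0.5, 0.1, 0.9],
    'Class': [0, 1, 0],
})

# Columns with at most one distinct value (NaN is ignored by default)
constant = df.nunique() <= 1

# Columns whose name contains "ball" or "car"
named = df.columns.str.contains('ball|car')

# Keep every column that is neither constant nor matched by name
df_auto = df.loc[:, ~(constant | named)].copy()
```

Whether this reads better than the std-based mask is a matter of taste; the main practical difference is that it does not depend on the columns being numeric.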
