Question

I am trying to filter a DataFrame using the columns I obtained earlier by filtering the DataFrame below.

AA BB CC DD EE FF GG 
0  1  1  0  1  0  0

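The DataFrame comes from a file in which every value in the row is either 0 or 1, and the values change depending on which file is loaded. I filter this DataFrame with the following code so that my output contains only the columns whose value is 1.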

import pandas as pd

with open('Factors.txt') as b:
    IncludedFactors = pd.read_table(b, sep=',')
print(IncludedFactors)

# Drop every column whose value in the first row is 0
InterestingFactors = IncludedFactors.drop(IncludedFactors.columns[~IncludedFactors.iloc[0].astype(bool)], axis=1)
print(InterestingFactors)

output:
BB CC EE
1  1  1
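
For anyone who wants to reproduce this step without the file, here is a minimal sketch. It assumes Factors.txt is a comma-separated file with one header row and one row of 0/1 flags; the in-memory string and the lowercase variable names are stand-ins, not the real data.

```python
import io
import pandas as pd

# Hypothetical stand-in for Factors.txt: one header row, one row of 0/1 flags.
factors_csv = io.StringIO("AA,BB,CC,DD,EE,FF,GG\n0,1,1,0,1,0,0\n")
included_factors = pd.read_csv(factors_csv)

# Boolean mask over the columns: True where the first row holds a 1.
mask = included_factors.iloc[0].astype(bool).to_numpy()

# Keep only the flagged columns.
interesting_factors = included_factors.loc[:, mask]
print(list(interesting_factors.columns))
```

Selecting with a boolean mask keeps the flagged columns directly, which is equivalent to dropping the unflagged ones as in the code above.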

I then need to filter a much larger DataFrame that has many headers, but I only need ID, Xposition, Yposition, and the headers of the InterestingFactors DataFrame.

Below is the code I tried, but the output still contains only 3 headers instead of the 6 I need.

headers = InterestingFactors.columns.values
print(headers)
PivotTable = InfoTable.filter(items=['ID', 'Postion_X','Position_Y','headers'])
print(PivotTable)

Any help on how to do this properly would be greatly appreciated!

Answer 1

Here is one way you can do it:

headers = InterestingFactors.columns.append(pd.Index(['ID','Postion_X','Position_Y']))
PivotTable = InfoTable.loc[:, headers]

This combines the columns you are looking for from InterestingFactors with the 3 columns mentioned above. The resulting Index is then passed to .loc[].
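
As far as I can tell, the original attempt returned only 3 columns because 'headers' in quotes is a literal label, not the variable, and filter(items=...) keeps only labels that actually exist. Here is a small sketch of both the failure and a fix using a plain list. The InfoTable contents are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical data: the real InfoTable and InterestingFactors come from the asker's files.
InterestingFactors = pd.DataFrame({"BB": [1], "CC": [1], "EE": [1]})
InfoTable = pd.DataFrame({
    "ID": [1, 2],
    "Postion_X": [0.5, 1.5],
    "Position_Y": [2.0, 3.0],
    "AA": [10, 11],
    "BB": [20, 21],
    "CC": [30, 31],
    "EE": [40, 41],
})

# 'headers' in quotes is a literal label; filter() ignores labels that do not exist.
attempt = InfoTable.filter(items=["ID", "Postion_X", "Position_Y", "headers"])
print(list(attempt.columns))

# Concatenating the real labels gives all six columns, in the order listed.
wanted = ["ID", "Postion_X", "Position_Y"] + list(InterestingFactors.columns)
PivotTable = InfoTable.loc[:, wanted]
print(list(PivotTable.columns))
```

Unlike filter(), .loc[] raises a KeyError when a requested label is missing, which makes a misspelled column name easier to notice.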

This also works:

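headers = InterestingFactors.columns
PivotTable = InfoTable.loc[:, pd.Index(['ID','Postion_X','Position_Y']).union(headers)]

Converting the list of 3 standard columns to a pd.Index lets you take its union with headers directly inside .loc[]. Older pandas versions also accepted the | operator between two Index objects for this (both operands, I believe, had to be the same type), but that set-style use of | was deprecated in pandas 1.x and no longer performs a union in pandas 2.x, so .union() is the safer spelling.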
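
One difference between the two approaches is column order. Here is a minimal sketch comparing Index.append with the set union (the operation that | expressed in older pandas). The labels are stand-ins for InterestingFactors.columns.

```python
import pandas as pd

# Hypothetical labels standing in for InterestingFactors.columns.
headers = pd.Index(["BB", "CC", "EE"])
fixed = pd.Index(["ID", "Postion_X", "Position_Y"])

# append() concatenates: order is kept and duplicates are not removed.
appended = fixed.append(headers)

# union() is a set operation: duplicates are removed and, by default,
# the result is sorted when the labels are sortable.
unioned = fixed.union(headers)

# sort=False asks union() to skip the sort.
unsorted_union = fixed.union(headers, sort=False)

print(list(appended))
print(list(unioned))
print(list(unsorted_union))
```

If the order of the columns in PivotTable matters, append() or a plain list is probably the more predictable choice.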

Filtering a DataFrame using the headers of another DataFrame in Python
