Question

I have two DataFrames (Thresholds and InfoTable); the first row of each is a header row:

(Thresholds)

AA BB CC DD EE 
0  15 7  0  23

and

(InfoTable)

ID Xposition Yposition AA BB CC DD EE
1  1         1         10 20 5  10 50
2  2         2         20 12 10 20 2
3  3         3         30 19 17 30 26
4  4         4         40 35 3  40 38
5  5         5         50 16 5  50 16

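I'm trying to filter the data so that the columns containing 0 in the Thresholds DataFrame are dropped from the InfoTable DataFrame. Then I want to compare the values in the Thresholds DataFrame with the values in the matching InfoTable columns, so that I can replace them in InfoTable with 1 or 0. My desired output is: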

ID Xposition Yposition BB CC EE
1  1         1         1  0  1
2  2         2         0  1  0
3  3         3         1  1  1
4  4         4         1  0  1
5  5         5         1  0  0

Here is the code I'm currently using to filter each table.

import pandas as pd

with open('thresholds_test.txt') as a:
    Thresholds = pd.read_table(a, sep=',')
print(Thresholds)

with open('includedThresholds.txt') as b:
    IncludedThresholds = pd.read_table(b, sep=',')
print(IncludedThresholds)

# Drop every column whose first-row value is 0
InterestingThresholds = IncludedThresholds.drop(IncludedThresholds.columns[~IncludedThresholds.iloc[0].astype(bool)], axis=1)
print(InterestingThresholds)

with open('PivotTable.tab') as c:
    PivotTable = pd.read_table(c, sep='\t')
print(PivotTable)

# Column names spelled as in the InfoTable header shown above
headers = InterestingThresholds.columns.append(pd.Index(['ID', 'Xposition', 'Yposition']))
InfoTable = PivotTable.loc[:, headers]
print(InfoTable)

Any help would be greatly appreciated!

Answer 1

Find the columns to keep and the columns to drop:

cols = Thresholds.columns[Thresholds.iloc[0].astype(bool)]
dcols = Thresholds.columns[~Thresholds.iloc[0].astype(bool)]

Do the comparison:

comp_df = pd.DataFrame(InfoTable[cols].values >= Thresholds[cols].values, columns=cols).astype(int)
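This comparison relies on NumPy broadcasting: the single row of thresholds is stretched across every row of the table. A minimal sketch of just that step, with the BB, CC and EE values from the question typed in by hand (the names `values`, `limits` and `flags` are made up for this illustration):

```python
import numpy as np

# BB, CC, EE columns of InfoTable from the question, shape (5, 3)
values = np.array([
    [20, 5, 50],
    [12, 10, 2],
    [19, 17, 26],
    [35, 3, 38],
    [16, 5, 16],
])
# Matching thresholds for BB, CC, EE, shape (1, 3)
limits = np.array([[15, 7, 23]])

# The (1, 3) row is broadcast against each of the 5 rows
flags = (values >= limits).astype(int)
print(flags)
```

Because `.values` strips the labels, this step matches columns purely by position, so both sides need to be selected with the same `cols` in the same order.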

Assign the comparison result back to the original DataFrame and drop the unwanted columns:

df_out = InfoTable.assign(**comp_df).drop(dcols, axis=1)
print(df_out)

Output:

  ID  Xposition  Yposition  BB  CC  EE
0   1          1          1   1   0   1
1   2          2          2   0   1   0
2   3          3          3   1   1   1
3   4          4          4   1   0   1
4   5          5          5   1   0   0
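For anyone who wants to try this without the input files, here is a self-contained sketch that builds both frames from the sample data in the question. It is a variation on the approach above, not the same code: it uses `DataFrame.ge` with a Series of thresholds, so columns are matched by label rather than by position, and it assumes (as the answer does) that "passing" means greater than or equal to the threshold. The names `limits`, `keep`, `drop` and `result` are chosen for this illustration.

```python
import pandas as pd

# Sample data copied from the question
Thresholds = pd.DataFrame({"AA": [0], "BB": [15], "CC": [7], "DD": [0], "EE": [23]})
InfoTable = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Xposition": [1, 2, 3, 4, 5],
    "Yposition": [1, 2, 3, 4, 5],
    "AA": [10, 20, 30, 40, 50],
    "BB": [20, 12, 19, 35, 16],
    "CC": [5, 10, 17, 3, 5],
    "DD": [10, 20, 30, 40, 50],
    "EE": [50, 2, 26, 38, 16],
})

limits = Thresholds.iloc[0]          # Series indexed by column name
keep = limits.index[limits != 0]     # columns with a non-zero threshold
drop = limits.index[limits == 0]     # columns to remove from InfoTable

result = InfoTable.drop(columns=drop)
# Compare each kept column with its own threshold, aligned on column labels
result[keep] = result[keep].ge(limits[keep], axis="columns").astype(int)
print(result)
```

Since the comparison result keeps the row index of `InfoTable`, this variant should also behave when `InfoTable` does not have a default 0..n-1 index, a case where building a fresh DataFrame from `.values` and assigning it back could misalign rows.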
