I have a data frame with several thousand rows, part of which contains the data shown below. The data frame also has other columns: ["FP", "Y", "SLC", "C_ID", "NR"].
z_to_s | z_to_t | s_to_t | t_p | min | max
0.04 | | 0.06 | 0.29 | 0.04 | 0.29
0.01 | | NS | NS | 0.01 | 0.01
ND | | NS | NS | ND | ND
0.04 | | ND* | NS | ND* | 0.04
| 0.55* | | | 0.55 | 0.55
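19.88* | | 0.46 | 0.09 | 0.09 | 19.88
The "min" and "max" columns hold the minimum and maximum, respectively, of the "z_to_s", "z_to_t", "s_to_t" and "t_p" columns. ND or ND* is always treated as the minimum, while NS is ignored. I need to keep the input data in its original form, so my final output should look like this: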
z_to_s | z_to_t | s_to_t | t_p | min | max
0.04 | | 0.06 | 0.29 | 0.04 | 0.29
0.01 | | NS | NS | 0.01 | 0.01
ND | | NS | NS | ND | ND
0.04 | | ND* | NS | ND* | 0.04
| 0.55* | | | 0.55* | 0.55
19.88* | | 0.46 | 0.09 | 0.09 | 19.88*
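For reference, the target table above could be built directly from the four source columns instead of patching "min" and "max" afterwards. The following is only a minimal sketch, assuming every cell is stored as a string (or NaN for blanks); `sort_key` and `min_max_original` are hypothetical helper names, not pandas functions. Note that it returns 0.55* for both "min" and "max" in the fifth row, whereas the table above shows the asterisk only under "min".

```python
import math
import pandas as pd

COLS = ["z_to_s", "z_to_t", "s_to_t", "t_p"]

def sort_key(cell):
    """Return a numeric ordering key for a cell, or None if it is ignored."""
    if pd.isna(cell):
        return None
    text = str(cell).strip()
    if text in ("", "NS"):
        return None                 # blanks and NS are ignored
    bare = text.rstrip("*")
    if bare == "ND":
        return -math.inf            # ND / ND* always count as the minimum
    return float(bare)              # assumes the rest parses as a number

def min_max_original(row):
    """Pick the min and max cells of one row, keeping their original text."""
    keyed = [(sort_key(c), str(c).strip()) for c in row]
    keyed = [(k, t) for k, t in keyed if k is not None]
    if not keyed:
        return pd.Series({"min": "", "max": ""})
    lo = min(keyed, key=lambda kt: kt[0])[1]
    hi = max(keyed, key=lambda kt: kt[0])[1]
    return pd.Series({"min": lo, "max": hi})

# Sample rows copied from the table above (blank cells as empty strings).
df = pd.DataFrame(
    [
        ["0.04", "", "0.06", "0.29"],
        ["0.01", "", "NS", "NS"],
        ["ND", "", "NS", "NS"],
        ["0.04", "", "ND*", "NS"],
        ["", "0.55*", "", ""],
        ["19.88*", "", "0.46", "0.09"],
    ],
    columns=COLS,
)

result = df[COLS].apply(min_max_original, axis=1)
df["min"] = result["min"]
df["max"] = result["max"]
```

Selecting `df[COLS]` explicitly, rather than dropping the other columns, keeps "FP", "Y", "SLC", "C_ID" and "NR" out of the comparison without listing them.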
To do this, I have been trying to use the following code to create columns named "QC_min" and "QC_max":
df["QC_min"] = df.drop(["FP","Y","SLC","C_ID","NR","min","max"], axis = 1).isin(data_concat["min"]).any(axis = 1)
df["QC_max"] = df.drop(["FP","Y","SLC","C_ID","NR","min","max"], axis = 1).isin(data_concat["max"]).any(axis = 1)
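So "QC_min" and "QC_max" hold TRUE/FALSE values depending on whether "min"/"max" matches any of the values in the ["z_to_s", "z_to_t", "s_to_t", "t_p"] columns. I want to write another line of code so that, if "QC_min" or "QC_max" is TRUE, a "*" is appended to the corresponding "min" or "max" value. However, the output of the code above looks like this: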
z_to_s | z_to_t | s_to_t | t_p | min | max | QC_min | QC_max
0.04 | | 0.06 | 0.29 | 0.04 | 0.29 | FALSE | FALSE
0.01 | | NS | NS | 0.01 | 0.01 | FALSE | FALSE
ND | | NS | NS | ND | ND | TRUE | TRUE
0.04 | | ND* | NS | ND* | 0.04 | TRUE | FALSE
| 0.55* | | | 0.55 | 0.55 | FALSE | FALSE
19.88* | | 0.46 | 0.09 | 0.09 | 19.88 | FALSE | FALSE
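Here all numeric objects show FALSE, whether or not they match, while string objects show TRUE. I checked my data types and wondered whether this is an int/float/str data type problem. If I add an astype(str) to my "min" or "max", so that my code becomes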
df["QC_min"] = df.drop(["FP","Y","SLC","C_ID","NR","min","max"], axis = 1).isin(data_concat["min"]).astype(str).any(axis = 1)
df["QC_max"] = df.drop(["FP","Y","SLC","C_ID","NR","min","max"], axis = 1).isin(data_concat["max"]).astype(str).any(axis = 1)
everything becomes TRUE, regardless of the *, like so:
z_to_s | z_to_t | s_to_t | t_p | min | max | QC_min | QC_max
0.04 | | 0.06 | 0.29 | 0.04 | 0.29 | TRUE | TRUE
0.01 | | NS | NS | 0.01 | 0.01 | TRUE | TRUE
ND | | NS | NS | ND | ND | TRUE | TRUE
0.04 | | ND* | NS | ND* | 0.04 | TRUE | TRUE
| 0.55* | | | 0.55 | 0.55 | TRUE | TRUE
19.88* | | 0.46 | 0.09 | 0.09 | 19.88 | TRUE | TRUE
Where am I going wrong? Any suggestions on how to fix this, or how to do what I want, would be greatly appreciated. Thanks.
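One possible explanation for both results, sketched on a two-row toy frame. This assumes (not confirmed above) that the source columns hold strings while "min" holds floats for the numeric rows. When `DataFrame.isin` is given a Series, it compares row by row on the index, so the string "0.04" never equals the float 0.04. Separately, calling `astype(str)` on the result of `isin` turns the booleans into the strings "True" and "False", and any non-empty string is truthy.

```python
import pandas as pd

# Assumed setup: strings in the source column, a float in "min".
df = pd.DataFrame({"z_to_s": ["0.04", "ND"], "min": [0.04, "ND"]})

# Row-by-row comparison: "0.04" (str) vs 0.04 (float) does not match,
# while "ND" vs "ND" does.
mixed = df[["z_to_s"]].isin(df["min"]).any(axis=1)

# Casting BOTH sides to str before comparing makes the numeric row match.
as_text = (
    df[["z_to_s"]].astype(str).isin(df["min"].astype(str)).any(axis=1)
)

# Casting AFTER isin only converts the boolean flags to text.
flags = df[["z_to_s"]].isin(df["min"]).astype(str)
first_flag = flags.iloc[0, 0]       # the string "False"
is_truthy = bool(first_flag)        # non-empty string, so True
```

Even with matching types, a starred cell such as "0.55*" would still differ from a bare 0.55, and converting a float to text may not reproduce the original formatting (0.10 becomes "0.1"), so the string cast is only a partial fix.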