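I am trying to compare a sample column against two reference columns, D and R. If the sample matches D or R, the value should be replaced with "D" or "R"; the exception is ./. in the sample column, where I want the call to be NC. I have added the LogicCALL column to show the expected result. In my actual dataframe these calls would replace the raw values (1, 0, ./.) in the sample column itself.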
   ReferenceD  ReferenceR sample LogicCALL
0           1           0      1         D
1           1           1    ./.        NC
2           1           0      0         R
Index(['ReferenceD', 'ReferenceR', 'sample', 'LogicCALL'], dtype='object')
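So far I have built the loop below, where Alt is the list of sample columns. The loop works for the D and R calls but not for NC: for those rows the script returns "R" instead.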
for sample in Alt:
    gtdata[(sample)] = np.where((gtdata[(sample)] == gtdata['ReferenceD']) & (gtdata[sample] != gtdata['ReferenceR']), "D",
                       np.where((gtdata[(sample)] == "D") & (gtdata[(sample)] is not ('\./.')), "D",
                       np.where((gtdata[(sample)] == "D") & (gtdata[(sample)].str.contains('\./.')), "NC",
                       "R")))
Answer 0 (score: 0)
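This is not a functional-style solution, but the most workable approach is to make the assignments procedurally, one condition at a time: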
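# Compare as strings, because the sample column mixes digits with './.'
df.loc[df['ReferenceD'].astype(str) == df['sample'].astype(str), 'LogicCALL'] = 'D'
# Later assignments overwrite earlier ones, so R wins if a sample matches both references
df.loc[df['ReferenceR'].astype(str) == df['sample'].astype(str), 'LogicCALL'] = 'R'
# './.' is handled last so that it always ends up as 'NC', as in the question's LogicCALL column
df.loc[df['sample'] == './.', 'LogicCALL'] = 'NC'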
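
As for why the loop in the question returns "R": as far as I can tell, `gtdata[(sample)] is not ('\./.')` is a plain Python identity test that is always True. It is not an element-wise comparison. The inner branches also test `gtdata[(sample)] == "D"`, but at that point the column still holds the raw values (1, 0, ./.), so those conditions never match and every remaining row falls through to "R".

If you would rather keep the loop over Alt and overwrite each sample column in place, `np.select` can express the same priorities in one call. The following is only a sketch: the toy dataframe mirrors the table in the question, the "unmatched" fallback label is my own assumption for values that match neither reference, and the condition order copies the overwrite order of the `.loc` assignments (./. first, then R, then D).

```python
import numpy as np
import pandas as pd

# Toy data mirroring the table in the question. The sample values are
# strings because the column mixes "1"/"0" with "./.".
gtdata = pd.DataFrame({
    "ReferenceD": [1, 1, 1],
    "ReferenceR": [0, 1, 0],
    "sample": ["1", "./.", "0"],
})
Alt = ["sample"]  # list of sample column names, as in the question

# Cast the references once so the comparisons are string-to-string.
ref_d = gtdata["ReferenceD"].astype(str)
ref_r = gtdata["ReferenceR"].astype(str)

for sample in Alt:
    calls = gtdata[sample].astype(str)
    conditions = [
        calls == "./.",  # missing genotype takes priority
        calls == ref_r,  # then a match with ReferenceR
        calls == ref_d,  # then a match with ReferenceD
    ]
    choices = ["NC", "R", "D"]
    # np.select takes, per row, the choice of the first condition that is True.
    # "unmatched" is a hypothetical label for values matching neither reference.
    gtdata[sample] = np.select(conditions, choices, default="unmatched")

print(gtdata)
```

All conditions are computed from the original column before anything is assigned, so the result does not depend on values written earlier in the same pass.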