I want to fill the matrix ref from the pd.DataFrame xxx, but skip the NaN values.
print(xxx)
OUT >>
   intensity name  rowtype1  rowtype2
0        100    A         1       4.0
1        200    A         2       NaN
2        300    B         3       5.0
I then fill the matrix as ref[rowtype, col] = intensity, where each row of xxx has two row types (rowtype1 and rowtype2).
ref = np.zeros(shape=(7,4))
for idx, inte, name, r1, r2 in xxx.itertuples():
    ref[r1,idx] = inte
    ref[r2,idx] = inte # error because of NaN in rowtype2
print(ref)
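How can I skip the NaN here?
I know one way is to use dropna(), but then I have to create a new dataframe containing only rowtype2 and intensity. I am hoping for a simple, quick approach, e.g. just skip the NaN at intensity = 200 and jump to the next entry in xxx, rowtype2 = 5 with intensity = 300.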
Additional information:
1) The following code creates xxx:
import numpy as np
import pandas as pd

prot = ['A','A','B']
calc_m = [1,2,3]
calc_m2 = [4, np.nan,5]
inte = [100,200,300]
xxx = pd.DataFrame({'name' : pd.Series(prot),
                    'rowtype1': pd.Series(calc_m),
                    'rowtype2': pd.Series(calc_m2),
                    'intensity': pd.Series(inte)
                    })
Answer 0: (score: 1)
You can use melt for this, and then use numpy's indexing instead of a for loop to set the entries of ref:
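# stack rowtype1/rowtype2 into one 'value' column and drop the NaN entries
melted = xxx.reset_index().melt(['intensity','index'],['rowtype1','rowtype2']).dropna()
# fancy indexing: rows come from 'value', columns from the original index
ref[melted.value.astype(int).values,melted['index'].values] = melted.intensity.values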
which gives you
array([[ 0., 0., 0., 0.],
[ 100., 0., 0., 0.],
[ 0., 200., 0., 0.],
[ 0., 0., 300., 0.],
[ 100., 0., 0., 0.],
[ 0., 0., 300., 0.],
[ 0., 0., 0., 0.]])
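If you would rather keep the loop from the question, a plain check with pd.notna should also skip the missing row types. This is only a minimal sketch: it rebuilds the example data from the question, and casting with int() assumes the row types are whole numbers stored as floats (as in rowtype2).

```python
import numpy as np
import pandas as pd

# Example data rebuilt from the question
xxx = pd.DataFrame({
    'name': ['A', 'A', 'B'],
    'rowtype1': [1, 2, 3],
    'rowtype2': [4, np.nan, 5],
    'intensity': [100, 200, 300],
})

ref = np.zeros(shape=(7, 4))
for row in xxx.itertuples():
    # Look at both row types of this record
    for r in (row.rowtype1, row.rowtype2):
        if pd.notna(r):  # skip NaN instead of indexing with it
            ref[int(r), row.Index] = row.intensity

print(ref)
```

For a frame this small the loop is easy to read; the vectorized indexing above is likely the better choice once xxx has many rows.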
Answer 1: (score: 0)
I'm not sure I fully understand what behavior you are looking for, but the pandas dropna() command has the "subset" argument... for example, dropping all rows with NaN in the rowtype2 column could be done with
xxx.dropna(subset=['rowtype2'],inplace=True)
That way, you would drop only rows with NaN in the rowtype2 column.
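If modifying xxx in place is not wanted, a possible variation is to leave out inplace=True and use the filtered copy that dropna returns only for the rowtype2 assignment. The sketch below rebuilds the example data from the question; the name valid is just an illustrative choice.

```python
import numpy as np
import pandas as pd

# Example data rebuilt from the question
xxx = pd.DataFrame({
    'name': ['A', 'A', 'B'],
    'rowtype1': [1, 2, 3],
    'rowtype2': [4, np.nan, 5],
    'intensity': [100, 200, 300],
})

ref = np.zeros(shape=(7, 4))

# rowtype1 has no NaN, so it can index ref directly
ref[xxx['rowtype1'].to_numpy(), xxx.index.to_numpy()] = xxx['intensity'].to_numpy()

# Without inplace=True, dropna returns a filtered copy and xxx stays intact
valid = xxx.dropna(subset=['rowtype2'])
ref[valid['rowtype2'].astype(int).to_numpy(), valid.index.to_numpy()] = valid['intensity'].to_numpy()

print(ref)
```

The original index survives dropna, so valid.index still points at the right columns of ref.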