Using the following code:
import pandas as pd
from sklearn.preprocessing import scale
df = pd.DataFrame({"Probe":["1430378_at","1439896_at","1439896_at"],
"Gene":["2900011G08Rik","Trappc5","Limk2"],
"A.x1":[0.0767, 0.4383, 0.7866],
"A.x2":[0.8091, 0.1954, 0.6307],
"A.x3":[ 0.6599, 0.1065, 0.0508]
}
)
df = df[["Probe","Gene","A.x1","A.x2","A.x3"]]
I can obtain the following data frame:
In [55]: df
Out[55]:
Probe Gene A.x1 A.x2 A.x3
0 1430378_at 2900011G08Rik 0.0767 0.8091 0.6599
1 1439896_at Trappc5 0.4383 0.1954 0.1065
2 1439896_at Limk2 0.7866 0.6307 0.0508
What I want to do is compute the z-score of each row across the columns A.x1, A.x2, and A.x3.
How can I do this?
For example, for the second row, we compute the z-score as follows:
from sklearn.preprocessing import scale
scale([0.4383,0.1954,0.1065],axis=0,with_mean=True, with_std=True,copy=False)
Out[61]: array([ 1.36603199, -0.36604999, -0.999982 ])
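For reference, the value returned by scale above is the ordinary z-score, (x - mean) / std, where std is the population standard deviation (ddof=0). The following is a minimal NumPy sketch of that arithmetic for the second row; the variable names row and z are illustrative only.

```python
import numpy as np

# Second row of the example data frame (A.x1, A.x2, A.x3)
row = np.array([0.4383, 0.1954, 0.1065])

# z-score with the population standard deviation (ddof=0),
# which is the convention sklearn.preprocessing.scale follows
z = (row - row.mean()) / row.std(ddof=0)
print(z)
```

Using ddof=1 (the sample standard deviation, which is the pandas default for std) would give different numbers from those shown above.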
In the end, we would like to get:
Probe Gene A.x1 A.x2 A.x3
0 1430378_at 2900011G08Rik -1.38769528 0.92991195 0.45778333
1 1439896_at Trappc5 1.36603199 -0.36604999 -0.999982
2 1439896_at Limk2 0.93889666 0.44644183 -1.3853385
Answer 0 (score: 1)
With Pandas, it is often a good idea to use apply with an anonymous function to perform a calculation on each row. Does this work for you?
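# result_type="broadcast" (pandas >= 0.23) keeps the original shape and column labels
cols = df.filter(regex=r"^A\.x").columns
df[cols] = df[cols].apply(
lambda v: scale(v.to_numpy(), axis=0, with_mean=True, with_std=True), axis=1, result_type="broadcast")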
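As an alternative to apply, the same row-wise z-score can probably be computed with vectorized pandas operations and no scikit-learn dependency. This is only a sketch under the assumption that the three value columns are exactly A.x1, A.x2, and A.x3; the names cols, values, row_mean, and row_std are introduced here for illustration. Note that ddof=0 is passed explicitly so the result matches scale, since pandas defaults to ddof=1.

```python
import pandas as pd

df = pd.DataFrame({
    "Probe": ["1430378_at", "1439896_at", "1439896_at"],
    "Gene": ["2900011G08Rik", "Trappc5", "Limk2"],
    "A.x1": [0.0767, 0.4383, 0.7866],
    "A.x2": [0.8091, 0.1954, 0.6307],
    "A.x3": [0.6599, 0.1065, 0.0508],
})

cols = ["A.x1", "A.x2", "A.x3"]  # assumed value columns
values = df[cols]

# Per-row mean and population standard deviation (ddof=0, as in sklearn's scale)
row_mean = values.mean(axis=1)
row_std = values.std(axis=1, ddof=0)

# Subtract and divide row by row; axis=0 aligns the Series on the row index
df[cols] = values.sub(row_mean, axis=0).div(row_std, axis=0)
print(df)
```

On a frame with many rows this avoids calling a Python function once per row, which is usually the slow part of apply with axis=1.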