Question

I have the following input file (Input.xls):

Mouse   No_neigh_mouse  Human   No_neigh_hum    Intersection    TotalGeneTested
Gm20645 1   lnc3    2   1   8
Gm20645 1   lnc2    1   0   8
Gm20645 1   lnc1    2   1   8
Gm26549 2   lnc3    2   1   8
Gm26549 2   lnc2    1   1   8
Gm26549 2   lnc1    2   1   8

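I want to:

Calculate the hypergeometric p-value for each row (done successfully)
Then calculate the FDR-adjusted p-values (Benjamini-Hochberg, i.e. method "BH")
Add the adjusted p-value as the last column.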

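My expected output file will have four columns. The first holds the "Mouse" values, the second the "Human" values, the third the "Hypergeom-pvalue", and the fourth the "Adjusted-pvalue". I can generate the first three columns with the following code: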

import pandas as pd
import scipy.stats

Input = pd.read_table("Input.xls", sep="\t")

with open("Hypergeom.xls", "w") as output:
    output.write("Mouse\tHuman\tHypergeom-pvalue\tAdjusted-pvalue\n")
    for i in range(len(Input.index)):
        # sf with loc=1 gives P(X >= Intersection); this calculates the hypergeom p-value without a problem
        hyperg = scipy.stats.hypergeom.sf(Input.iloc[i, 4], Input.iloc[i, 5], Input.iloc[i, 1], Input.iloc[i, 3], 1)
        newline = Input.iloc[i, 0], Input.iloc[i, 2], str(hyperg)
        output.write('\t'.join(newline) + '\n')

Up to this point the script runs fine, and I get the following output file ("Hypergeom.xls"):

Mouse   Human   Hypergeom-pvalue    Adjusted-pvalue
Gm20645 lnc3    0.25
Gm20645 lnc2    1
Gm20645 lnc1    0.25
Gm26549 lnc3    0.464285714
Gm26549 lnc2    0.25
Gm26549 lnc1    0.464285714

Next, my goal is to reopen the output file as input and compute the FDR through R, using the command suggested by a user in: How to implement R's p.adjust in Python

My code:

import rpy2.robjects as R
pvaluefile = pd.read_table("Hypergeom.xls", sep="\t")
pvalue_list = pvaluefile.ix[:,2].tolist()  #converts the value column series to a list
#Now, i try to apply the command from the SO link above
p_adjusted = R['p.adjust'](R.FloatVector(pvalue_list),method='BH')
for v in p_adjusted:
    print v

I get an error at the step p_adjusted = R[...]. The error is: TypeError: 'module' object has no attribute '__getitem__'
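
A likely cause of this TypeError is that `R` is bound to the `rpy2.robjects` module itself, and a module cannot be indexed with `[...]`. In rpy2 the subscriptable object is `rpy2.robjects.r`, so `R.r['p.adjust'](...)` is probably what the linked answer intended. If the R dependency is not otherwise needed, the Benjamini-Hochberg adjustment is also short enough to write directly. The following is a minimal pure-Python sketch, not R's own implementation; the function name `bh_adjust` is hypothetical, and the p-values are the ones from Hypergeom.xls above.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Hypothetical helper: a sketch of what p.adjust(method="BH") computes.
    """
    n = len(pvalues)
    # Visit the p-values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        candidate = pvalues[idx] * n / rank
        # The running minimum keeps adjusted values monotone in the raw p-values.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


pvalues = [0.25, 1, 0.25, 0.464285714, 0.25, 0.464285714]
adjusted = bh_adjust(pvalues)
for p, q in zip(pvalues, adjusted):
    print(p, q)
```

For these six values the adjusted p-values come out as roughly 0.5, 1.0, 0.5, 0.557, 0.5, 0.557.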

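So I have two questions:

I can't figure out how to get past this error and compute the FDR.
How do I add the FDR column to the end of the file as a fourth column?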

How to calculate and add an FDR column to a file in Python

0 answers: