I have been trying to save each row of a data.frame to its own file, named after that row's index.
The data.frame is structured basically like this:
SUBMITTED_ID SYMBOL IMMUNE CLASS CELL_HUM LOCATION_ARM LOCATION_MIN LOCATION_MAX
FBgn0000047 Act88F control control control 3R 15439969 15442177
FBgn0000094 Anp immunity humoral AMP 3R 30209948 30210382
FBgn0000116 Argk control control control 3L 9048781 9066027
This is what I have so far:
import sys
import pandas as pd
import numpy as np
df = pd.read_csv(sys.argv[1])
df['NAME']= df['SUBMITTED_ID']+'-'+df['SYMBOL']+'-'+df['IMMUNE']+'-'+df['CLASS']+'-'+df['CELL_HUM']
df_indexed = df.set_index('NAME')
df_bed =df_indexed[['LOCATION_ARM','LOCATION_MIN','LOCATION_MAX']]
for index, row in df_bed.iterrows():
    np.savetxt(str(index)+'.bed', row, delimiter='\t', fmt="%s")
It works, but it saves each value of the row on a separate line, like this:
3R
22034298
22038925
Does anyone know what I am doing wrong here?
Thanks,
略
Answer 0 (score: 1)
If you need the index, first call reset_index, convert the result to an ndarray with values, and then use tofile:
for row in df_bed.reset_index().values:
    # row[0] is the NAME index; the remaining items are the BED fields
    # print(row)
    row[1:].tofile(str(row[0])+'.bed', sep="\t", format="%s")
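
As for why the original attempt produced one value per line: np.savetxt treats a 1-D input as a single column, so each element of the row lands on its own line. If you would rather not depend on NumPy for the writing step, here is a minimal sketch using only the standard library. The helper name write_bed_rows and its out_dir parameter are hypothetical, not part of pandas or NumPy; it assumes each item is a (name, values) pair, which is what df_bed.iterrows() yields.

```python
from pathlib import Path


def write_bed_rows(rows, out_dir="."):
    """Write each (name, values) pair to '<out_dir>/<name>.bed' as one tab-separated line.

    Hypothetical helper for illustration; `rows` can be any iterable of
    (name, values) pairs, e.g. the output of DataFrame.iterrows().
    """
    out_dir = Path(out_dir)
    written = []
    for name, values in rows:
        path = out_dir / f"{name}.bed"
        # Join the fields with tabs so the whole row ends up on a single line.
        path.write_text("\t".join(str(v) for v in values) + "\n")
        written.append(path)
    return written


# Assumed usage with the DataFrame from the question:
#     write_bed_rows(df_bed.iterrows())
```

Unlike tofile, this version ends each file with a newline, which some downstream tools expect.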