I have two DataFrames in pandas:
df1:
Genes N1 N2 N3 N4 N5 \
1 100130426 0 0 0.2262 0 0
2 100133144 6.0377 4.3819 15.9742 4.5751 14.5776
3 100134869 3.9512 2.3768 12.3047 5.6267 4.8288
4 10357 197.2475 87.8119 78.7874 328.9158 113.7614
5 10431 1615.9822 1645.1704 769.866 722.7625 863.5845
df2:
Genes T1 T2 T3 T4 T5 T6 \
1 100130426 0 0 0 0 0 0
2 100133144 4.0315 1.4705 0 8.2678 5.3467 3.1702
3 100134869 10.9554 9.111 7.5432 9.0772 8.2126 5.9363
4 10357 128.3177 129.6157 144.8024 108.58 107.5162 153.8304
5 10431 659.423 835.5713 900.873 878.8159 433.5901 579.3967
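How can I write these to a single CSV in the format below? Note that the Genes column stays the same, and each N column sits right next to its corresponding T column.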
Genes N1 T1 N2 T2 N3 \
1 100130426 0 0 0.2262 0 0
2 100133144 6.0377 4.0315 15.9742 1.4705 14.5776
3 100134869 3.9512 10.9554 12.3047 9.111 4.8288
4 10357 197.2475 128.3177 78.7874 129.6157 113.7614
5 10431 1615.9822 659.423 769.866 835.5713 863.5845 and so on...
I have 25 columns of data, so I want the output columns in the following order:
['Genes','N1','T1','N2','T2','N3','T3','N4','T4',...,'N23','T23','N24','T24','N25','T25']
Answer 0 (score: 4)
You can do it like this:
merged = df1.merge(df2).set_index('Genes')
merged = merged[sorted(merged.columns,key=lambda x: int(x[1:]))].reset_index()
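This sorts every column except Genes by the number that follows the letter. Because Python's sort is stable, each N column stays ahead of the T column with the same number, since df1's columns come first in the merged frame: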
Genes N1 T1 N2 T2 N3 T3 \
0 100130426 0.0000 0.0000 0.0000 0.0000 0.2262 0.0000
1 100133144 6.0377 4.0315 4.3819 1.4705 15.9742 0.0000
2 100134869 3.9512 10.9554 2.3768 9.1110 12.3047 7.5432
3 10357 197.2475 128.3177 87.8119 129.6157 78.7874 144.8024
4 10431 1615.9822 659.4230 1645.1704 835.5713 769.8660 900.8730
N4 T4 N5 T5 T6
0 0.0000 0.0000 0.0000 0.0000 0.0000
1 4.5751 8.2678 14.5776 5.3467 3.1702
2 5.6267 9.0772 4.8288 8.2126 5.9363
3 328.9158 108.5800 113.7614 107.5162 153.8304
4 722.7625 878.8159 863.5845 433.5901 579.3967
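The question also asks for a CSV file, which the snippet above stops short of. Below is a minimal end-to-end sketch, assuming two small frames built from a subset of the values in the question. The helper `column_key` and the file name `merged.csv` are made up for illustration. The sort key here includes the letter as well as the number, so the N-before-T order does not depend on the order of columns in the merged frame.

```python
import pandas as pd

# Small stand-in frames using a subset of the values from the question.
df1 = pd.DataFrame({
    "Genes": [100130426, 10357],
    "N1": [0.0, 197.2475],
    "N2": [0.0, 87.8119],
})
df2 = pd.DataFrame({
    "Genes": [100130426, 10357],
    "T1": [0.0, 128.3177],
    "T2": [0.0, 129.6157],
})

# Join on the shared Genes column and move it to the index,
# so only the N/T columns take part in the sort.
merged = df1.merge(df2, on="Genes").set_index("Genes")


def column_key(name):
    """Hypothetical helper: order by the trailing number, then by the letter."""
    return (int(name[1:]), name[0])


# Reorder the columns, then restore Genes as the first column.
merged = merged[sorted(merged.columns, key=column_key)].reset_index()

# Write the result; index=False leaves out the 0..n-1 row labels.
out_path = "merged.csv"  # assumed output path
merged.to_csv(out_path, index=False)
```

Unlike the one-line lambda, this key does not rely on sort stability, at the cost of assuming every non-Genes column is a single letter followed by digits.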