Sort a pandas MultiIndex DataFrame by row

Time: 2019-08-02 10:05:06

Tags: python-3.x pandas

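I am trying to sort the columns of a matrix by the values in one specific row, where that row is identified by a MultiIndex label. The sort_values function does not seem to work: the tuple that defines the MultiIndex label does not seem to be found. The pandas version is 0.23.4.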

tpm = pd.read_csv(parameters.tpm, sep=',', index_col=[0, 1, 2, 3, 4, 5], header=[0, 1], low_memory=False)

print(tpm[tpm.index.get_level_values(1) == 'RBM47'].index.values.tolist()[0])

tpm.sort_values(by=list(tpm[tpm.index.get_level_values(1) == 'RBM47'].index.values.tolist()[0]), axis=1, ascending=True, inplace=True, kind='quicksort', na_position='last')
  

[1 rows x 188 columns]
('ENSG00000163694', 'RBM47', 'protein_coding', 'chr4', 40423266, 40630874)

Traceback (most recent call last):
  File "/home/jean-philippe.villemin/code/RNA-SEQ/src/reordermatrice.py", line 67, in <module>
    tpm.sort_values(by=list(tpm[tpm.index.get_level_values(1) =='RBM47'].index.values.tolist()[0]), axis=1 )
  File "/home/jean-philippe.villemin/bin/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py", line 4414, in sort_values
    stacklevel=stacklevel)
  File "/home/jean-philippe.villemin/bin/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 1401, in _get_label_or_level_values
    multi_message=multi_message))
ValueError: The index label 'ENSG00000163694' is not unique.
For a multi-index, the label must be a tuple with elements corresponding to each level
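
A possible explanation, offered as a sketch rather than a confirmed fix: `list(...)` turns the six-element label tuple into a list of six separate labels, so `sort_values` looks up `'ENSG00000163694'` on its own instead of the full row label. That matches the error message. Passing the tuple itself as `by` should make pandas treat it as a single MultiIndex label. The toy frame below is hypothetical (two index levels instead of six, invented values and level names) and only illustrates the call. It was written against a recent pandas, so behaviour on 0.23.4 is an assumption.

```python
import pandas as pd

# Hypothetical toy data standing in for the TPM matrix:
# 2-level row index (gene id, symbol) and 2-level columns (condition, sample).
index = pd.MultiIndex.from_tuples(
    [("ENSG00000163694", "RBM47"), ("ENSG00000000001", "GENE_A")],
    names=["gene_id", "symbol"],
)
columns = pd.MultiIndex.from_tuples(
    [("ctrl", "s1"), ("ctrl", "s2"), ("treated", "s3")],
    names=["condition", "sample"],
)
tpm = pd.DataFrame(
    [[3.0, 1.0, 2.0], [10.0, 20.0, 30.0]], index=index, columns=columns
)

# Take the full row label as ONE tuple. Wrapping it in list(...) would split
# it into separate labels, which is what the traceback complains about.
label = tpm.index[tpm.index.get_level_values("symbol") == "RBM47"][0]

# axis=1 reorders the columns by the values found in the row named by `label`.
sorted_tpm = tpm.sort_values(by=label, axis=1, ascending=True, na_position="last")

print(sorted_tpm)
```

If the full label tuple occurs more than once in the row index, the lookup would presumably still be ambiguous and raise a similar error, so the row has to be uniquely identified.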

0 answers:

No answers yet