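I am trying to sort the columns of a matrix by the values in one specific row, where that row is identified by a multi-index. The sort_values function does not seem to work, because the tuple that defines the multi-index entry does not seem to be recognized. The pandas version is 0.23.4.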
import pandas as pd

tpm = pd.read_csv(parameters.tpm, sep=',', index_col=[0, 1, 2, 3, 4, 5], header=[0, 1], low_memory=False)
print(tpm[tpm.index.get_level_values(1) == 'RBM47'].index.values.tolist()[0])
# This is the call that raises the ValueError in the traceback further down.
tpm.sort_values(by=list(tpm[tpm.index.get_level_values(1) == 'RBM47'].index.values.tolist()[0]), axis=1, ascending=True, inplace=True, kind='quicksort', na_position='last')
[1 rows x 188 columns]
('ENSG00000163694', 'RBM47', 'protein_coding', 'chr4', 40423266, 40630874)
Traceback (most recent call last):
File "/home/jean-philippe.villemin/code/RNA-SEQ/src/reordermatrice.py", line 67, in <module>
tpm.sort_values(by=list(tpm[tpm.index.get_level_values(1) =='RBM47'].index.values.tolist()[0]), axis=1 )
File "/home/jean-philippe.villemin/bin/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py", line 4414, in sort_values
stacklevel=stacklevel)
File "/home/jean-philippe.villemin/bin/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 1401, in _get_label_or_level_values
multi_message=multi_message))
ValueError: The index label 'ENSG00000163694' is not unique.
For a multi-index, the label must be a tuple with elements corresponding to each level
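The error message suggests that `by` received the individual elements of the tuple rather than the tuple itself: wrapping the tuple in `list(...)` turns it into six separate labels, and the first one ('ENSG00000163694') is looked up on its own. Below is a minimal sketch of passing the whole tuple as a single label instead. The toy frame, gene IDs other than RBM47's, and sample names are made up for illustration, and only two index levels are used instead of six. It was not checked against pandas 0.23.4 specifically, so treat it as an assumption that the behaviour matches on that version.

```python
import pandas as pd

# Toy stand-in for the real matrix (hypothetical data, two index levels only).
index = pd.MultiIndex.from_tuples(
    [("ENSG00000000001", "GENE_A"), ("ENSG00000163694", "RBM47")],
    names=["gene_id", "symbol"],
)
columns = pd.MultiIndex.from_tuples(
    [("ctrl", "s1"), ("ctrl", "s2"), ("treat", "s3")]
)
df = pd.DataFrame([[1.0, 2.0, 3.0], [30.0, 10.0, 20.0]], index=index, columns=columns)

# Full index tuple of the RBM47 row, kept as a tuple.
key = df[df.index.get_level_values(1) == "RBM47"].index[0]

# Pass the tuple as ONE label (a one-element list), not list(key),
# which would split it into separate labels.
sorted_df = df.sort_values(by=[key], axis=1, ascending=True)

print(sorted_df)
```

If `sort_values` still rejects the tuple on an older version, one alternative would be to sort the row itself and reorder the columns with the resulting order, for example `df.reindex(columns=df.loc[key].sort_values().index)`.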