Python: Identify the column with the maximum value

Time: 2014-05-06 02:20:35

Tags: python tsv

Suppose I have n rows of five-column data:

0.0374 0.1311 0.1502 0.5761 0.1052
0.0117 0.0301 0.1748 0.5980 0.1854
0.1261 0.7332 0.1182 0.0156 0.0069

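For each row, I would like to identify the number of the column that contains the maximum value. For example, in the first row of my sample data, column 3 (zero-based index) holds the max() value; in the second row, column 3 again holds the maximum; and in the third row, column 1 holds the maximum. I could write an inefficient method to find the column with the maximum, but is there an elegant solution to this problem? I would welcome any suggestions.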

1 answer:

Answer 0: (score: 1)

Use the following code:

mytsv.tsv

0.0374 0.1311 0.1502 0.5761 0.1052
0.0117 0.0301 0.1748 0.5980 0.1854
0.1261 0.7332 0.1182 0.0156 0.0069

Code:

>>> with open('mytsv.tsv') as contents:
...     for linenum, line in enumerate(contents, 1):
...         # Split into fields and compare them as numbers, not as strings
...         values = [float(field) for field in line.split()]
...         print('The maximum in line %d is in column %d' % (linenum, values.index(max(values))))
... 
The maximum in line 1 is in column 3
The maximum in line 2 is in column 3
The maximum in line 3 is in column 1
>>> 

It isn't exactly elegant, but it is reasonably Pythonic. If you would like me to try to condense it further, I can.

Here it is as a one-liner (note that enumerate starts counting lines at 0 by default):

['The maximum in line %d is in column %d' % (linenum, line.split().index(max(line.split(), key=float))) for linenum, line in enumerate(open('mytsv.tsv'))]

It can be used like this:

>>> for k in ['The maximum in line %d is in column %d' % (linenum, line.split().index(max(line.split(), key=float))) for linenum, line in enumerate(open('mytsv.tsv'))]:
...     print(k)
... 
The maximum in line 0 is in column 3
The maximum in line 1 is in column 3
The maximum in line 2 is in column 1
>>>
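
If the column indices are needed as data rather than as printed messages, the same idea can be wrapped in a small function. This is only a sketch: the name argmax_columns is hypothetical, and it assumes whitespace-separated numeric fields like the sample data, skipping blank lines. Each field is converted to float before comparison, so a value such as 10 is ranked above 9.5, which plain string comparison would get wrong.

```python
def argmax_columns(lines):
    """Return the zero-based index of the largest value in each non-empty row."""
    result = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values = [float(field) for field in fields]
        # Pick the index whose value is largest; on ties the first index wins
        result.append(max(range(len(values)), key=values.__getitem__))
    return result


# Sample rows copied from the question
rows = [
    "0.0374 0.1311 0.1502 0.5761 0.1052",
    "0.0117 0.0301 0.1748 0.5980 0.1854",
    "0.1261 0.7332 0.1182 0.0156 0.0069",
]
print(argmax_columns(rows))  # [3, 3, 1]
```

Because the function accepts any iterable of lines, an open file object (for example the mytsv.tsv file above) can be passed to it directly.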