Suppose I have n rows of five-column data:
0.0374 0.1311 0.1502 0.5761 0.1052
0.0117 0.0301 0.1748 0.5980 0.1854
0.1261 0.7332 0.1182 0.0156 0.0069
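For each row, I would like to identify the number of the column that contains the maximum value. For example, in the first row of my sample data, column 3 (zero-based index) has the max() value; for the second row, column 3 again has the maximum; and for the third row, column 1 has the maximum. I could write an inefficient method to find the column with the maximum value, but is there an elegant solution to this problem? I welcome any suggestions.
Answer 0 (score: 1)
Use the following code: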
mytsv.tsv:
0.0374 0.1311 0.1502 0.5761 0.1052
0.0117 0.0301 0.1748 0.5980 0.1854
0.1261 0.7332 0.1182 0.0156 0.0069
Code:
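>>> with open('mytsv.tsv') as contents:
...     for linenum, line in enumerate(contents, 1):
...         # Convert to float so the comparison is numeric, not lexicographic
...         values = [float(v) for v in line.split()]
...         print('The maximum in line %d is in column %d' % (linenum, values.index(max(values))))
...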
The maximum in line 1 is in column 3
The maximum in line 2 is in column 3
The maximum in line 3 is in column 1
>>>
It's not exactly elegant, but it is relatively Pythonic. If you'd like me to try to shorten it further, I can.
Here it is as a one-liner:
['The maximum in line %d is in column %d' % (linenum, line.split().index(max(line.split(), key=float))) for linenum, line in enumerate(open('mytsv.tsv'))]
It can be used like this:
>>> for k in ['The maximum in line %d is in column %d' % (linenum, line.split().index(max(line.split(), key=float))) for linenum, line in enumerate(open('mytsv.tsv'))]:
...     print(k)
...
The maximum in line 0 is in column 3
The maximum in line 1 is in column 3
The maximum in line 2 is in column 1
>>>
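If you need the column indices themselves rather than printed messages, the same idea can be wrapped in a small function. This is only a minimal sketch, assuming whitespace-separated numeric fields; the name `argmax_columns` is made up for illustration. It converts each field to float before comparing, because comparing the raw strings is lexicographic (for example, '9.5' sorts above '10').

```python
def argmax_columns(lines):
    """Return the zero-based index of the largest value in each non-empty row."""
    result = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines, such as a trailing newline at end of file
        values = [float(field) for field in fields]  # compare as numbers, not strings
        # Pick the index whose value is largest; ties resolve to the first column
        result.append(max(range(len(values)), key=values.__getitem__))
    return result


sample = [
    "0.0374 0.1311 0.1502 0.5761 0.1052",
    "0.0117 0.0301 0.1748 0.5980 0.1854",
    "0.1261 0.7332 0.1182 0.0156 0.0069",
]
print(argmax_columns(sample))  # [3, 3, 1]
```

Since iterating over an open file yields its lines, an open file object for mytsv.tsv can be passed in place of the sample list.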