Question

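I have a numpy array (a matrix in this case) with at least 100 rows and 10 columns. Some of the columns contain numeric values, and I want to find the maximum and minimum of those columns.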

Here is an example of one of the columns:

Is there any way in Python / Numpy to compute the maximum and minimum of a specific column?

EDIT

Here is the actual array I am trying to work with:

array([['"13316"', '26', '" Private"', '152855', '" HS-grad"', '9',
    '" Never-married"', '" Exec-managerial"', '" Own-child"',
    '" Other"', '" Female"', '0', '0', '40', '" Mexico"', '" <=50K"'],
   ['"28750"', '50', '" Self-emp-not-inc"', '99894', '" 5th-6th"', '3',
    '" Never-married"', '" Tech-support"', '" Not-in-family"',
    '" Asian-Pac-Islander"', '" Female"', '0', '0', '15',
    '" United-States"', '" <=50K"'],
   ['"30619"', '35', '" Private"', '412379', '" HS-grad"', '9',
    '" Never-married"', '" Other-service"', '" Not-in-family"',
    '" White"', '" Female"', '0', '0', '40', '" United-States"',
    '" <=50K"'],

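Some of the attributes are numeric and some are not. I loaded the data from a file using np.genfromtxt with dtype set to None. I tried using numpy.amax and amin on those specific columns, but to no avail. I realize this is probably because the values were loaded as strings, and that I may have to cast them to int first. I tried that as well, and it also seemed to fail. Any ideas?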

Answer 1

If I understand your question correctly, here is an ugly but working solution:

import numpy as np

# data: first two rows of your example
A = np.array([['"13316"', '26', '" Private"', '152855', '" HS-grad"', '9',
               '" Never-married"', '" Exec-managerial"', '" Own-child"',
               '" Other"', '" Female"', '0', '0', '40', '" Mexico"',
               '" <=50K"'],
              ['"28750"', '50', '" Self-emp-not-inc"', '99894', '" 5th-6th"', '3',
               '" Never-married"', '" Tech-support"', '" Not-in-family"',
               '" Asian-Pac-Islander"', '" Female"', '0', '0', '15',
               '" United-States"', '" <=50K"']])

# extract an array containing only the numeric columns:
numbers_columns = [0, 1, 3, 5, 11, 12, 13]
B = A[:, numbers_columns]
# remove the extra double quotes from each element of B:
C = [[b.replace('"', '') for b in line] for line in B]
# build a numpy array and convert to int
# (np.int was removed in NumPy 1.24; the builtin int does the same job):
D = np.array(C).astype(int)
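
As a possible alternative to the list comprehension above, the quote stripping can be done in one vectorized call with np.char.strip. This is only a sketch on a shortened, four-column version of the sample rows; the column positions in `cols` apply to this shortened array and are not the ones from the full data set.

```python
import numpy as np

# Shortened version of the sample rows (only four columns kept for brevity).
A = np.array([['"13316"', '26', '" Private"', '152855'],
              ['"28750"', '50', '" Self-emp-not-inc"', '99894']])

# Positions of the numeric columns in this shortened array (an assumption
# made for the illustration, not the indices of the full data set).
cols = [0, 1, 3]

# np.char.strip removes the given characters from both ends of every element,
# so the surrounding double quotes disappear without a Python-level loop.
D = np.char.strip(A[:, cols], '"').astype(int)

print(D)
```

The result is an integer array of the same shape as `A[:, cols]`, which can be passed to np.min and np.max in the usual way.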

Now you have a numpy array containing only numbers. The minimum and maximum of a column can then be found simply:

# i is the index of the column within D
np.min(D[:, i])
np.max(D[:, i])
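
If the extremes of every numeric column are needed at once, passing axis=0 avoids looping over `i`. A minimal sketch, assuming `D` holds the integer values of the two sample rows (typed in by hand here so the snippet runs on its own):

```python
import numpy as np

# Numeric values of the two sample rows (columns 0, 1, 3, 5, 11, 12, 13).
D = np.array([[13316, 26, 152855, 9, 0, 0, 40],
              [28750, 50, 99894, 3, 0, 0, 15]])

# Min and max of a single column, selected by its index within D.
i = 1
col_min_i = D[:, i].min()
col_max_i = D[:, i].max()

# Min and max of every column at once: axis=0 reduces over the rows.
all_min = D.min(axis=0)
all_max = D.max(axis=0)

print(col_min_i, col_max_i)
print(all_min)
print(all_max)
```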

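PS: I'm afraid this solution is quite inelegant, but I can't think of anything better. I suggest you improve the way you read the data so that this problem doesn't arise in the first place.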
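
One way to read the data more cleanly might be Python's csv module, which strips the enclosing double quotes while parsing. The sketch below rests on assumptions: the file layout is guessed from the array shown in the question (comma-separated, string fields quoted), `raw` is a made-up stand-in for the real file, and only the first six fields are included.

```python
import csv
import io

import numpy as np

# Hypothetical file contents; the real file's layout is an assumption.
raw = '''"13316",26," Private",152855," HS-grad",9
"28750",50," Self-emp-not-inc",99894," 5th-6th",3
"30619",35," Private",412379," HS-grad",9
'''

# Positions of the numeric fields in this shortened sample.
numeric_cols = [0, 1, 3, 5]

# csv.reader removes the enclosing double quotes from quoted fields.
rows = list(csv.reader(io.StringIO(raw)))

# Keep only the numeric fields and convert them to int while building the array.
data = np.array([[int(row[i]) for i in numeric_cols] for row in rows])

col_min = data.min(axis=0)
col_max = data.max(axis=0)

print(col_min)
print(col_max)
```

With a real file, `io.StringIO(raw)` would be replaced by an open file handle; the non-numeric columns can be kept separately as plain strings.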
