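Is there another way to implement the scipy.stats.mode function in NumPy, to get the most common values of an ndarray along an axis (without importing any other modules)? For example: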
import numpy as np
from scipy.stats import mode
a = np.array([[[ 0,  1,  2,  3,  4],
               [ 5,  6,  7,  8,  9],
               [10, 11, 12, 13, 14],
               [15, 16, 17, 18, 19]],
              [[ 0,  1,  2,  3,  4],
               [ 5,  6,  7,  8,  9],
               [10, 11, 12, 13, 14],
               [15, 16, 17, 18, 19]],
              [[40, 40, 42, 43, 44],
               [45, 46, 47, 48, 49],
               [50, 51, 52, 53, 54],
               [55, 56, 57, 58, 59]]])
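result = mode(a, axis=0)  # pass the array defined above; avoid rebinding the name `mode`
most_common = result[0]   # first element holds the modal values
print(most_common)
>>> [[ 0,  1,  2,  3,  4],
     [ 5,  6,  7,  8,  9],
     [10, 11, 12, 13, 14],
     [15, 16, 17, 18, 19]]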
Answer 0 (score: 14)
You can define an equivalent of the scipy.stats.mode function with the following code, which depends only on NumPy:
import numpy as np

def mode(a, axis=0):
    scores = np.unique(np.ravel(a))  # get ALL unique values
    testshape = list(a.shape)
    testshape[axis] = 1  # the reduced axis is kept with length 1
    oldmostfreq = np.zeros(testshape)
    oldcounts = np.zeros(testshape)
    for score in scores:
        template = (a == score)
        counts = np.expand_dims(np.sum(template, axis), axis)
        # keep the new score only where it beats the best count so far
        mostfrequent = np.where(counts > oldcounts, score, oldmostfreq)
        oldcounts = np.maximum(counts, oldcounts)
        oldmostfreq = mostfrequent
    return mostfrequent, oldcounts
Source: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L609
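As a rough sketch of the same counting idea in a more compact form: the snippet below counts every candidate value along the axis and picks the one with the highest count. The name `mode_along_axis` is hypothetical (it is not part of NumPy or SciPy), and the sketch assumes a non-empty array with few enough distinct values that one count array per value fits in memory.

```python
import numpy as np

def mode_along_axis(a, axis=0):
    """Hypothetical helper: most frequent value along `axis`.

    Ties resolve to the smallest value. Assumes `a` is non-empty.
    """
    a = np.asarray(a)
    candidates = np.unique(a)  # sorted unique values of the whole array
    # counts[k] is how often candidates[k] occurs along `axis`
    counts = np.stack([(a == c).sum(axis=axis) for c in candidates])
    best = counts.argmax(axis=0)  # index of the first maximum per position
    return candidates[best], counts.max(axis=0)

# Rebuild the array from the question: two identical blocks plus a shifted one
block = np.arange(20).reshape(4, 5)
a = np.stack([block, block, block + 40])
a[2, 0, 1] = 40  # the question's third block starts with 40, 40

values, counts = mode_along_axis(a, axis=0)
print(values)
print(counts)
```

Unlike the function above, this version drops the reduced axis and keeps the input dtype, so for the question's array `values` has shape (4, 5) and integer entries.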
Answer 1 (score: 1)
If you know there are not many distinct values (relative to the size of the input "itemArray"), something like this may work:
import numpy as np

uniqueValues = np.unique(itemArray).tolist()
uniqueCounts = [len(np.nonzero(itemArray == uv)[0])
                for uv in uniqueValues]
modeIdx = uniqueCounts.index(max(uniqueCounts))
# modeIdx is a position in uniqueValues, not in itemArray
mode = uniqueValues[modeIdx]
# All counts as a map
valueToCountMap = dict(zip(uniqueValues, uniqueCounts))
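The counting loop can probably be shortened with the `return_counts` option of `np.unique`, which returns the counts alongside the sorted unique values. This is only a sketch: `item_array` holds made-up sample data, and like the snippet above it computes the mode of the whole array, not along an axis.

```python
import numpy as np

item_array = np.array([3, 1, 3, 2, 1, 3])  # hypothetical sample data

# One call gives the sorted unique values and how often each occurs
unique_values, unique_counts = np.unique(item_array, return_counts=True)

# argmax returns the first maximum, so ties resolve to the smallest value
mode_value = unique_values[np.argmax(unique_counts)]

# All counts as a map
value_to_count = dict(zip(unique_values.tolist(), unique_counts.tolist()))

print(mode_value)
print(value_to_count)
```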