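I have been having trouble converting MATLAB code to Python. I have MATLAB code that I wrote last year (it works), and I am now trying to convert the functions to Python. Five of them work and four do not. I am really stuck and would be grateful for any help. This one is about estimating naive Bayes probabilities. Here is the function in MATLAB: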
function [ p_x_y ] = estimate_p_x_y_NB(Xtrain,ytrain,a,b )
% Function calculates probability distribution p(x|y), assuming that x is binary
% and its elements are independent from each other
% Xtrain - training dataset NxD
% ytrain - training dataset class labels 1xN
% p_x_y - binomial distribution estimators - element at position(m,d)
% represents estimator p(x_d=1|y=m) MxD
% N - number of elements in training dataset
    D = size(Xtrain,2);
    M = length(unique(ytrain));
    p_x_y = zeros(M,D);
    for i=1:M
        for j=1:D
            numerator = sum((ytrain==i).*((Xtrain(:,j)==1))')+a-1;
            denominator = sum(ytrain==i)+a+b-2;
            p_x_y(i,j) = numerator/denominator;
        end
    end
end
Here is my translation to Python:
def estimate_p_x_y_nb(Xtrain, ytrain, a, b):
    """
    :param Xtrain: training data NxD
    :param ytrain: class labels for training data 1xN
    :param a: parameter a of Beta distribution
    :param b: parameter b of Beta distribution
    :return: Function calculates probability p(x|y) assuming that x takes binary values and elements
    x are independent from each other. Function returns matrix p_x_y that has size MxD.
    """
    D = Xtrain.shape[1]
    M = len(np.unique(ytrain))
    p_x_y = np.zeros((M, D))
    for i in range (M):
        for j in range(D):
            up = np.sum((ytrain == i+1).dot((Xtrain[:, j]==1)).conjugate().T) + a - 1
            down = np.sum((ytrain == i+1) + a + b -2)
            p_x_y[i,j] = up/down
    return p_x_y
Traceback:
p_x_y[i,j] = up/down
ValueError: setting an array element with a sequence.
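If you can see anything wrong with this function, I would be very glad if you pointed it out. Also, in the up variable I used .dot instead of just *, because with * I got an error about mismatched dimensions, whereas with .dot it seems to work. Thanks.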
Answer 0 (score: 1)
I think the problem is in the statement that assigns the denominator. The parentheses are misplaced:
down = np.sum((ytrain == i + 1)+ a + b -2)
should be
down = np.sum((ytrain == i+1)) + a + b -2
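To see why the position of the parenthesis matters, here is a small sketch. The toy labels and the values of a and b are made up for illustration and do not come from the question. With the constants inside np.sum, a + b - 2 is added to every element before summing, so it is counted once per training sample rather than once in total.

```python
import numpy as np

# Hypothetical toy labels, chosen only for illustration.
ytrain = np.array([1, 2, 1, 1])
a, b = 2, 3
i = 0  # first class, i.e. label i + 1 == 1

mask = (ytrain == i + 1)  # boolean mask of samples in the class

# Constants inside np.sum: a + b - 2 is added to every element, then summed.
down_inside = np.sum(mask + a + b - 2)

# Constants outside np.sum: added once, matching the MATLAB denominator.
down_outside = np.sum(mask) + a + b - 2

print(down_inside, down_outside)
```

The two values differ by (N - 1) * (a + b - 2), where N is the number of samples, so they only coincide when a + b == 2.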
Also, try changing
up = np.sum((ytrain == i+1).dot((Xtrain[:, j]==1)).conjugate().T) + a - 1
to
up = np.sum((ytrain == i+1) * (Xtrain[:, j]==1)) + a - 1
I hope that works. I don't see any other problems with your code.
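Putting both changes together, the whole function might look like the sketch below. It assumes that Xtrain is a dense NumPy array (not a sparse or np.matrix object) and that the class labels are the integers 1..M, as in the MATLAB code. The np.asarray and ravel calls are my additions so that a 1xN row of labels and a plain length-N vector behave the same way.

```python
import numpy as np

def estimate_p_x_y_nb(Xtrain, ytrain, a, b):
    """Estimate p(x_d = 1 | y = m) for binary features; returns an MxD array."""
    # Assumption: dense inputs; flatten the labels to a length-N vector.
    Xtrain = np.asarray(Xtrain)
    ytrain = np.asarray(ytrain).ravel()
    D = Xtrain.shape[1]
    M = len(np.unique(ytrain))
    p_x_y = np.zeros((M, D))
    for i in range(M):
        # Assumption: labels are 1..M, so class index i corresponds to label i + 1.
        in_class = (ytrain == i + 1)
        for j in range(D):
            # Samples of class i + 1 whose feature j equals 1, plus the prior term.
            up = np.sum(in_class * (Xtrain[:, j] == 1)) + a - 1
            # Class size, with the prior constants added once, outside the sum.
            down = np.sum(in_class) + a + b - 2
            p_x_y[i, j] = up / down
    return p_x_y
```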
After making these changes, I used the values
Xtrain = np.array([[1,2,3,4,5], [1,2,3,4,5]])
ytrain = np.array([1,2])
a = 1
b = 1
which gives the output
array([[ 1., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0.]])
in both MATLAB and Python. You can use these values to check whether the results are as expected.
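As a further cross-check, the same estimate can be written without the two loops. This is only a sketch of one possible alternative, not something from the question: the function name estimate_p_x_y_nb_vectorized is made up here, and it again assumes a dense Xtrain and labels 1..M.

```python
import numpy as np

def estimate_p_x_y_nb_vectorized(Xtrain, ytrain, a, b):
    """Loop-free sketch of the same estimator; returns an MxD array."""
    Xtrain = np.asarray(Xtrain)          # assumption: dense input
    ytrain = np.asarray(ytrain).ravel()  # accept 1xN or length-N labels
    M = len(np.unique(ytrain))
    # Class membership mask of shape (M, N); labels are assumed to be 1..M.
    in_class = (ytrain[None, :] == np.arange(1, M + 1)[:, None])
    # counts[m, d] = number of samples in class m + 1 with feature d equal to 1.
    counts = in_class.astype(float) @ (Xtrain == 1).astype(float)
    # Number of samples per class, kept as a column for broadcasting over D.
    class_sizes = in_class.sum(axis=1, keepdims=True)
    return (counts + a - 1) / (class_sizes + a + b - 2)

# The same values as used in the check above.
Xtrain = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
ytrain = np.array([1, 2])
result = estimate_p_x_y_nb_vectorized(Xtrain, ytrain, 1, 1)
print(result)
```

If the loop version and this version disagree on your real data, that usually points at a shape or label-encoding assumption that does not hold.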