As a homework problem, I was asked to write an Octave function that simulates 1000 experiments of finding a random variable X with alphabet {0,1,2,3} and pmf:
Px(0)= 1/8
Px(1)= 1/4
Px(2)= 1/2
Px(3)= 1/8
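by asking a sequence of binary ("yes" or "no") questions.
I have determined that the optimal sequence of binary questions for finding the value of X is simply to ask "Is X = p?", where p runs over the possible values in order of decreasing probability.
So the optimal order is:
Is X = 2?
If not:
Is X = 1?
If not:
Is X = 0?
If not, then X = 3
Here is the function I wrote: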
function x = guessing_experiment(probabilities, n)
% generates n simulations of finding a random number in an alphabet by asking binary questions,
% where 'probabilities' is a list of the probabilities per number in the order the questions will be asked
num_Qs = zeros(1,n); % allocate array of size n for number of questions asked per experiment
[num_col, alphabet_size] = size(probabilities); % get size of alphabet
for i = 1:n % generate n experiments
Qs = 0; % number of questions asked in this experiment
for j = 1:alphabet_size - 1 % iterate through questions
question = rand; % generate random number in range [0, 1]
Qs++; % increment number of questions asked
if (question <= probabilities(j)) % if question produces a "yes" answer
break;
endif
endfor
num_Qs(i) = Qs; % store number of questions asked for this experiment
endfor
x = mean(num_Qs); % calculate mean number of questions asked over the n experiments
end
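I call it as guessing_experiment([1/2, 1/4, 1/8, 1/8], 1000)
The array holds the probability that each question produces a "yes" answer, in the order the questions are asked, and n is the number of experiments.
Asking these questions should give an average of 1.75 questions, but my program always produces an average of ~1.87. Where is the error in my script?
I assume it has something to do with generating a new random number to simulate each of the 3 questions asked.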
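Answer 0 (score: 0)
I deleted my previous, incorrect answer, which claimed that your script was right and your calculation was wrong. After thinking about it again, your calculation is correct. I tried it myself with the following MATLAB script: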
% probabilities for each number
p = [1/8,1/4,1/2,1/8];
% sort them from higher to lower
p = sort(p,'descend');
% number of questions per probability
nq = 1:length(p)-1;
% the last question can distinguish between two variables
nq(end+1) = nq(end);
% number of trials
n = 100000;
% random sample number of questions
q = randsample(nq,n,true,p);
% mean number of questions
avgQ = mean(q)
and obtained an average of 1.75, just as you calculated. I will look at your code again to find the error.
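As a cross-check that does not depend on sampling, the same expected value can be computed exactly from the question counts and the sorted probabilities. This is only a small sketch; the name expectedQ is introduced here for illustration and does not appear in the scripts above.

```octave
% Exact expected number of questions (no random sampling involved).
p  = sort([1/8, 1/4, 1/2, 1/8], 'descend');  % probabilities in asking order
nq = [1:numel(p)-1, numel(p)-1];             % questions needed per outcome;
                                             % the last question settles two outcomes
expectedQ = sum(nq .* p)                     % probability-weighted question count
```

If the sampled mean from randsample drifts far from this value, the sampling setup is the first thing to suspect.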
Edit:
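The problem with your script is that you ignore conditional probability, i.e. when you ask a question about the variable you disregard what you have already learned. For example, when you ask the third question, the probability that the value is 0 is not p=1/8 but p=1/2, because you already know it is neither 1 nor 2.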
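To see where the ~1.87 comes from, it may help to work out the distribution of the number of questions Q that the original loop produces when each question is an independent draw compared against the unconditional probability. This is a short derivation under that reading of the loop, with Q introduced here as notation:

```latex
\begin{align*}
P(Q=1) &= \tfrac{1}{2}, \\
P(Q=2) &= \left(1-\tfrac{1}{2}\right)\cdot\tfrac{1}{4} = \tfrac{1}{8}, \\
P(Q=3) &= 1 - \tfrac{1}{2} - \tfrac{1}{8} = \tfrac{3}{8}, \\
E[Q]   &= 1\cdot\tfrac{1}{2} + 2\cdot\tfrac{1}{8} + 3\cdot\tfrac{3}{8} = \tfrac{15}{8} = 1.875 .
\end{align*}
```

The second question says "yes" too rarely (1/4 instead of 1/2), which pushes probability mass onto Q=3 and raises the mean from 1.75 to 1.875.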
The fix is to divide each probability by the total probability of the outcomes that are still possible, probabilities(j)/sum(probabilities(j:end)):
n = 10000;
p = [1/8,1/4,1/2,1/8];
% sort them from higher to lower
probabilities = sort(p,'descend');
probabilities(end-1) = probabilities(end-1) + probabilities(end);
probabilities(end) = [];
alphabet_size = numel(probabilities);
num_Qs = zeros(1,n); % allocate array of size n for number of questions asked per experiment
for i = 1:n % generate n experiments
Qs = 0; % number of questions asked in this experiment
for j = 1:alphabet_size % iterate through questions
question = rand; % generate random number in range [0, 1]
Qs = Qs + 1; % increment number of questions asked
if question < probabilities(j)/sum(probabilities(j:end)) % if question produces a "yes" answer
break;
end
end
num_Qs(i) = Qs; % store number of questions asked for this experiment
end
x = mean(num_Qs)
This gives x ≈ 1.75.
The vector of conditional probabilities for this scenario is:
p = [1/8,1/4,1/2,1/8];
p = sort(p,'descend');
cond_p = p./cumsum(p,'reverse')
cond_p =
0.5000 0.5000 0.5000 1.0000
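An alternative that avoids conditional probabilities altogether is to draw X once per experiment and then count how many questions it takes to identify it. The following is a minimal sketch of that idea, not a replacement for the fix above; the names edges, u and idx are introduced here for illustration, and bsxfun is used so the script should run in both Octave and MATLAB.

```octave
% Sketch: one uniform draw per experiment decides the outcome,
% then the number of questions follows from the outcome's position.
p = [1/2, 1/4, 1/8, 1/8];       % probabilities in the order the questions are asked
n = 100000;                     % number of simulated experiments
edges = cumsum(p);              % upper edge of each outcome's slice of [0, 1]
u = rand(n, 1);                 % a single random number per experiment
% position of the drawn outcome in the asking order (1 = asked first)
idx = sum(bsxfun(@gt, u, edges(1:end-1)), 2) + 1;
% the last question separates the final two outcomes, so cap the count
num_Qs = min(idx, numel(p) - 1);
avgQ = mean(num_Qs)             % should be close to 1.75
```

Because the outcome is fixed before any question is asked, every "answer" is consistent with the same X, which is what the per-question rand calls in the original function were missing.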