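I recently discovered a great card game: SET. Briefly, there are 81 cards with four features: symbol (oval, squiggle or diamond), color (red, purple or green), number (one, two or three) and shading (solid, striped or open). The task is to find, among 12 dealt cards, a SET of 3 cards in which each of the four features is either the same on all three cards or different on all three cards (no 2+1 combination).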
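To make the rule concrete, here is a minimal MATLAB sketch of the check for a single triple of cards. It assumes each card is encoded as a row of four integers in 1..3 (an encoding chosen here purely for illustration), and the helper name `isValidSet` is hypothetical.

```matlab
% Each card is a row [symbol color number shading], values coded 1..3
% (illustrative encoding; isValidSet is a hypothetical helper name).
% A triple is a set when no feature takes exactly two distinct values.
isValidSet = @(c) all(arrayfun(@(k) numel(unique(c(:,k))), 1:size(c,2)) ~= 2);

a = [1 1 1 1; 2 2 2 2; 3 3 3 3];   % every feature all different
b = [1 2 3 1; 1 2 3 2; 1 2 3 3];   % three features the same, one all different
c = [1 1 1 1; 1 2 1 1; 3 3 1 1];   % first feature is a 2+1 combination

disp(isValidSet(a))   % 1
disp(isValidSet(b))   % 1
disp(isValidSet(c))   % 0
```

The test `~= 2` covers both allowed cases at once, since three cards can only show one, two or three distinct values of a feature.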
I coded it in MATLAB to find a solution and to estimate the odds of having a set among randomly selected cards.
Here is my code to estimate the odds:
%% initialization
K = 12; % cards to draw
NF = 4; % number of features (usually 3 or 4)
setallcards = unique(nchoosek(repmat(1:3,1,NF),NF),'rows'); % all cards: rows - cards, columns - features
setallcomb = nchoosek(1:K,3); % index of all combinations of K cards by 3
%% test
tic
NIter=1e2; % number of test iterations
setexists = 0; % test results holder
% C = progress('init'); % if you have progress function from FileExchange
for d = 1:NIter
    % C = progress(C,d/NIter);
    % cards for current test
    setdrawncardidx = randi(size(setallcards,1),K,1);
    setdrawncards = setallcards(setdrawncardidx,:);
    % find all sets in current test iteration
    for setcombidx = 1:size(setallcomb,1)
        setcomb = setdrawncards(setallcomb(setcombidx,:),:);
        if all(arrayfun(@(x) numel(unique(setcomb(:,x))), 1:NF)~=2) % test one combination
            setexists = setexists + 1;
            break % to find only the first set
        end
    end
end
fprintf('Set:NoSet = %g:%g = %g:1\n', setexists, NIter-setexists, setexists/(NIter-setexists))
toc
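100-1000 iterations are fast, but be careful with more: a million iterations takes about 15 hours on my home computer. Anyway, with 12 cards and 4 features I get odds of 13:1 that a set exists. This is actually a problem. The instruction book says this number should be 33:1, and it was recently confirmed by Peter Norvig. He provides Python code, but I have not tested it yet.
So can you find the error? Any comments on performance improvement are welcome too.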
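Answer 0 (score: 2)
Here is a vectorized version that can compute 1M hands in about a minute. I get about 28:1 with it, so there may still be something slightly off in how it finds the "all different" sets. My guess is that this is also what your solution has trouble with.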
%# initialization
K = 12; %# cards to draw
NF = 4; %# number of features (this is hard-coded to 4)
nIter = 100000; %# number of iterations
%# each card has four features. This means that a card can be represented
%# by a coordinate in 4D space. A set is a full row, column, etc in 4D
%# space. We can even parallelize the iterations, at least as long as we
%# have RAM (each hand costs 81 bytes)
%# make card space - one dimension per feature, plus one for the iterations
cardSpace = false(3,3,3,3,nIter);
%# To draw cards, we put K trues into each cardSpace. I can't think of a
%# good, fast way to draw exactly K cards that doesn't involve calling
%# unique
for i=1:nIter
    shuffle = randperm(81) + (i-1) * 81;
    cardSpace(shuffle(1:K)) = true;
end
%# to test, all we have to do is check whether there is any row, column,
%# with all 1's
isEqual = squeeze(any(any(any(all(cardSpace,1),2),3),4) | ...
                  any(any(any(all(cardSpace,2),1),3),4) | ...
                  any(any(any(all(cardSpace,3),2),1),4) | ...
                  any(any(any(all(cardSpace,4),2),3),1));
%# to get a set of 3 cards where all symbols are different, we require that
%# no 'sub-volume' is completely empty - there may be something wrong with this
%# but since my test looked ok, I'm not going to investigate on Friday night
isDifferent = squeeze(~any(all(all(all(~cardSpace,1),2),3),4) & ...
                      ~any(all(all(all(~cardSpace,1),2),4),3) & ...
                      ~any(all(all(all(~cardSpace,1),3),4),2) & ...
                      ~any(all(all(all(~cardSpace,4),2),3),1));
isSet = isEqual | isDifferent;
%# find the odds
fprintf('odds are %5.2f:1\n',sum(isSet)/(nIter-sum(isSet)))
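The 4D "card space" layout used in this answer can be explored on a tiny example. The sketch below is only an illustration under stated assumptions: it uses two hands instead of 100000, and it draws the cards with the two-argument form `randperm(n,k)`, which is available in reasonably recent MATLAB releases and returns k distinct values without calling UNIQUE.

```matlab
% Toy version of the card-space layout: 3x3x3x3 cells per hand, one hand
% per slice along the 5th dimension (nIter = 2 here just for illustration).
nIter = 2;
K = 12;
cardSpace = false(3,3,3,3,nIter);
for i = 1:nIter
    % K distinct linear indices in 1..81, shifted into the i-th hand
    idx = randperm(81, K) + (i-1) * 81;
    cardSpace(idx) = true;
end

% Every hand should contain exactly K cards
cardsPerHand = squeeze(sum(sum(sum(sum(cardSpace,1),2),3),4));
disp(cardsPerHand')

% A linear card index maps to one coordinate per feature;
% card 5 maps to the coordinate (2,2,1,1)
[f1, f2, f3, f4] = ind2sub([3 3 3 3], 5);
fprintf('card 5 -> (%d,%d,%d,%d)\n', f1, f2, f3, f4);
```

The offset `(i-1) * 81` works because each hand occupies a contiguous block of 81 linear indices in the 5-D array.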
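Answer 1 (score: 2)
Before looking at your code, I set out to write my own implementation. My first attempt turned out very similar to what you already have :)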
%# some parameters
NUM_ITER = 100000; %# number of simulations to run
DRAW_SZ = 12; %# number of cards we are dealing
SET_SZ = 3; %# number of cards in a set
FEAT_NUM = 4; %# number of features (symbol,color,number,shading)
FEAT_SZ = 3; %# number of values per feature (eg: red/purple/green, ...)
%# cards features
features = {
'oval' 'squiggle' 'diamond' ; %# symbol
'red' 'purple' 'green' ; %# color
'one' 'two' 'three' ; %# number
'solid' 'striped' 'open' %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);
%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];
%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);
%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);
    %# check for valid sets: features are all the same or all different
    for s=1:size(setsInd,1)
        %# set of 3 cards
        set = cardsDrawn(setsInd(s,:),:);
        %# check if set is valid
        count = arrayfun(@(k) numel(unique(set(:,k))), 1:FEAT_NUM);
        isValid = (count==1|count==3);
        %# increment counter
        if isValid
            counterValidSet = counterValidSet + 1;
            break %# break early if found valid set among candidates
        end
    end
end
%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
counterValidSet/(NUM_ITER-counterValidSet))
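After using the Profiler to find the hot spots, some improvement can be made, mainly by breaking out of loops as early as possible. The main bottleneck is the call to the UNIQUE function. The two lines above where we check for a valid set can be rewritten as: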
%# check if set is valid
isValid = true;
for k=1:FEAT_NUM
    count = numel(unique(set(:,k)));
    if count~=1 && count~=3
        isValid = false;
        break %# break early if one of the features doesnt meet conditions
    end
end
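As an aside, since UNIQUE is the bottleneck, one alternative (not the approach taken in this answer) is to avoid it with modular arithmetic. The sketch below assumes feature values are coded 1..3, as in the `cards` matrix above; the helper name `isValidSetMod3` is hypothetical. It also cross-checks the shortcut against the UNIQUE-based test for all 27 possible value triples.

```matlab
% Assumes feature values are coded 1..3. Three such values are all equal
% or all distinct exactly when their sum is a multiple of 3, so a triple
% (3 rows, one column per feature) can be tested without UNIQUE.
isValidSetMod3 = @(c) all(mod(sum(c,1), 3) == 0);   % hypothetical helper

% Exhaustive cross-check on a single feature: all 27 value triples
[A, B, C] = ndgrid(1:3, 1:3, 1:3);
vals = [A(:) B(:) C(:)]';                            % 3 x 27, one triple per column
byMod    = mod(sum(vals,1), 3) == 0;
byUnique = arrayfun(@(j) numel(unique(vals(:,j))) ~= 2, 1:size(vals,2));
assert(isequal(byMod, byUnique));

disp(isValidSetMod3([1 2 3 1; 1 2 1 2; 1 2 2 3]))   % 1
```

Whether this is actually faster than the early-exit loop would need to be confirmed with the Profiler.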
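Unfortunately, the simulation is still slow for larger runs. My next solution is therefore a vectorized version in which, for each iteration, we build a single matrix of all possible 3-card sets from the hand of 12 drawn cards. For all of these candidate sets, we use logical vectors to mark which feature values are present, thus avoiding the calls to UNIQUE/NUMEL (we want each feature to be either all the same or all different across the cards of a set).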
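The logical-vector idea can be seen in isolation on a single candidate triple. The following is a simplified sketch with made-up card values, not the full vectorized code: each feature gets one column of a logical matrix `M`, and row `v` of that column is set when value `v` occurs among the three cards.

```matlab
FEAT_SZ  = 3;                             % values per feature
FEAT_NUM = 4;                             % features per card
triple = [1 2 3 1; 1 2 1 2; 1 2 2 3];     % 3 cards x 4 features (example values)

% One column per feature; M(v,k) is true when value v occurs in feature k
M = false(FEAT_SZ, FEAT_NUM);
colOffset = repmat(FEAT_SZ * (0:FEAT_NUM-1), 3, 1);   % linear-index offset of each column
M(triple + colOffset) = true;

present = sum(M, 1);                      % number of distinct values per feature
isValid = all(present ~= 2);              % a set has no feature with exactly 2 values
disp(present)                             % 1 1 3 3
disp(isValid)                             % 1
```

Counting the `true` entries per column gives the number of distinct values directly, which is what `numel(unique(...))` computed in the earlier version.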
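I admit the code is now less readable and harder to follow (which is why I posted both versions for comparison). The reason is that I tried to optimize it as much as possible, so that each iteration of the loop is fully vectorized. Here is the final code: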
%# some parameters
NUM_ITER = 100000; %# number of simulations to run
DRAW_SZ = 12; %# number of cards we are dealing
SET_SZ = 3; %# number of cards in a set
FEAT_NUM = 4; %# number of features (symbol,color,number,shading)
FEAT_SZ = 3; %# number of values per feature (eg: red/purple/green, ...)
%# cards features
features = {
'oval' 'squiggle' 'diamond' ; %# symbol
'red' 'purple' 'green' ; %# color
'one' 'two' 'three' ; %# number
'solid' 'striped' 'open' %# shading
};
fIdx = arrayfun(@(k) grp2idx(features(k,:)), 1:FEAT_NUM, 'UniformOutput',0);
%# list of all cards. Each card: [symbol,color,number,shading]
[W X Y Z] = ndgrid(fIdx{:});
cards = [W(:) X(:) Y(:) Z(:)];
%# all possible sets: choose 3 from 12
setsInd = nchoosek(1:DRAW_SZ,SET_SZ);
%# optimizations: some calculations taken out of the loop
ss = setsInd(:);
set_sz2 = numel(ss)*FEAT_NUM/SET_SZ;
col = repmat(1:set_sz2,SET_SZ,1);
col = FEAT_SZ.*(col(:)-1);
M = false(FEAT_SZ,set_sz2);
%# progress indication
%#hWait = waitbar(0./NUM_ITER, 'Simulation...');
%# count number of valid sets in random draws of 12 cards
counterValidSet = 0;
for i=1:NUM_ITER
    %# update progress
    %#waitbar(i./NUM_ITER, hWait);
    %# pick 12 cards
    ord = randperm( size(cards,1) );
    cardsDrawn = cards(ord(1:DRAW_SZ),:);
    %# put all possible sets of 3 cards next to each other
    set = reshape(cardsDrawn(ss,:)',[],SET_SZ)';
    set = set(:);
    %# check for valid sets: features are all the same or all different
    M(:) = false; %# if using PARFOR, it will complain about this
    M(set+col) = true;
    isValid = all(reshape(sum(M)~=2,FEAT_NUM,[]));
    %# increment counter if there is at least one valid set in all candidates
    if any(isValid)
        counterValidSet = counterValidSet + 1;
    end
end
%# ratio of found-to-notfound
fprintf('Size=%d, Set=%d, NoSet=%d, Set:NoSet=%g\n', ...
DRAW_SZ, counterValidSet, (NUM_ITER-counterValidSet), ...
counterValidSet/(NUM_ITER-counterValidSet))
%# close progress bar
%#close(hWait)
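If you have the Parallel Computing Toolbox, you can easily replace the plain FOR loop with a parallel PARFOR (you might want to move the initialization of the matrix M back inside the loop: replace M(:) = false; with M = false(FEAT_SZ,set_sz2);)
Here is some sample output from 50000 simulations (PARFOR used with a pool of 2 local instances):
» tic, SET_game2, toc
Size=12, Set=48376, NoSet=1624, Set:NoSet=29.7882
Elapsed time is 5.653933 seconds.
» tic, SET_game2, toc
Size=15, Set=49981, NoSet=19, Set:NoSet=2630.58
Elapsed time is 9.414917 seconds.
The simulation was also run with a million iterations (PARFOR for the 12-card case, non-PARFOR for the 15-card case).
The odds ratios agree with the results reported by Peter Norvig.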
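Answer 2 (score: 1)
I found my mistake. Thanks to Jonas for the hint about RANDPERM.
I used RANDI to randomly draw K cards, but RANDI draws with replacement, so even with only 12 cards there is about a 50% chance of getting a duplicate. When I replaced that line with randperm, I got 33.8:1 with 10000 iterations, which is very close to the number in the instruction book.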
setdrawncardidx = randperm(81);
setdrawncardidx = setdrawncardidx(1:K);
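The "about 50%" figure for duplicates can be checked with a short calculation. This is a sketch assuming independent, uniform draws (which is how RANDI behaves); the variable names are illustrative only.

```matlab
N = 81;    % cards in the deck
K = 12;    % cards drawn

% Exact probability that K independent uniform draws are all distinct
pDistinct  = prod((N - (0:K-1)) / N);
pDuplicate = 1 - pDistinct;
fprintf('P(at least one duplicate), exact:     %.3f\n', pDuplicate);

% Monte Carlo cross-check using the same kind of RANDI call as the question
nTrials = 1e5;
draws  = randi(N, K, nTrials);                        % one hand per column
hasDup = arrayfun(@(j) numel(unique(draws(:,j))) < K, 1:nTrials);
fprintf('P(at least one duplicate), simulated: %.3f\n', mean(hasDup));
```

The exact value comes out somewhat above one half, so more than half of the hands in the original simulation contained a repeated card.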
Anyway, it would be interesting to see other approaches to the problem.
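Answer 3 (score: 1)
I'm sure something is wrong with my calculation of these odds, since several others have confirmed by simulation that it is close to 33:1, as the instruction book says, but what is wrong with the following logic?
With 12 random cards, there are 220 possible combinations of three cards (12!/(9!3!) = 220). Each combination of three cards has a 1/79 chance of being a set, so there is a 78/79 chance that three arbitrary cards are not a set. So if you examined all 220 combinations, and each had a 78/79 chance of not being a set, then the chance of finding no set after examining every combination would be 78/79 raised to the 220th power, or 0.0606, which is roughly 1 in 16.5, or odds of about 15.5:1.
I must be missing something...?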
Christopher
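The arithmetic in this answer can be reproduced with a few lines of MATLAB. This sketch only recomputes the numbers quoted above; it does not settle the question. Note that the calculation treats the 220 triples as independent events, although triples drawn from the same 12 cards share cards, so that assumption is a likely source of the gap from the simulated odds.

```matlab
nTriples = nchoosek(12, 3);            % 220 candidate triples in a 12-card hand

% Any two cards determine the third card of their set, and each set is
% counted once per ordered pair of its cards (6 times)
nSets = 81 * 80 / 6;                   % 1080 sets in the full deck
pSet  = nSets / nchoosek(81, 3);       % chance that a random triple is a set: 1/79

% Treats the 220 triples as independent, which they are not
pNoSetIndep = (1 - pSet)^nTriples;
fprintf('P(no set), independence assumed: %.4f\n', pNoSetIndep);
fprintf('Odds set:no-set                : %.1f:1\n', (1 - pNoSetIndep) / pNoSetIndep);
```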