Question

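I am working through a problem from Stanford's free online CS106B course. The problem text is shown below. I wrote a function, but I am not sure the logic is correct (this is not one of those programs where you know when you have the right answer). Please see the problem and my code below. I would appreciate any feedback or suggestions.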

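Problem: Consider an election with 1000 voters and a one percentage point spread between the two candidates, i.e. one candidate gets 50.5% of the vote and the other 49.5%. The voting machine makes a mistake 8% of the time and records a vote for the other candidate instead of the intended one. Is this error rate enough to invalidate the election result? With a little knowledge of statistics, it is not hard to calculate the exact probability of an invalid result, but it is even easier to simulate the process. Generate a sequence of 505 votes for candidate A and 495 votes for candidate B, where each vote has an 8% chance of being inverted when recorded. Does the vote total lead to B defeating A, despite the voters' original intent? That outcome represents one trial in the simulation. If you repeat this trial many times and keep track of the results, the ratio:

(number of trials where the election result was invalid) / (total number of trials)

provides an estimate of the percentage chance of an invalid election result.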

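Write a program that prompts the user for the voting simulation parameters, then performs 500 simulation trials and reports the ratio calculated above. A sample run of the program is shown below:

Enter number of voters: 10000
Enter percentage spread between candidates: .005
Enter voting error percentage: .15
Chance of an invalid election result after 500 trials = 13.4%

Your program should take care to verify that the simulation parameters chosen by the user are in range (percentages must be between 0 and 1.0, the number of voters must be positive) and re-prompt for a valid value if necessary. Note that because of the randomness in the simulation, the result is expected to vary from run to run.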

CODE (P.S. I am using the Stanford C++ library):

#include <iostream>
#include "console.h"
#include "gwindow.h" // for GWindow
#include "simpio.h"  // for getLine
#include "vector.h"  // for Vector
#include "queue.h"   // for queues
#include "random.h"  // for randomChance
using namespace std;


/* FUNCTION PROTOTYPES */
void ElectionSimulation();


/* MAIN METHOD */
int main(){
    ElectionSimulation();
    return 0;
}

/* FUNCTION DEFINITIONS */

void ElectionSimulation(){
    int numVoters = 
        getInteger("Enter number of voters: ", 
        "You must enter a positive integer, try again");
    int numSimulations =
        getInteger("Enter the number of election simulations: ",
        "You must enter a positive integer, try again" );
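    // Adjacent string literals are concatenated by the compiler, so each
    // long prompt can span two source lines without breaking the string.
    double voterSpread =
        getDoubleBetween("Enter spread between candidates, e.g. for 10% "
        "enter 0.1 etc: ", 0.0, 1.0);

    double votingError =
        getDoubleBetween("Enter vote recording error chance, e.g. for "
        "15% enter 0.15 etc: ", 0.0, 1.0);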


    // Determine the correct number of votes for each candidate 
    // given the spread and numVoters
    int correctVotesLower = numVoters*(0.5 - 0.5*voterSpread);
    int correctVotesHigher = numVoters*(0.5 + 0.5*voterSpread);
    int invalidElections = 0;


    // Run simulations
    for (int i = 0 ; i<numSimulations; i++){
        // Before every simulation, set the correct number 
        // of votes for each candidate   
        int votesLower = correctVotesLower;
        int votesHigher = correctVotesHigher;


        // Redistribute votes due to vote recording error
        for (int j = 0; j<correctVotesLower; j++){
            if (randomChance(votingError)){
                votesLower--;
                votesHigher++;
            }
        }

        for (int k = 0; k<correctVotesHigher; k++){
            if (randomChance(votingError)){
                votesLower++;
                votesHigher--;
            }
        }


        if(votesLower > votesHigher) {invalidElections++;}

    }

    cout << "After " << numSimulations << 
    " simulations, elections were invalid "
     << (double)invalidElections*100.0/(double)numSimulations
     << " percent of times" << endl;
}

In particular, if I enter the following parameters (as in the problem text):

numVoters = 10000;
numSimulations = 500;
voterSpread = 0.005;
votingError = 0.15;

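I get invalid elections about 30% of the time, which seems a bit high. The problem text says I should get about 13.4% with these parameters (varying slightly from run to run because of the randomness). I think my logic is wrong, but I can't see where.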

Answer 1

I believe your program is correct.

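If people vote for candidate A with probability 0.5025, and the voting machine misregisters a vote with probability 0.15, then the machine registers a vote for candidate A with probability 0.5025 * (1 - 0.15) + (1 - 0.5025) * 0.15 = 0.50175. When I plug that into a binomial distribution to find the probability that A gets fewer than 5000 of the 10000 votes, I get a probability of about 0.36.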

This is only a back-of-the-envelope estimate rather than a proper calculation, but it suggests that your 30% may not be too high.
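The estimate can be reproduced with the Python standard library alone. This is a minimal sketch, not the answerer's own calculation: the helper name `binom_cdf_below` is made up for illustration, and the sum is done in log space only to avoid overflow in the binomial coefficients.

```python
import math

def binom_cdf_below(n, p, k):
    """Return P(X < k) for X ~ Binomial(n, p) (hypothetical helper)."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for i in range(k):
        # log of C(n, i) * p^i * (1 - p)^(n - i), via log-gamma
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return total

p_vote_a = 0.5025   # true share for A when the spread is 0.005
p_error = 0.15      # chance that a vote is recorded for the other candidate

# A vote is registered for A if it was cast for A and recorded correctly,
# or cast for B and flipped.
p_registered_a = p_vote_a * (1 - p_error) + (1 - p_vote_a) * p_error

prob_a_below_half = binom_cdf_below(10000, p_registered_a, 5000)
print(p_registered_a, prob_a_below_half)
```

One likely reason this estimate comes out above the simulated 30% is that it treats every voter's choice as random, which adds sampling variance on top of the recording errors, whereas the simulation starts from a fixed 5025/4975 split.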

(Update: To make sure, I also wrote a quick Python program that solves the problem with a different technique, and it also gives about 30%.)
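The answer does not show that program, so the following is only a hypothetical sketch of one way to cross-check the C++ result in Python. The function name `estimate_invalid_rate`, the use of `round` for the vote split, and the `seed` parameter are assumptions made for this example.

```python
import random

def estimate_invalid_rate(num_voters, spread, error, trials, seed=None):
    """Estimate how often the true loser ends up with more recorded votes."""
    rng = random.Random(seed)
    # Fixed "true" split of the votes (assumption: rounded, not truncated)
    votes_lower = round(num_voters * (0.5 - 0.5 * spread))
    votes_higher = num_voters - votes_lower
    invalid = 0
    for _trial in range(trials):
        # Count the votes in each group that are recorded for the wrong side
        flipped_lower = sum(rng.random() < error for _v in range(votes_lower))
        flipped_higher = sum(rng.random() < error for _v in range(votes_higher))
        recorded_lower = votes_lower - flipped_lower + flipped_higher
        recorded_higher = num_voters - recorded_lower
        if recorded_lower > recorded_higher:
            invalid += 1
    return invalid / trials

# Parameters from the question; the printed value varies with the seed.
print(estimate_invalid_rate(10000, 0.005, 0.15, 500, seed=0))
```

A tie is not counted as invalid here, matching the `votesLower > votesHigher` test in the question's code.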

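Update 2: I woke up this morning having thought of a way to calculate the exact probability, and I just had to try it. So here is one way to find it with scipy: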

import scipy.stats as ss

numVoters = 10000
voterSpread = 0.005
votingError = 0.15

correctVotersLower = int(numVoters*(0.5 - 0.5*voterSpread))
correctVotersHigher = int(numVoters*(0.5 + 0.5*voterSpread))

votersDifference = correctVotersHigher - correctVotersLower
minHighErrors = (votersDifference + 1) // 2  # integer division, also under Python 3

lowerErrorDist = ss.binom(correctVotersLower, votingError)
higherErrorDist = ss.binom(correctVotersHigher, votingError)

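# Sum over x = number of errors among the lower candidate's votes:
# P(higher's errors exceed x + minHighErrors) * P(lower's errors == x)
print(sum([higherErrorDist.sf(x + minHighErrors) * lowerErrorDist.pmf(x) for x in range(0,correctVotersLower)]))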

The probability I get is about 0.305598.
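As a rough sanity check on that figure, the difference between the two error counts can be approximated by a normal distribution. This is a sketch under that assumption, not part of the original answer, and the variable names are made up for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

votes_lower, votes_higher = 4975, 5025   # true split for 10000 voters, spread 0.005
error = 0.15

# D = (errors among the higher candidate's votes)
#   - (errors among the lower candidate's votes)
mean_d = error * (votes_higher - votes_lower)               # about 7.5
var_d = error * (1 - error) * (votes_higher + votes_lower)  # about 1275

# The lower candidate wins when D > 25; add 0.5 as a continuity correction.
threshold = (votes_higher - votes_lower) // 2
z = (threshold + 0.5 - mean_d) / math.sqrt(var_d)
approx = 1.0 - normal_cdf(z)
print(approx)
```

The approximation gives about 0.31, consistent with both the exact value and the roughly 30% seen in the simulation.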
