Generate permutations x times and compute the chi-squared distribution probability

Time: 2015-04-19 02:28:44

Tags: java apache count permutation chi-squared

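This question is very specific. I have found dozens of places that explain how to generate random permutations in Java, but none that go on to compute the chi-squared distribution probability. Setting it up seems straightforward, and there are plenty of tutorials online, but one thing about this code really bothers me. In part 2 of the assignment I am supposed to generate a random permutation of the string using an index j chosen at random in the range 0 to i. One method is supposed to always output a probability of 1.0, because it is biased and unfair, while the second method should produce a probability anywhere between 0 and 1.0. I got the first method working in part 1, but in part 2 I cannot get the program to stop printing 1.0 every time. The assignment says I should simply step through the array. I have tried two ways of generating the permutation: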

Method 1:

public static String generatePermutation(String prefix, String t){
    int n = 6;
    String s = "";
    StringBuilder test = new StringBuilder(t);
    if (n == 0){
        System.out.println(prefix);
    }
    else {
        for(int i = 0; i < n; i++){
            int j = randInt(0, i);
            char temp = test.charAt(j);
            test.setCharAt(j, test.charAt(i));
            test.setCharAt(i, temp);
        }
        s = test.toString();
        return s;
    }

    return s;
}

public static int randInt(int min, int max) {
    Random rand = new Random();
    int randomNum = rand.nextInt((max - min) + 1) + min;

    return randomNum;
}

Method 2:

public static String generatePermutation(String prefix, String t){
    char[] letters = t.toCharArray();
    shuffle(letters);
    String s = new String(letters);
    return s;
}

public static void shuffle(char[] array){
    int n = array.length;
    Random rand = new Random();
    while(n > 1){
        int k = rand.nextInt(n--);
        char temp = array[n];
        array[n] = array[k];
        array[k] = temp;
    }
}

public static int randInt(int min, int max) {
    Random rand = new Random();
    int randomNum = rand.nextInt((max - min) + 1) + min;

    return randomNum;
}

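Neither method seems to give a probability that varies between 0 and 1.0. The current code for part 2 of the assignment is structured as follows: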

package math3323assignment7;

import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Random;
import java.util.Collections;

import org.apache.commons.math3.distribution.ChiSquaredDistribution;

import com.google.common.collect.Multiset;
import com.google.common.collect.TreeMultiset;

public class assignment7part2 {

public static void main(String[] args) {
    // TODO Auto-generated method stub
    String s = "ABCDEF";
    Map<String, Integer> counts = new HashMap<>();
    Integer count;
    int expected = (int)factorial(s.length());
    for(int i = 0; i < 720000; i++){
        String t = generatePermutation("",s);
        count = counts.get(t);
        if(count == null){
            count = 1;
        }
        else {
            count = count + 1;
        }
        counts.put(t, count);
        System.out.println(t);
    }
    for(Entry<String, Integer> entry : counts.entrySet()){
        System.out.println(entry.getValue() + " times: " + entry.getKey());
    }
    double chistat = 0.0;
    for(Entry<String, Integer> entry: counts.entrySet()){
        double di = entry.getValue() - expected;
        chistat += di*di/expected;
    }

    ChiSquaredDistribution chisq = new ChiSquaredDistribution(719.0);
    double prob = chisq.cumulativeProbability(chistat);

    System.out.printf("ChiSquare statistic = " + chistat + " the probability is " + prob);
}

public static String generatePermutation(String prefix, String t){
    char[] letters = t.toCharArray();
    shuffle(letters);
    String s = new String(letters);
    return s;
}

public static long factorial(int n){
    if (n <= 1){
        return 1;
    }
    else {
        return n * factorial(n-1);
    }
}

public static void shuffle(char[] array){
    int n = array.length;
    Random rand = new Random();
    while(n > 1){
        int k = rand.nextInt(n--);
        char temp = array[n];
        array[n] = array[k];
        array[k] = temp;
    }
}

public static int randInt(int min, int max) {
    Random rand = new Random();
    int randomNum = rand.nextInt((max - min) + 1) + min;

    return randomNum;
}

}

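As you can see, the ChiSquaredDistribution class from Apache Commons Math is used to create the chi-squared distribution, and a separate for loop computes the chi-squared statistic. Unfortunately, whenever I run the program, the output always ends in a similar vein: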

 Prints all random permutations 720,000 times
 Counts all the times each permutation occurs, and print out the numbers
 ChiSquare statistic = 79360.74444444438 the probability is 1.0

I would like the last part to print something like:

 ChiSquare statistic = 79360.74444444438 the probability is 0.64

Could you help me get the final result of part 2 of this program to look like the line above?

1 Answer:

Answer 0 (score: 0)

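You are using an incorrect expected value. Your output should show a count of about 1000 for each permutation: with 720000 trials spread evenly over 6! = 720 permutations, the expected count per permutation is 720000/720 = 1000. Instead you are using 720, which is what this line computes:

int expected = (int)factorial(s.length());

With observed counts near 1000 and an expected value of 720, each term contributes roughly 280*280/720, about 109, and 720 such terms add up to about 78400. That is in line with the statistic of 79360.74 you are seeing, and it pushes the cumulative probability to 1.0. Instead try:

int expected = 1000;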
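To make the fix concrete, here is a minimal sketch that derives the expected count from the number of trials and the number of permutations instead of hard-coding it. The class and method names (ChiSquareCheck, randomPermutation, countPermutations, chiSquare) are hypothetical and not from the question, and the sketch uses only java.util so it runs without the Apache dependency. It also adds one term per permutation that never appeared, a detail the original loop does not handle but which only matters when some counts are zero.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class ChiSquareCheck {

    // Fisher-Yates shuffle, the same idea as shuffle() in the question.
    static String randomPermutation(String s, Random rand) {
        char[] a = s.toCharArray();
        for (int n = a.length; n > 1; n--) {
            int k = rand.nextInt(n);   // k is uniform in [0, n-1]
            char tmp = a[n - 1];
            a[n - 1] = a[k];
            a[k] = tmp;
        }
        return new String(a);
    }

    static long factorial(int n) {
        return n <= 1 ? 1 : n * factorial(n - 1);
    }

    // Tally how often each permutation shows up in the given number of trials.
    static Map<String, Integer> countPermutations(String s, int trials, Random rand) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < trials; i++) {
            counts.merge(randomPermutation(s, rand), 1, Integer::sum);
        }
        return counts;
    }

    // Pearson statistic: sum of (observed - expected)^2 / expected over all categories.
    // expected = trials / categories, kept as a double to avoid integer truncation.
    static double chiSquare(Map<String, Integer> counts, int trials, long categories) {
        double expected = (double) trials / categories;
        double stat = 0.0;
        for (int observed : counts.values()) {
            double d = observed - expected;
            stat += d * d / expected;
        }
        // Categories missing from the map have observed = 0,
        // so each contributes expected^2 / expected = expected.
        stat += (categories - counts.size()) * expected;
        return stat;
    }

    public static void main(String[] args) {
        String s = "ABCDEF";
        int trials = 720000;
        long categories = factorial(s.length());   // 720 possible permutations
        Map<String, Integer> counts = countPermutations(s, trials, new Random());
        double stat = chiSquare(counts, trials, categories);
        System.out.println("ChiSquare statistic = " + stat);
    }
}
```

The resulting statistic can then be passed to chisq.cumulativeProbability exactly as in the question. With an unbiased shuffle it should usually land in the neighborhood of the 719 degrees of freedom rather than near 79000, so the reported probability varies from run to run instead of sticking at 1.0.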