How do I compute a unique "mode" from an array?

Date: 2019-04-17 16:07:09

Tags: java loops

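My task is to find the mode of a given array (of unspecified length). The mode is defined as the single number that occurs most often. So, for example, the mode of the array [1.0, 2.0, 3.0, 2.0] is 2.0. However, if no single number occurs most often, as in [1.0, 2.0, 2.0, 3.0, 3.0], the program should return "no mode", which in my program is Double.NaN.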

I have written code that passes 3 of my 4 test cases, but it always fails on the case where two values tie for the mode.

public double mode() {

    double modeOne = data[0];
    double modeTwo = 0;
    int count = 0;
    int countOne = 0;
    int countTwo = 0;

    if(data.length == 1) { // special case: if array length is 1 the mode will always just be that value
        modeOne = data[0];
        return modeOne;
    } // end if

    for(int i = 0; i < data.length; i++) { // pulling out first value
        double value = data[i];
        for(int n = 0; n < data.length; n++) { // comparing first value to all other values
            if (data[n] == value) {
                count ++; // adding onto a count of how many of the same number there are
            }
        }
        if(modeOne == value || modeTwo == value) { // move on if the modes already have that value
            continue;
        }
        if(count > countOne) { // setting the max count
            countTwo = countOne;
            countOne = count;
            modeTwo = modeOne;
            modeOne = value;
        }
        else if(count > countTwo) { // setting second highest count
            countTwo = count;
            modeTwo = value;
        }
    } // end for
    if(countOne == 1) { // if all the modes are just one
        return Double.NaN;
    }
    if(countOne == countTwo) { // if there are two of the same modes
        return Double.NaN;
    }
    else {
        return modeOne;
    }
} //end MODE

For this test case:

double[] data = {1,2,2,3,3,4};
Stat stat1 = new Stat(data);
System.out.println("stat1 mode = " + stat1.mode());

I expect "NaN", but it returns 4. However, it works for the following case:

double[] data = {-5.3, 2.5, 88.9, 0, 0.0, 28, 16.5, 88.9, 109.5, -90, 88.9};
Stat stat1 = new Stat(data);
System.out.println("stat1 mode = " + stat1.mode());

The expected output is 88.9, and the program prints it correctly.
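A hand trace of the failing input points to one likely cause: count is declared once, outside the outer loop, and is never reset. It therefore keeps growing across values, and for {1, 2, 2, 3, 3, 4} it has reached 10 by the time 4 is examined, so 4 looks like the most frequent value. The following is a minimal sketch of one way around that, not the original Stat class: the class name CountResetSketch and the static signature are assumptions made for illustration. It resets the count for every value and tracks whether the top count is shared.

```java
final class CountResetSketch {

    // Returns the single most frequent value, or Double.NaN when the
    // highest count is shared by two or more distinct values.
    static double mode(double[] data) {
        if (data == null || data.length == 0) {
            return Double.NaN; // assumption: an empty input has no mode
        }
        double best = data[0];
        int bestCount = 0;
        boolean tie = false;
        for (int i = 0; i < data.length; i++) {
            int count = 0; // reset for every value: the key difference
            for (int n = 0; n < data.length; n++) {
                if (data[n] == data[i]) {
                    count++;
                }
            }
            if (count > bestCount) {
                bestCount = count;
                best = data[i];
                tie = false; // a new, strictly higher count clears any earlier tie
            } else if (count == bestCount && data[i] != best) {
                tie = true; // a different value reaches the same count
            }
        }
        return tie ? Double.NaN : best;
    }

    public static void main(String[] args) {
        System.out.println(mode(new double[] {1, 2, 2, 3, 3, 4})); // prints NaN
        System.out.println(mode(new double[] {1, 2, 3, 2}));       // prints 2.0
    }
}
```

Because Double.NaN == Double.NaN is false in Java, a test for the "no mode" result should use Double.isNaN(result) rather than ==.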

5 answers:

Answer 0 (score: 2)

Here is one approach using the Stream API. Note, however, that my definition of the mode is a set rather than a single number.

import org.junit.Test;

import java.util.Arrays;
import java.util.Map;
import java.util.OptionalLong;
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;

public class ModeTest {

    private <T extends Number> Set<T> modes(T... input) {
        return modes(Arrays.stream(input));
    }

    /**
     * Calculate the modes of a numeric stream. The modes are the values that occur most often. If no number in the
     * stream is repeated, then all the numbers in the stream are modes.
     *
     * @param input stream of numbers
     * @param <T>   number type
     * @return modes.
     */
    private <T extends Number> Set<T> modes(Stream<T> input) {

        // transform the input to a map containing the counted entries
        final Set<Map.Entry<T, Long>> countedEntries = input
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet();

        // Figure out the max value
        final OptionalLong max = countedEntries
            .parallelStream()
            .mapToLong(Map.Entry::getValue)
            .max();

        // Handle the case where the stream was empty
        if (max.isEmpty()) {
            return Set.of();
        }

        return countedEntries
            .parallelStream()
            .filter(e -> e.getValue() == max.getAsLong())
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());

    }

    @Test
    public void oneMode() {
        final Double[] input = new Double[]{1.0, 1.1, 1.2, 2.0, 2.0, 3.0};
        assertEquals(Set.of(2.0), modes(input)); // JUnit expects (expected, actual)
    }

    @Test
    public void multipleModes() {
        final Stream<Double> input = Stream.of(1.0, 1.1, 1.2, 2.0, 2.0, 3.0, 3.0);
        assertEquals(Set.of(2.0, 3.0), modes(input)); // JUnit expects (expected, actual)
    }

    @Test
    public void allSingles() {
        final Stream<Double> input = Stream.of(1.0, 1.1, 1.2, 2.0, 3.0);
        assertEquals(Set.of(1.0, 1.1, 1.2, 2.0, 3.0), modes(input)); // JUnit expects (expected, actual)
    }

    @Test
    public void largeRandomSet() {
        Integer[] randoms = new Integer[204800];
        for (int i = randoms.length - 1; i >= 0; --i) {
            randoms[i] = ThreadLocalRandom.current().nextInt(200);
        }
        assertFalse(modes(randoms).isEmpty());
    }

    @Test
    public void emptyStream() {
        final Stream<Double> input = Stream.of();
        assertEquals(Set.of(), modes(input)); // JUnit expects (expected, actual)
    }
}
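Since this answer returns a set of modes while the question wants a single value or NaN, the two can be bridged by checking the size of the set. The sketch below is an illustration under assumptions, not part of the answer: the class UniqueModeSketch and the method uniqueMode are hypothetical names, and modes is re-implemented here for double[] so the block stands alone.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.DoubleStream;

final class UniqueModeSketch {

    // Collects every value whose count equals the highest count.
    static Set<Double> modes(double[] data) {
        Map<Double, Long> counts = DoubleStream.of(data)
            .boxed()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // 0 is only used when the input is empty, in which case the set is empty too.
        long max = counts.values().stream().mapToLong(Long::longValue).max().orElse(0L);
        return counts.entrySet().stream()
            .filter(e -> e.getValue() == max)
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }

    // Collapses the set to the question's convention: exactly one mode, or NaN.
    static double uniqueMode(double[] data) {
        Set<Double> modes = modes(data);
        return modes.size() == 1 ? modes.iterator().next() : Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println(uniqueMode(new double[] {1, 2, 2, 3, 3, 4})); // prints NaN
        System.out.println(uniqueMode(new double[] {1, 2, 3, 2}));       // prints 2.0
    }
}
```

One thing to keep in mind with boxed keys: Double.equals treats 0.0 and -0.0 as different values, whereas the primitive == comparison treats them as equal, so map-based and loop-based solutions can disagree on inputs that contain both.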

Answer 1 (score: 1)

Since I wanted a small challenge, I wrote my own solution that uses a Map to count each value.

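You then retrieve the highest count and iterate over the map a second time to determine whether more than one entry has that same highest count; if so, NaN is returned.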

import java.util.Map;
import java.util.Map.Entry;
import java.util.TreeMap;

public static double calculateMode(double[] numbers) {
    Map<Double, Integer> lookupMap = new TreeMap<>();

    for (double number : numbers) {
        if (lookupMap.get(number) != null) {
            lookupMap.put(number, lookupMap.get(number) + 1);
        } else {
            lookupMap.put(number, 1);
        }
    }

    int max = -1;
    double maxKey = Double.NaN;
    for (Entry<Double, Integer> entry : lookupMap.entrySet()) {
        if (entry.getValue() > max) {
            max = entry.getValue();
            maxKey = entry.getKey();
        }
    }

    int foundMax = 0;

    for (Entry<Double, Integer> entry : lookupMap.entrySet()) {
        if (entry.getValue() == max) {
            foundMax++;
        }
    }

    if (foundMax > 1) {
        return Double.NaN;
    }

    return maxKey;

}

Method call:

public static void main(String[] args) {
    double[] data = {1, 2, 2, 3, 3, 4};
    double[] data2 = {-5.3, 2.5, 88.9, 0, 0.0, 28, 16.5, 88.9, 109.5, -90, 88.9};
    System.out.println("Expected NaN - and was: " + calculateMode(data));
    System.out.println("Expected 88.90 - and was: " + calculateMode(data2));
}

Output:

Expected NaN - and was: NaN
Expected 88.90 - and was: 88.9
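The same count-then-check idea can be written more compactly. The sketch below is a variation rather than the answer's own code: it uses Map.merge to do the counting and folds the two passes over the map into one by tracking how many keys share the current maximum. The class name MergeCountSketch is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class MergeCountSketch {

    static double calculateMode(double[] numbers) {
        Map<Double, Integer> counts = new HashMap<>();
        for (double number : numbers) {
            // Stores 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(number, 1, Integer::sum);
        }

        int max = 0;
        int keysAtMax = 0; // how many distinct values share the highest count
        double maxKey = Double.NaN;
        for (Map.Entry<Double, Integer> entry : counts.entrySet()) {
            int count = entry.getValue();
            if (count > max) {
                max = count;
                maxKey = entry.getKey();
                keysAtMax = 1;
            } else if (count == max) {
                keysAtMax++;
            }
        }
        // Empty input leaves keysAtMax at 0, which also yields NaN.
        return keysAtMax == 1 ? maxKey : Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println(calculateMode(new double[] {1, 2, 2, 3, 3, 4})); // prints NaN
        System.out.println(calculateMode(new double[] {1, 2, 3, 2}));       // prints 2.0
    }
}
```

A HashMap is used here because the result does not depend on key order; the TreeMap in the answer works equally well.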

Answer 2 (score: 1)

No Collections or the like here... just plain hard coding :)

public double mode(double[] data)
{
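    if(data == null || data.length == 0)
        return Double.NaN;  // no elements, so no mode (avoids an out-of-bounds access below)
    if(data.length==1)
        return data[0];
    data = data.clone();  // work on a copy so the sort below does not reorder the caller's array
    double temp;
    double [] fr = new double [data.length];  // fr[i] = frequency of data[i], or -1 if data[i] was already counted
    int visited = -1;  // marker for duplicates that were already counted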

    for(int i = 0; i < data.length; i++)
    {           
        int count = 1;  
        for(int j = i+1; j < data.length; j++)
        {  
            if(data[i] == data[j])
            {  
                count++;   
                fr[j] = visited;  
            }  
        }  
        if(fr[i] != visited)  
            fr[i] = count;  
    }  


    for (int i = 0; i < fr.length; i++)   // sort array in decreasing order
    {
        for (int j = i + 1; j < fr.length; j++) 
        {
            if (fr[i] < fr[j]) 
            {
                temp = data[i];
                data[i] = data[j];
                data[j] = temp;

                temp = fr[i];
                fr[i] = fr[j];
                fr[j] = temp;
            }
        }
    }

    if(fr[0] == fr[1])
        return Double.NaN;
    else
        return data[0];

}

Answer 3 (score: 1)

So I felt challenged too, and came up with a solution that does not use Collections.
It is not a very pretty solution, but it seems to work:

public class TestMode
{
  private static class NumberFrequency
  {
    double number;
    int    frequency;
  }

  public static double calculateMode(double[] numbers)
  {
    // Maybe array empty
    if ((numbers == null) || (numbers.length == 0))
      return Double.NaN;

    // Initialize array with frequencies
    NumberFrequency[] array;
    int               size = 0;
    array = new NumberFrequency[numbers.length];

    // Loop over numbers determining frequencies
    for (double number : numbers)
    {
      // Maybe encountered before
      int index;
      for (index = 0; index < size; index++)
      {
        if (array[index].number == number)
          break;
      }

      // Update array
      NumberFrequency elm;
      if (index == size)
      {
        elm = new NumberFrequency();
        elm.number = number;
        elm.frequency = 0;
        array[index] = elm;
        size++;
      }
      else
        elm = array[index];
      elm.frequency += 1;

    } // for all numbers

    // Initialize element with highest frequency
    int index_highest;
    int highest_freq;
    int nr_occurs;
    index_highest = 0;
    highest_freq = array[0].frequency;
    nr_occurs = 1;

    // Search 'better' element
    int counter;
    for (counter = 1; counter < size; counter++)
    {
      if (array[counter].frequency > highest_freq)
      {
        index_highest = counter;
        highest_freq = array[counter].frequency;
        nr_occurs = 1;
      }
      else if (array[counter].frequency == highest_freq)
        nr_occurs++;
    }

    // Return result
    if (nr_occurs == 1)
      return array[index_highest].number;
    else
      return Double.NaN;

  } // calculateMode

  public static void main(String[] args)
  {
    double[] data = {1, 2, 2, 3, 3, 4};
    double[] data2 = {-5.3, 2.5, 88.9, 0, 0.0, 28, 16.5, 88.9, 109.5, -90, 88.9};
    System.out.println("Expected NaN - and was: " + calculateMode(data));
    System.out.println("Expected 88.90 - and was: " + calculateMode(data2));
  }

} // class TestMode

Answer 4 (score: 1)

Adding another alternative, since I felt challenged too:

The general idea is to build a frequency array. For the example given above:

 [1.0, 2.0, 2.0, 3.0, 3.0]
 [1,    2,   2,   2,   2]  

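Each entry says how many times the element at the same index occurs in the input. Then find the maximum in the frequency array, and finally check whether all values with that maximum frequency are equal.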

public static double mode(double [] data) {  
    if(data == null || data.length < 1){
        return Double.NaN;
    }
    int [] freq = new int [data.length];
   for(int i = 0; i<data.length; i++){
        for(int j = 0; j<data.length; j++){
            if(data[i]==data[j]){
                freq[i]++;
            }
        }
    }
    int max = 0;
    double mode = data[0];
    for(int i = 0; i<freq.length; i++){
        if(freq[i]>max){
            max = freq[i];
            mode = data[i];
        }
    }
    for(int i = 0; i<freq.length; i++){
        if(freq[i] == max){
            if(mode != data[i]){
               return Double.NaN;
            }
        }
    }
    return mode;
}
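To make the intermediate step concrete, the sketch below isolates just the frequency-array construction and prints it for the example input. The class name FrequencyArraySketch and the helper frequencies are hypothetical names introduced for illustration; the loop itself mirrors the first nested loop of the answer.

```java
import java.util.Arrays;

final class FrequencyArraySketch {

    // freq[i] is the number of elements equal to data[i], counting data[i] itself.
    static int[] frequencies(double[] data) {
        int[] freq = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            for (int j = 0; j < data.length; j++) {
                if (data[i] == data[j]) {
                    freq[i]++;
                }
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 2.0, 3.0, 3.0};
        System.out.println(Arrays.toString(frequencies(data))); // prints [1, 2, 2, 2, 2]
    }
}
```

The maximum frequency here is 2, and it belongs to both 2.0 and 3.0, which is why the final check in the answer returns NaN for this input. The nested loop compares every pair of elements, so the running time grows quadratically with the array length.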