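Number of occurrences of a String in a Java hash table

Date: 2017-05-06 00:54:18

Tags: java hash hashtable

I am trying to read several thousand strings from a text file and then rank the most popular ones. I am lost on how to keep track of the count for each string.

Do I need to implement another ADT, such as a linked list? I am not allowed to use any Java library classes other than ArrayList.

Here is what I have so far.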

public class StudentTrends implements Trends {
   int entries = 0;
   //ArrayList<Integer> list;
   String[] table;
   int arraySize;

public StudentTrends() {
    //this.list = new ArrayList<Integer>();
    this.table = new String[10];
    Arrays.fill(table, "-1");
}

//Method I'm having trouble with
@Override
public void increaseCount(String s, int amount) {
    int key = horner(s);

    if(table[key] == null){
        entries++;
        //table[key] = table[key];
    }
    else{
        amount += 1+amount;
    }
}


/**
 * The hashing method used
 * @param key
 *          The input String
 * @return
 *          The int representation of that string (an index into the table)
 */
private int horner(String key){
    int val = 0;

    for(int a = 0; a < key.length(); a++){
        val = ((val << 8) | (a)) % table.length;
    }
    table[val] = key;
    return val;
}

Here is the interface I need to implement. It is not essential to the post, but it may help in understanding what I am trying to do.

public interface Trends {   

/**
 * Increase the count of string s.
 * 
 * @param s          String whose count is being increased.
 * @param amount     Amount by which it is being increased.
 */
public void increaseCount(String s, int amount);

/**
 * Return the number of times string s has been seen.
 * @param s     The string we are counting.
 * @return int  The number of times s has been seen thus far.
 */
public int getCount(String s);


/**
 * Get the nth most popular item based on its count.  (0 = most popular, 1 = 2nd most popular).
 * In case of a tie, return the string that comes first alphabetically.
 * @param n         Rank requested
 * @return string   nth most popular string.
 */
public String getNthMostPopular(int n);

/**
 * Return the total number of UNIQUE strings in the list. This will NOT be equal to the number of
 * times increaseCount has been called, because sometimes you will add the same string to the
 * data structure more than once. This function is useful when looping through the results
 * using getNthMostPopular. If you do getNthMostPopular(numEntries()-1), it should get the least popular item. 
 * @return  Number of distinct entries.
 */
public int numEntries();

};

2 answers:

Answer 0 (score: 1)

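If the only Java ADT you are allowed to use is ArrayList, I suggest using one, calling Collections#sort on it with a custom Comparator, and then using Collections#frequency to find the frequency of the most common element.

Assuming list has already been initialized with every String:

Collections.sort(list, Comparator.comparing(s -> Collections.frequency(list, s)).reversed());
// Frequency of most common element
System.out.println(Collections.frequency(list, list.get(0)));

Seeing as you are only allowed to use ArrayList, this approach is most likely too advanced for you. There are ways to do it with nested for loops, but that would get very messy.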
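For comparison, here is a rough sketch of what the nested-loop route could look like using nothing but ArrayList. The class name NestedLoopFrequency and its two methods are made up for illustration and are not part of the assignment; the alphabetical tie-break is borrowed from the interface in the question.

```java
import java.util.ArrayList;

// Hypothetical helper class; the name and methods are illustrative only.
class NestedLoopFrequency {

    // Inner loop: counts how many times target occurs in list.
    static int countOf(ArrayList<String> list, String target) {
        int count = 0;
        for (String s : list) {
            if (s.equals(target)) {
                count++;
            }
        }
        return count;
    }

    // Outer loop: returns the most frequent string, or null for an empty list.
    // On a tie, the alphabetically smaller string wins.
    static String mostCommon(ArrayList<String> list) {
        String best = null;
        int bestCount = 0;
        for (String candidate : list) {
            int count = countOf(list, candidate);
            boolean better = count > bestCount
                    || (count == bestCount && best != null
                        && candidate.compareTo(best) < 0);
            if (better) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>();
        list.add("java");
        list.add("hash");
        list.add("java");
        String top = mostCommon(list);
        System.out.println(top + " appears " + countOf(list, top) + " times");
    }
}
```

This does quadratic work in the number of strings, which is probably tolerable for a few thousand entries but grows quickly beyond that.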

Answer 1 (score: 1)

You don't have to write a hash table for this. You could have something like this:

class Entry {
    String key;
    int count;
}

List<Entry> entries;

Then, when you want to find an entry, just loop through the list:

for (Entry e : entries)  {
    if (e.key.equals(searchKey)) {
        // found it
    }
}
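To make that a bit more concrete, here is one way the list-of-entries idea might be fleshed out against the methods of the Trends interface from the question. This is only a sketch: the class name ListTrends, the Entry constructor, and the helpers find and ranksBefore are assumptions added for illustration, and the ranking uses a hand-written insertion sort so that ArrayList is the only library class involved.

```java
import java.util.ArrayList;

// One distinct string together with its running count.
class Entry {
    String key;
    int count;

    Entry(String key, int count) {
        this.key = key;
        this.count = count;
    }
}

// Hypothetical class name; mirrors the methods of the Trends interface.
class ListTrends {
    private final ArrayList<Entry> entries = new ArrayList<>();

    // Linear search: returns the entry for s, or null if s is unseen.
    private Entry find(String s) {
        for (Entry e : entries) {
            if (e.key.equals(s)) {
                return e;
            }
        }
        return null;
    }

    public void increaseCount(String s, int amount) {
        Entry e = find(s);
        if (e == null) {
            entries.add(new Entry(s, amount)); // first time we see s
        } else {
            e.count += amount;
        }
    }

    public int getCount(String s) {
        Entry e = find(s);
        return e == null ? 0 : e.count;
    }

    public int numEntries() {
        return entries.size();
    }

    // True if a should be ranked ahead of b: higher count first,
    // alphabetical order on ties.
    private static boolean ranksBefore(Entry a, Entry b) {
        if (a.count != b.count) {
            return a.count > b.count;
        }
        return a.key.compareTo(b.key) < 0;
    }

    // Insertion-sorts a copy of the entries, then picks index n.
    // Throws IndexOutOfBoundsException if n is out of range.
    public String getNthMostPopular(int n) {
        ArrayList<Entry> sorted = new ArrayList<>(entries);
        for (int i = 1; i < sorted.size(); i++) {
            Entry current = sorted.get(i);
            int j = i - 1;
            while (j >= 0 && ranksBefore(current, sorted.get(j))) {
                sorted.set(j + 1, sorted.get(j));
                j--;
            }
            sorted.set(j + 1, current);
        }
        return sorted.get(n).key;
    }
}
```

Every lookup here scans the whole list, and getNthMostPopular re-sorts on each call, so this favors simplicity over speed.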

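A hash table is far better in terms of time complexity, but it is frankly a pretty daunting task for someone new to data structures. If a hash table really is a required part of the assignment, then disregard this, but I just wanted to point out that it isn't strictly necessary.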
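If the hash table does turn out to be required, the counting part might look roughly like the sketch below. It is one possible design, not the assignment's expected solution: it assumes open addressing with linear probing, keeps the counts in a parallel int array, doubles the table when it is half full, and hashes with Horner's rule over the characters of the string. The class name ProbingCounter and the helpers hash, slotFor and resize are hypothetical.

```java
// Hypothetical class name; a sketch of a counting hash table
// built from plain arrays with linear probing.
class ProbingCounter {
    private String[] keys = new String[16];
    private int[] counts = new int[16];   // counts[i] belongs to keys[i]
    private int entries = 0;

    // Horner's rule over the characters, reduced modulo the table size
    // at every step so the intermediate value stays small.
    private static int hash(String key, int capacity) {
        long h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = (h * 31 + key.charAt(i)) % capacity;
        }
        return (int) h;
    }

    // Index that holds key, or the first empty slot on its probe path.
    // Terminates because the table is never more than half full.
    private static int slotFor(String key, String[] table) {
        int i = hash(key, table.length);
        while (table[i] != null && !table[i].equals(key)) {
            i = (i + 1) % table.length;
        }
        return i;
    }

    public void increaseCount(String s, int amount) {
        if ((entries + 1) * 2 > keys.length) {
            resize();
        }
        int i = slotFor(s, keys);
        if (keys[i] == null) {   // first time we see s
            keys[i] = s;
            entries++;
        }
        counts[i] += amount;
    }

    public int getCount(String s) {
        int i = slotFor(s, keys);
        return keys[i] == null ? 0 : counts[i];
    }

    public int numEntries() {
        return entries;
    }

    // Doubles the table and re-inserts every key at its new position.
    private void resize() {
        String[] newKeys = new String[keys.length * 2];
        int[] newCounts = new int[keys.length * 2];
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] != null) {
                int j = slotFor(keys[i], newKeys);
                newKeys[j] = keys[i];
                newCounts[j] = counts[i];
            }
        }
        keys = newKeys;
        counts = newCounts;
    }
}
```

Note that the hash function here only computes an index; storing the key and updating the count happen in increaseCount, so looking a string up never modifies the table.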