Compressing a string using its unique words

Time: 2017-01-17 21:23:51

Tags: java

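I have a problem where I need to compress a string losslessly by replacing each word with the index of that word's first occurrence, like this: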

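Starting string: any sentence with repeated words. Compressed output: a list of positions, one per word, each pointing at the first occurrence of that word.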

After many attempts at writing the code myself, I searched online for a solution but could not find anything similar.

4 answers:

Answer 0 (score: 4)

For data-processing problems like this one, the Stream API is powerful and concise.

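// needs: import java.util.*; import java.util.stream.*;
String words = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY";
// create a dictionary (LinkedHashMap keeps the words in first-seen order)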
Map<String, Integer> lookup = new LinkedHashMap<>();
// go through each word
String code = Stream.of(words.split(" "))
         // lookup the code for that word, or add one as needed
        .map(w -> lookup.computeIfAbsent(w, k -> lookup.size() + 1))
         // turn the codes into Strings
        .map(Object::toString)
         // join them together as one String.
        .collect(Collectors.joining(""));
System.out.println(code);
// dump the dictionary.
lookup.forEach((w, c) -> System.out.println(c + "=" + w));

This prints:

12345678913967845
1=ASK
2=NOT
3=WHAT
4=YOUR
5=COUNTRY
6=CAN
7=DO
8=FOR
9=YOU

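You can extend this example to handle more words by writing each code as a single base-36 digit (1-9, then a-z). Because the codes start at 1, this allows up to 35 distinct words: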

String words = "Peter Piper picked a peck of pickled peppers. " +
        "A peck of pickled peppers Peter Piper picked. " +
        "If Peter Piper picked a peck of pickled peppers, " +
        "Where's the peck of pickled peppers Peter Piper picked?";
Map<String, Integer> lookup = new LinkedHashMap<>();
String code = Stream.of(words.split("([.,?] *| +)"))
        .map(w -> lookup.computeIfAbsent(w, k -> lookup.size() + 1))
        .map(c -> Integer.toString(c, 36))
        .collect(Collectors.joining(""));
System.out.println(code);
lookup.forEach((w, c) -> System.out.println(Integer.toString(c, 36) + "=" + w));

This prints:

1234567895678123a12345678bc5678123
1=Peter
2=Piper
3=picked
4=a
5=peck
6=of
7=pickled
8=peppers
9=A
a=If
b=Where's
c=the

Answer 1 (score: 1)

The other answers are correct, but if you would rather not deal with maps and the like, here is a more basic way to solve the problem:

String str = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY";
String[] words = str.split("\\s+"); // Create a string array of the words in the string by splitting them around whitespace
ArrayList<String> uniqueWords = new ArrayList<String>();
uniqueWords.add(words[0]);
String result = "1";
boolean thereAlready = false; // Flag to be set if a word is not unique
for (int i = 1; i < words.length; i++) { // Iterate through every word
    thereAlready = false;
    for (int j = 0; j < uniqueWords.size(); j++) { // Iterate through previously found words to see if it matches
        if (words[i].equals(uniqueWords.get(j))) { // If the word is already there, modify the result string accordingly, set the flag, and break out of the inner loop
            result += (j + 1);
            thereAlready = true;
            break;
        }
    }
    if (!thereAlready) { // If the word is new, add it to the found words and modify the result string accordingly
        uniqueWords.add(words[i]);
        result += uniqueWords.size();
    }
}
System.out.println(result);

Output: 12345678913967845
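The inner loop and the flag can also be replaced by `List.indexOf`, which returns -1 when the word has not been seen yet. The following is only a sketch of that variation, not the answerer's code; the class name `IndexOfCompressor` and the method `compress` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexOfCompressor {
    /** Returns the concatenated 1-based first-occurrence indices of the words in str. */
    public static String compress(String str) {
        List<String> uniqueWords = new ArrayList<>();
        StringBuilder result = new StringBuilder();
        for (String word : str.trim().split("\\s+")) {
            int index = uniqueWords.indexOf(word); // -1 if the word is new
            if (index < 0) {
                uniqueWords.add(word);
                index = uniqueWords.size() - 1;
            }
            result.append(index + 1); // codes are 1-based, as in the answer above
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress(
                "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"));
    }
}
```

Like the original loop, `indexOf` scans the list linearly, so this is shorter but not faster. A `StringBuilder` avoids creating a new string on every append.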

Answer 2 (score: 0)

A simpler approach is to define a HashMap in which the key is the word in question and the value is its index in the map.

// arrWords is assumed to be a String[] holding the words of the input
Map<String, Integer> dictionary = new HashMap<>();
// Build the dictionary of strings
for (String word : arrWords) {
    word = word.toUpperCase();
    if (!dictionary.containsKey(word)) {
        // Insert the word into the map; the first word gets index 0.
        dictionary.put(word, dictionary.size());
    }
}

After that, you can print out the map as the 'key' file:

// Print the dictionary
for (Map.Entry<String, Integer> entry : dictionary.entrySet()) {
    String line = entry.getValue() + ":" + entry.getKey();
    System.out.println(line); // or write the line to a file instead
}

Finally, you can print the codes for the words by looking each one up in the map:

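for (String word : arrWords) {
    // Upper-case the word so that it matches the keys stored in the dictionary
    System.out.print(dictionary.get(word.toUpperCase()) + " ");
}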

The dictionary will not print in numeric order. I will let you figure that one out.
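One possible way to get numeric order, offered only as a sketch: a `HashMap` iterates in no particular order, so the entries can be sorted by value before printing. (Using a `LinkedHashMap` for the dictionary would be another option, since it iterates in insertion order.) The class name `SortedDictionary` and the method `linesInCodeOrder` are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SortedDictionary {
    /** Formats each entry as "index:WORD", ordered by ascending index. */
    public static List<String> linesInCodeOrder(Map<String, Integer> dictionary) {
        return dictionary.entrySet().stream()
                .sorted(Map.Entry.comparingByValue())
                .map(e -> e.getValue() + ":" + e.getKey())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Build the dictionary the same way as the answer above (indices start at 0).
        Map<String, Integer> dictionary = new HashMap<>();
        for (String word : "ask not what ask".split(" ")) {
            word = word.toUpperCase();
            if (!dictionary.containsKey(word)) {
                dictionary.put(word, dictionary.size());
            }
        }
        linesInCodeOrder(dictionary).forEach(System.out::println);
    }
}
```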

Answer 3 (score: 0)

Thanks for this, I was struggling with this problem too.

:P XD

Cheers, ATB,

Doe Nut.