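The goal of this algorithm is to take a list of strings and abbreviate each one to the shortest prefix that still uniquely identifies the word it stands for. This is a homework assignment.
The algorithm I've tried to write for this searches the substrings for any possible match. A trie looks like it would be more efficient for this problem, but I can't seem to understand how to code one; that said, it should still be possible to do it this way.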
Input           Expected output   Actual output
carbohydrate    carboh            carbohydrate
cart            cart              cart
carburetor      carbu             carburetor
caramel         cara              caramel
caribou         cari              caribou
carbonic        carboni           carbonic
cartilage       carti             cartilage
carbon          carbon            carbon
carriage        carr              carr
carton          carto             carto
car             car               car
carbonate       carbona           carbona
Here is the actual code.
import java.util.ArrayList;

public class ShortestPrefixes
{
    ArrayList<String> characterCounter = new ArrayList<>();
    boolean isEqual = false;

    public static ArrayList testArrayList()
    {
        ArrayList<String> testArray = new ArrayList<>();
        testArray.add("carbohydrate");
        testArray.add("cart");
        testArray.add("carburetor");
        testArray.add("caramel");
        testArray.add("caribou");
        testArray.add("carbonic");
        testArray.add("cartilage");
        testArray.add("carbon");
        testArray.add("carriage");
        testArray.add("carton");
        testArray.add("car");
        testArray.add("carbonate");
        return testArray;
    }

    public ArrayList getShortestPrefixes(ArrayList<String> input)
    {
        ArrayList<String> output = new ArrayList<>();
        System.out.println(input.size());
        for (int count = 0; count<input.size(); count++) //This loop counts which word is being reduced
        {
            //System.out.println(count);
            String word = input.get(count);
            //System.out.println(word);
            for (int compare = 0; compare<input.size(); compare++)//this loop counts the characters of the word being reduced
            {
                if(input.get(compare) .equals(word)){}
                isEqual=false;
                ArrayList<String> wordBeingComparedRightNow = new ArrayList<>();
                for (int searchCount = 0; searchCount<input.get(searchCount).length(); searchCount++)//this loop compares the string to every possible substring
                {
                    wordBeingComparedRightNow.clear();
                    String wordAboutToBeCompared = input.get(searchCount);
                    wordBeingComparedRightNow.add(wordAboutToBeCompared);
                    for (int miniLoop = 0; miniLoop<input.get(searchCount).length()-1; miniLoop++)//this loop sets up for a word's substrings to be tested
                    {
                        wordAboutToBeCompared = wordAboutToBeCompared.substring(0, wordAboutToBeCompared.length() -1);
                        wordBeingComparedRightNow.add(wordAboutToBeCompared);
                    }
                    for(int variableName = 0; variableName < wordBeingComparedRightNow.size(); variableName++)//this loop compares the array created in the last step with the substring we currently have
                    {
                        if(wordBeingComparedRightNow.get(variableName) .equals(word.substring(0, word.length()-1)))
                        {
                            isEqual=true;
                        }
                    }
                }
                if (!isEqual)
                {
                    System.out.println(word);
                    word = word.substring(0, word.length()-1);
                }
            }
            //System.out.println(word);
            output.add(word);
        }
        System.out.println(testArrayList());
        return output;
    }

    public static void main(String[] args)
    {
        ShortestPrefixes sp = new ShortestPrefixes();
        ArrayList<String> output = sp.getShortestPrefixes(testArrayList());
        System.out.println(output);
    }
}
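I think the problem is somewhere in the second for loop, but I can't figure out where. If anyone can determine why my algorithm doesn't work, or at least point me in the right direction, I'd really appreciate it. Thanks.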
Answer 0 (score: 0)
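If I understand correctly what you are trying to do: for each word, you try to find its shortest unique prefix, starting from the longest one (the word itself).
You then check whether the next prefix (the current one minus one character) is equal to any of the possible substrings of the other words. Note that you are also comparing the prefix against the current word's own substrings; you need to skip the current word in this comparison.
If the next prefix is unique, it becomes the current prefix: word.substring(0, word.length()-1)
The process should be repeated until no shorter unique prefix can be found.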
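But this for is very odd:
for (int searchCount = 0; searchCount < input.get(searchCount).length(); searchCount++)
You iterate over searchCount, but it appears on both sides of the termination condition: it is used as an index into the word list, yet it is bounded by the length of the word at that index. As a result, the termination condition is different on every iteration.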
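For comparison, here is a minimal sketch of the shrinking approach described above with the loop bound repaired. The class and method names (ShrinkingPrefixes, sharedWithAnotherWord, shortestPrefixes) are made up for illustration and do not come from the question. It also swaps the substring lists for String.startsWith, which is an assumption about what the comparison was meant to achieve rather than a line-by-line fix of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ShrinkingPrefixes {

    // True if some word other than input.get(self) starts with the candidate prefix.
    static boolean sharedWithAnotherWord(List<String> input, int self, String candidate) {
        // The index runs over the word list, so it is bounded by the list size.
        for (int other = 0; other < input.size(); other++) {
            if (other != self && input.get(other).startsWith(candidate)) {
                return true;
            }
        }
        return false;
    }

    static List<String> shortestPrefixes(List<String> input) {
        List<String> output = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {
            String prefix = input.get(i);
            // Drop the last character for as long as the shorter prefix
            // is not also a prefix of any other word.
            while (prefix.length() > 1
                    && !sharedWithAnotherWord(input, i, prefix.substring(0, prefix.length() - 1))) {
                prefix = prefix.substring(0, prefix.length() - 1);
            }
            output.add(prefix);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "carbohydrate", "cart", "carburetor", "caramel", "caribou", "carbonic",
                "cartilage", "carbon", "carriage", "carton", "car", "carbonate");
        System.out.println(shortestPrefixes(words));
    }
}
```

The helper skips the current word through the other != self test, which is the comparison the original loops never exclude. The length guard keeps at least one character, so a word that shares nothing with the others shrinks to its first letter.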
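Since this is homework, I suggest a different approach that I think is easier. The idea is to start with one-letter prefixes and lengthen them until all prefixes are valid.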
Step 0:
input <- [carbohydrate, cart, carburetor, caramel, caribou, carbonic, cartilage, carbon, carriage, carton, car, carbonate]
output <- [ , , , , , , , , , , , ]
Step 1:
For each prefix in output, check whether it is valid (*1). If it is not, add one more char to it.
Step 2:
If some prefix was not valid, then go to step 1.
(*1) A prefix is valid if one of the following conditions holds: the prefix is unique in output, or you cannot add more chars (prefix == word).
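The validity rule (*1) can be written as a small predicate. The sketch below is only an illustration: the class name PrefixValidity and the method isValid are hypothetical, and it assumes output and input are parallel lists, so that output.get(i) is the current prefix of input.get(i).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PrefixValidity {

    // Rule (*1): a prefix is valid when it occurs exactly once in output,
    // or when it has already grown to the whole word.
    static boolean isValid(List<String> output, List<String> input, int i) {
        String prefix = output.get(i);
        boolean unique = Collections.frequency(output, prefix) == 1;
        boolean wholeWord = prefix.equals(input.get(i));
        return unique || wholeWord;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("cart", "carton", "car", "caramel");
        List<String> output = Arrays.asList("cart", "cart", "car", "cara");
        for (int i = 0; i < input.size(); i++) {
            System.out.println(output.get(i) + " -> " + isValid(output, input, i));
        }
        // prints:
        // cart -> true   (not unique, but it is the whole word "cart")
        // cart -> false  (not unique, and "carton" has more chars to add)
        // car -> true
        // cara -> true
    }
}
```

Collections.frequency rescans the list on every call, which is fine at this size; counting all prefixes once per pass in a map avoids the repeated scans.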
Hint: the output after each iteration is:
Iteration 0: [, , , , , , , , , , , ]
Iteration 1: [c, c, c, c, c, c, c, c, c, c, c, c]
Iteration 2: [ca, ca, ca, ca, ca, ca, ca, ca, ca, ca, ca, ca]
Iteration 3: [car, car, car, car, car, car, car, car, car, car, car, car]
Iteration 4: [carb, cart, carb, cara, cari, carb, cart, carb, carr, cart, car, carb]
Iteration 5: [carbo, cart, carbu, cara, cari, carbo, carti, carbo, carr, carto, car, carbo]
Iteration 6: [carboh, cart, carbu, cara, cari, carbon, carti, carbon, carr, carto, car, carbon]
Iteration 7: [carboh, cart, carbu, cara, cari, carboni, carti, carbon, carr, carto, car, carbona]
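Give it a try. I tested it with a small implementation, but since this is homework I don't think I should hand you the code directly ;) If you get stuck, leave a comment.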
Pseudocode
// Step 0: Initialize input and output
input <- [carbohydrate, cart, carburetor, caramel, caribou, carbonic, cartilage, carbon, carriage, carton, car, carbonate]
output <- [ , , , , , , , , , , , ]
allFound <- false; // Helper variable to know if all prefixes are valid
while (not allFound) { // Step 2 iteration: repeat until all prefixes are valid
    // For simplicity, we first count the occurrences of each prefix
    prefixCount <- empty map // Reset on every pass; you may use a Map for this
    for each prefix in output {
        prefixCount[prefix] <- prefixCount[prefix] + 1 // A missing entry counts as 0
    }
    // Step 1: For each prefix check whether it is valid; if not, add one more char
    allFound <- true;
    for (i = 0; i < output size; i++) {
        if not (prefixCount[output[i]] == 1 /* is unique */ OR output[i] equals input[i]) {
            // Prefix is not valid
            output[i] <- output[i] + next char from input[i]; // Add one more char
            allFound <- false; // Step 2 iteration condition "some prefix is not valid"
        }
    }
}
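For readers who want something runnable to check their own attempt against, the pseudocode above might translate to Java roughly as follows. This is a sketch, not the implementation the answer was tested with: the class name GrowingPrefixes and the method shortestPrefixes are invented here, and the counts are rebuilt at the start of every pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GrowingPrefixes {

    static List<String> shortestPrefixes(List<String> input) {
        // Step 0: every prefix starts out empty.
        List<String> output = new ArrayList<>(Collections.nCopies(input.size(), ""));
        boolean allFound = false;
        while (!allFound) {
            // Count the prefixes as they stand at the start of this pass.
            Map<String, Integer> prefixCount = new HashMap<>();
            for (String prefix : output) {
                prefixCount.merge(prefix, 1, Integer::sum);
            }
            // Step 1: extend every prefix that is neither unique nor the whole word.
            allFound = true;
            for (int i = 0; i < output.size(); i++) {
                String prefix = output.get(i);
                String word = input.get(i);
                boolean unique = prefixCount.get(prefix) == 1;
                if (!unique && !prefix.equals(word)) {
                    output.set(i, word.substring(0, prefix.length() + 1));
                    allFound = false; // Step 2: something changed, so run another pass
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "carbohydrate", "cart", "carburetor", "caramel", "caribou", "carbonic",
                "cartilage", "carbon", "carriage", "carton", "car", "carbonate");
        System.out.println(shortestPrefixes(words));
    }
}
```

Because uniqueness is judged against the counts taken before the pass modifies anything, each pass corresponds to one line of the iteration hint. One edge case worth knowing: a list containing a single word returns an empty prefix, since the empty string is already unique.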