I'm trying to create a LinkedList that removes duplicates during insertion. Here is my code:
import java.util.LinkedList;
import java.util.ListIterator;

LinkedList<WordInDocument> list = new LinkedList<WordInDocument>();

/**
 * Insert a word.
 *
 * @param insertWord The word to insert, together with its part of speech.
 * If you print "insertWord.word" you'll get the String of the word.
 */
public void insert(Word insertWord) {
    ListIterator<WordInDocument> listIt = list.listIterator();
    WordInDocument entry = new WordInDocument(insertWord);
    if (list.isEmpty()) {
        listIt.add(entry);
    }
    else {
        // Track the match explicitly: checking hasNext() alone would
        // re-add a word that equals the last element of the list.
        boolean found = false;
        while (listIt.hasNext()) {
            // Advance the iterator; stop as soon as an equal word is found.
            if (listIt.next().wordInEntry.word.equalsIgnoreCase(insertWord.word)) {
                found = true;
                break;
            }
        }
        // Only add the word (at the end) if no equal word was found.
        if (!found) {
            listIt.add(entry);
        }
    }
}
But with a lot of input this takes a very long time (about 1 minute). Is there a better way to remove duplicate values during insertion?
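Edit:
Thanks for the help. I solved it using what I call a "binary insert". This also keeps the list sorted after every insertion, which I had intended to do after the last insertion anyway. Here is my code: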
WordInDocument[] list = new WordInDocument[MAX_INDEX];
int currentMaxIndex = 0;

/**
 * Insert a word. Binary search finds where to put it; if the word doesn't
 * exist yet, it is inserted there. This can be called a "binary insert".
 * If the word already exists, it is ignored.
 *
 * @param insertword The word to insert, together with its part of speech.
 * If you print "insertword.word" you'll get the String of the word.
 */
public void insert(Word insertword) { // put element into array
    WordInDocument entry = new WordInDocument(insertword);
    // First element
    if (list[0] == null) {
        list[0] = entry;
        currentMaxIndex++;
        return;
    }
    int inputIndex = binaryInsert(insertword);
    // It belongs at the end
    if (list[inputIndex] == null) {
        list[inputIndex] = entry;
        currentMaxIndex++;
        return;
    }
    // It's equal to another word
    if (list[inputIndex].wordInEntry.word.equalsIgnoreCase(insertword.word)) {
        return;
    }
    // It's between two entries
    for (int i = currentMaxIndex; i > inputIndex; i--) { // move bigger ones one up
        list[i] = list[i - 1];
    }
    list[inputIndex] = entry;
    currentMaxIndex++;
}

private int binaryInsert(Word word) {
    int lowerBound = 0;
    int upperBound = currentMaxIndex - 1;
    while (true) {
        int mid = (upperBound + lowerBound) / 2;
        // Compare against the current middle element on every iteration.
        int compareStrings = list[mid].wordInEntry.word.compareToIgnoreCase(word.word);
        if (lowerBound == mid) {
            if (compareStrings > 0) {
                return mid;
            }
        }
        if (compareStrings < 0) {
            lowerBound = mid + 1; // it's in the upper half
            if (lowerBound > upperBound) {
                return mid + 1;
            }
        } else if (lowerBound > upperBound) {
            return mid;
        } else {
            upperBound = mid - 1; // it's in the lower half
        }
    }
}
It now takes 2 seconds instead of 45.
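For comparison, the same "binary insert" idea can be sketched with the standard library instead of a hand-written search. This is only an illustration under stated assumptions: the class name SortedWordList is hypothetical, and it stores plain Strings rather than WordInDocument entries. It relies on Collections.binarySearch, which returns a negative value encoding the insertion point when the key is not found.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; words are plain Strings here,
// not the WordInDocument type from the question.
class SortedWordList {
    private final List<String> words = new ArrayList<>();

    /** Inserts the word at its sorted position; returns false if it is already present (ignoring case). */
    boolean insert(String word) {
        int pos = Collections.binarySearch(words, word, String.CASE_INSENSITIVE_ORDER);
        if (pos >= 0) {
            return false; // an equal word already exists
        }
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        words.add(-pos - 1, word);
        return true;
    }

    List<String> asList() {
        return Collections.unmodifiableList(words);
    }

    public static void main(String[] args) {
        SortedWordList list = new SortedWordList();
        for (String w : new String[] {"pear", "Apple", "apple", "fig", "PEAR"}) {
            list.insert(w);
        }
        System.out.println(list.asList()); // prints [Apple, fig, pear]
    }
}
```

An ArrayList grows as needed, so no MAX_INDEX capacity has to be chosen up front. Inserting in the middle still shifts the following elements, just as the loop in the array version does.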
Answer 0: (score: 0)
If you still want to use the LinkedList data structure (not a Set) and are on Java 8, here is a simple and quick solution:
import java.util.LinkedList;
import java.util.Optional;

LinkedList<WordInDocument> list = new LinkedList<>();

public void insert(WordInDocument insertWord) {
    // try to find any matching element
    Optional<WordInDocument> optionalExistingWord =
        list
            .stream()
            .parallel()
            .filter(element -> element.word.equalsIgnoreCase(insertWord.word))
            .findAny();
    // if none is found, add the new element to the list
    if (!optionalExistingWord.isPresent()) {
        list.add(insertWord);
    }
}
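Note that a stream search may still have to visit every element on each insertion. If the LinkedList has to stay, one possible variation is to keep a HashSet of already-seen keys next to it, so the duplicate check no longer scans the list. The sketch below is an assumption-laden illustration, not part of the answer above: the class name DedupLinkedList is hypothetical, it stores plain Strings, and it uses lower-casing with Locale.ROOT as the key, which matches equalsIgnoreCase for typical text but not necessarily for every Unicode character.

```java
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; words are plain Strings here.
class DedupLinkedList {
    private final LinkedList<String> list = new LinkedList<>();
    // Lower-cased keys of every word already in the list (assumed good enough
    // as a stand-in for equalsIgnoreCase on ordinary text).
    private final Set<String> seen = new HashSet<>();

    /** Appends the word unless an equal word (ignoring case) was inserted before. */
    boolean insert(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        if (!seen.add(key)) {
            return false; // Set.add returns false when the key was already present
        }
        list.add(word);
        return true;
    }

    LinkedList<String> words() {
        return list;
    }
}
```

The list keeps insertion order, and the extra set costs one additional reference per distinct word.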
Answer 1: (score: 0)
You can remove the duplicates with a TreeSet:
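import java.util.Comparator;
import java.util.LinkedList;
import java.util.TreeSet;

LinkedList<WordInDocument> list = new LinkedList<>();
// The comparator treats words that differ only in case as equal,
// so the set keeps just one of them.
TreeSet<WordInDocument> set = new TreeSet<>(
    new Comparator<WordInDocument>() {
        @Override
        public int compare(WordInDocument d1, WordInDocument d2) {
            return d1.word.compareToIgnoreCase(d2.word);
        }
    });
set.addAll(list);
// Copy the de-duplicated, sorted entries back into a list.
list = new LinkedList<WordInDocument>(set);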
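This should be faster, because each insertion into a TreeSet costs O(log n) comparisons rather than a linear scan of the list.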
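The TreeSet can also be used directly while inserting, so duplicates never enter the collection in the first place. The following is a minimal sketch under assumptions: WordEntry is a hypothetical stand-in for WordInDocument with a public word field, and UniqueWords is a made-up wrapper name.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for WordInDocument, assumed to expose a "word" field.
class WordEntry {
    final String word;

    WordEntry(String word) {
        this.word = word;
    }
}

// Hypothetical wrapper that rejects duplicates at insertion time.
class UniqueWords {
    // Entries whose words differ only in case compare as equal.
    private static final Comparator<WordEntry> BY_WORD =
        (a, b) -> a.word.compareToIgnoreCase(b.word);

    private final TreeSet<WordEntry> set = new TreeSet<>(BY_WORD);

    /** Returns true if the entry was added, false if an equal word already existed. */
    boolean insert(WordEntry entry) {
        return set.add(entry);
    }

    int size() {
        return set.size();
    }

    /** Smallest word in case-insensitive order. */
    String firstWord() {
        return set.first().word;
    }
}
```

Iterating over the set yields the entries in sorted order, so no separate sorting pass is needed after the last insertion.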