Question

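I plan to implement a solution in Java that checks whether a user's password contains a dictionary word, whether in English, Spanish, German, or French.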

I have the word lists from here: ftp://ftp.openwall.com/pub/wordlists/languages/English/

I am considering a HashMap, or a cache such as Redis, that would hold every word in the dictionary as a key. That may not be very efficient, though.

What is the best way to implement this?

Answer 1

If this really is your requirement, I would suggest a trie data structure, which is well suited to fast word lookups in a dictionary.

A trie implementation is available in org.apache.commons.collections4. See https://commons.apache.org/proper/commons-collections/javadocs/api-release/org/apache/commons/collections4/Trie.html

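With a trie, you build it from your dictionary and keep it in memory. You then walk through the password and, for each starting position, check whether a substring beginning there can be looked up in the trie. If no lookup succeeds, no part of the password is in the dictionary.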
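Building the in-memory dictionary could look roughly like the sketch below. WordListLoader is a hypothetical helper, not part of Apache Commons. It assumes the word list has one word per line, treats lines starting with "#!comment:" as non-words (an assumption about the file format), and lower-cases everything.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Apache Commons.
final class WordListLoader {

    // Reads one word per line and returns a word -> word map.
    // Assumptions: blank lines and lines starting with "#!comment:" carry no word,
    // words shorter than minLength are skipped, and everything is lower-cased.
    static Map<String, String> load(Reader source, int minLength) {
        Map<String, String> words = new TreeMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (word.isEmpty() || word.startsWith("#!comment:") || word.length() < minLength) {
                    continue;
                }
                words.put(word, word);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

The resulting map can be handed to a map-accepting trie constructor such as new PatriciaTrie<>(map). For a file on disk, pass a Reader opened with whatever charset the list actually uses, for example via Files.newBufferedReader(path, charset).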

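Tries are very efficient at looking up string patterns because they use a tree-like structure: a lookup follows one branch per character, so its cost depends on the length of the key rather than on the number of words stored.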
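To make the tree structure concrete, here is a minimal hand-written trie. It is only a sketch for illustration: SimpleTrie and its methods are hypothetical names, and this is not how the Apache Commons PatriciaTrie is implemented. It shows why scanning is cheap: from each starting position the walk stops as soon as no dictionary word continues with the next character.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hand-written trie for illustration; not the Apache Commons API.
final class SimpleTrie {

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean endOfWord;
    }

    private final Node root = new Node();

    // Adds one dictionary word, creating one node per character as needed.
    void add(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.endOfWord = true;
    }

    // Returns every dictionary word of at least minLength characters found inside text.
    List<String> findWords(String text, int minLength) {
        List<String> matches = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            for (int end = start; end < text.length(); end++) {
                node = node.children.get(text.charAt(end));
                if (node == null) {
                    break; // no dictionary word continues with this character
                }
                if (node.endOfWord && end - start + 1 >= minLength) {
                    matches.add(text.substring(start, end + 1));
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        SimpleTrie trie = new SimpleTrie();
        for (String w : new String[] {"there", "is", "word", "here", "hell"}) {
            trie.add(w);
        }
        System.out.println(trie.findWords("hellothere", 3)); // prints [hell, there, here]
    }
}
```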

To use the Apache Commons trie in a Maven project, add this dependency:

<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-collections4 -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.0</version>
</dependency>

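Here is a simple toy example that finds dictionary words in the string "hellothere". Note that it also uses Guava's ImmutableMap, so Guava must be on the classpath in addition to commons-collections4: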

import com.google.common.collect.ImmutableMap;
import org.apache.commons.collections4.Trie;
import org.apache.commons.collections4.trie.PatriciaTrie;
import org.apache.commons.collections4.trie.UnmodifiableTrie;

import java.util.ArrayList;
import java.util.Map;
import java.util.stream.IntStream;

public class TrieDict {

    public static void main(String[] args) {
        Trie<String, String> trie = new UnmodifiableTrie<>(new PatriciaTrie<>(fillMap()));
        String pwd = "hellothere";
        System.out.println(extractDictMatches(trie, pwd));
    }

    // Provides a dictionary
    private static Map<String, String> fillMap() {
        return ImmutableMap.<String, String>builder().
                put("there", "there").
                put("is", "is").
                put("word", "word").
                put("here", "here").
                put("hell", "hell").
                build();
    }

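    // Collects every substring of pwd longer than 2 characters that is a key in the trie
    private static ArrayList<String> extractDictMatches(Trie<String, String> trie, String pwd) {
        return IntStream.range(0, pwd.length()).collect(ArrayList::new, (objects, i) -> {
            // All candidate words starting at position i are prefixes of this suffix
            String suffix = pwd.substring(i);
            IntStream.rangeClosed(0, suffix.length()).forEach(j -> {
                String suffixCut = suffix.substring(0, j);
                if (suffixCut.length() > 2 && trie.containsKey(suffixCut)) {
                    objects.add(suffixCut);
                }
            });
        }, ArrayList::addAll); // combiner merges partial results if the stream is ever run in parallel
    }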
}

This prints:

[hell, there, here]
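Two practical points follow from this output. The lookup is case-sensitive, so a password like "HelloThere" would produce no matches against a lower-case dictionary, and for validation you usually only need a yes/no answer. The sketch below is one way to get that, assuming a set of lower-case words; PasswordDictionaryCheck and containsDictionaryWord are hypothetical names, and the set could be a HashSet or the keySet() of the trie.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; names are not from any library.
final class PasswordDictionaryCheck {

    // Returns true if any substring of at least minLength characters is a dictionary word.
    // Assumption: lowerCaseWords holds lower-case entries only.
    static boolean containsDictionaryWord(Set<String> lowerCaseWords, String password, int minLength) {
        String pwd = password.toLowerCase(Locale.ROOT);
        for (int start = 0; start + minLength <= pwd.length(); start++) {
            for (int end = start + minLength; end <= pwd.length(); end++) {
                if (lowerCaseWords.contains(pwd.substring(start, end))) {
                    return true; // stop at the first hit
                }
            }
        }
        return false;
    }
}
```

A password of length n produces on the order of n * n substring lookups here, which is small for typical password lengths.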
