Question

Let's say I have the following:

john: [a, b, c, d]
bob:  [a, c, d, e]
mary: [a, b, e, f]

Or, slightly reformatted so that you can easily see the groupings:

john: [a, b, c, d]
bob:  [a,    c, d, e]
mary: [a, b,       e, f]

What is the most common or most efficient algorithm for generating the following groupings?

[john, bob, mary]: [a]
[john, mary]:      [b]
[john, bob]:       [c,d]
[bob, mary]:       [e]
[mary]:            [f]
[john]:            []
[bob]:             []

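After some quick googling, it seems that the keys above represent a "power set". So I was planning on the following:

1) Generate the power set {{j, b, m}, {j, m}, {j, b}, {b, m}, {m}, {j}, {b}} // j = john, b = bob, m = mary

2) Generate the set of all letters: {a, b, c, d, e, f}

3) Iterate over the subsets, and for each letter check whether it exists in every element of the subset

So...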

subset = {j, b, m}

letter = a
    j contains a? true
    b contains a? true
    m contains a? true
        * add a to subset {j, b, m}

letter = b
    j contains b? true
    b contains b? false, continue

letter = c
    j contains c? true
    b contains c? true
    m contains c? false, continue
.....

subset = {j, m}
.....

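Is there a better solution?

EDIT: The above algorithm is flawed. For example, {j, m} would also contain "a", which I don't want. I guess I can simply modify it so that in each iteration I also check that the letter is "NOT IN" the elements that are not in this subset. So in this case, I would also check: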

if b does not contain a
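
For reference, the corrected plan from the edit could be sketched in Python roughly as follows. This is only an illustration of the brute-force idea, not a recommended solution: the names `people` and `group_by_exact_subset` are made up here, and because the loop visits every non-empty subset of names, the cost grows exponentially with the number of names.

```python
from itertools import combinations

people = {
    "john": ["a", "b", "c", "d"],
    "bob": ["a", "c", "d", "e"],
    "mary": ["a", "b", "e", "f"],
}


def group_by_exact_subset(people):
    """Brute force: for every non-empty subset of names, collect the letters
    held by every member of the subset and by nobody outside it."""
    names = list(people)
    letters = sorted({x for items in people.values() for x in items})
    groups = {}
    # Enumerate the non-empty subsets, largest first.
    for size in range(len(names), 0, -1):
        for subset in combinations(names, size):
            members = set(subset)
            groups[subset] = [
                letter
                for letter in letters
                # "contains" for members, "does NOT contain" for non-members
                if all((letter in people[n]) == (n in members) for n in names)
            ]
    return groups


groups = group_by_exact_subset(people)
for subset, letters in groups.items():
    print(list(subset), letters)
```

The second half of the condition is the "NOT IN" check from the edit: without it, "a" would also show up under {john, mary}.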

Answer 1

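You can achieve this with two maps/dictionaries, one the "reverse" of the other. For the first map, the 'keys' would be the names and the 'values' would be lists of characters. The second map would have the letters as keys and the list of names associated with each letter as values.

In Python: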

nameDict = {'john' : ['a', 'b', 'c', 'd'], 'bob' : ['a', 'c', 'd', 'e'], 'mary' : ['a', 'b', 'e', 'f']}

reverseDict = {}
for key, values in nameDict.items():
    for v in values:
        # setdefault creates the list the first time v is seen
        reverseDict.setdefault(v, []).append(key)

# Aggregation: group the letters by the exact set of names that hold them
finalDict = {}
for key, values in reverseDict.items():
    finalDict.setdefault(frozenset(values), []).append(key)

Here, reverseDict contains the mappings you want: a -> [john, bob, mary], b -> [john, mary], and so on. You can also check "if john does not contain a" by checking whether the list returned by reverseDict['a'] contains john.
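
As a small sketch of that membership check (the helper name `contains` is made up for illustration, and the reverse map is rebuilt here so the snippet runs on its own):

```python
name_dict = {
    "john": ["a", "b", "c", "d"],
    "bob": ["a", "c", "d", "e"],
    "mary": ["a", "b", "e", "f"],
}

# Rebuild the letter -> names map so this snippet is self-contained.
reverse_dict = {}
for name, letters in name_dict.items():
    for letter in letters:
        reverse_dict.setdefault(letter, []).append(name)


def contains(name, letter):
    """True if `name` holds `letter`, looked up through the reverse map."""
    return name in reverse_dict.get(letter, [])


print(contains("john", "a"))  # True
print(contains("bob", "b"))   # False
```

If the lists of names get long, storing sets instead of lists as the values would make each lookup constant time on average.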

[EDIT] Added the aggregation in finalDict.

You can use frozensets as dictionary keys, so finalDict now contains the correct result. Printing out the dictionary:

frozenset({'bob', 'mary'})
['e']

frozenset({'mary'})
['f']

frozenset({'john', 'bob'})
['c', 'd']

frozenset({'john', 'mary'})
['b']

frozenset({'john', 'bob', 'mary'})
['a']

Answer 2

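Step 3 (iterating over the subsets) will be inefficient, because it performs either a "j contains a" or an "a not in j" check for every element of the power set.

Here is what I would suggest:

1) Generate the power set {{j, b, m}, {j, m}, {j, b}, {b, m}, {m}, {j}, {b}}. This step is not needed if you don't care about empty mappings in the final output.

2) Iterate over all the elements in the original data structure and build the following: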

[a] : [j, b, m]
[b] : [j, m]
[c] : [j, b]
[d] : [j, b]
[e] : [b, m]
[f] : [m]

3) Reverse the above structure and aggregate (a map from [j, b, ...] to a list of [a, b, ...] should do the trick) to get this:

[j, b, m] : [a]
[j, m] : [b]
[j, b] : [c, d]
[b, m] : [e]
[m] : [f]

4) Compare 3 with 1 to fill in the remaining empty mappings.
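
Steps 1 and 4 could look roughly like this in Python. It is only a sketch: the result of step 3 is written out by hand under the assumed name `grouped`, and `itertools.combinations` is used here as one possible way to enumerate the non-empty subsets.

```python
from itertools import combinations

names = ["john", "bob", "mary"]

# Result of step 3, written out by hand for this illustration.
grouped = {
    frozenset({"john", "bob", "mary"}): ["a"],
    frozenset({"john", "mary"}): ["b"],
    frozenset({"john", "bob"}): ["c", "d"],
    frozenset({"bob", "mary"}): ["e"],
    frozenset({"mary"}): ["f"],
}

# Step 1: every non-empty subset of the names.
power_set = [
    frozenset(combo)
    for size in range(1, len(names) + 1)
    for combo in combinations(names, size)
]

# Step 4: any subset that received no letters gets an empty list.
for subset in power_set:
    grouped.setdefault(subset, [])

for subset in power_set:
    print(sorted(subset), grouped[subset])
```

Only this fill-in step touches all 2^n - 1 subsets; steps 2 and 3 are linear in the size of the input, so skipping step 4 avoids the exponential part entirely.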

EDIT: Here is the full code in Java

    // Requires: import java.util.*; and import java.util.Map.Entry;
    // The original data structure. mapping from "john" to [a, b, c..]
    HashMap<String, HashSet<String>> originalMap = new HashMap<String, HashSet<String>>();

    // The final data structure. mapping from power set to [a, b...]
    HashMap<HashSet<String>, HashSet<String>> finalMap = new HashMap<HashSet<String>, HashSet<String>>();

    // Intermediate data structure. Used to hold [a] to [j,b...] mapping
    TreeMap<String, HashSet<String>> tmpMap = new TreeMap<String, HashSet<String>>();

    // Populate the original dataStructure
    originalMap.put("john", new HashSet<String>(Arrays.asList("a", "b", "c", "d")));
    originalMap.put("bob", new HashSet<String>(Arrays.asList("a", "c", "d", "e")));
    originalMap.put("mary", new HashSet<String>(Arrays.asList("a", "b", "e", "f")));

    // Hardcoding the power set below. You can generate the power set using the algorithm used in Google's Guava library.
    // powerSet function in https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/common/collect/Sets.java
    // If you don't care about empty mappings in the finalMap, then you don't even have to create the powerset
    finalMap.put(new HashSet<String>(Arrays.asList("john", "bob", "mary")), new HashSet<String>());
    finalMap.put(new HashSet<String>(Arrays.asList("john", "bob")), new HashSet<String>());
    finalMap.put(new HashSet<String>(Arrays.asList("bob", "mary")), new HashSet<String>());
    finalMap.put(new HashSet<String>(Arrays.asList("john", "mary")), new HashSet<String>());
    finalMap.put(new HashSet<String>(Arrays.asList("john")), new HashSet<String>());
    finalMap.put(new HashSet<String>(Arrays.asList("bob")), new HashSet<String>());
    finalMap.put(new HashSet<String>(Arrays.asList("mary")), new HashSet<String>());

    // Iterate over the original map to prepare the tmpMap.
    for(Entry<String, HashSet<String>> entry : originalMap.entrySet()) {
        for(String value : entry.getValue()) {
            HashSet<String> set = tmpMap.get(value);
            if(set == null) {
                set = new HashSet<String>();
                tmpMap.put(value, set);
            }
            set.add(entry.getKey());
        }
    }

    // Iterate over the tmpMap result and add the values to finalMap
    for(Entry<String, HashSet<String>> entry : tmpMap.entrySet()) {
        finalMap.get(entry.getValue()).add(entry.getKey());
    }

    // Print the output
    for(Entry<HashSet<String>, HashSet<String>> entry : finalMap.entrySet()) {
        System.out.println(entry.getKey() +" : "+entry.getValue());
    }

The output of the above code is:

[bob] : []
[john] : []
[bob, mary] : [e]
[bob, john] : [d, c]
[bob, john, mary] : [a]
[mary] : [f]
[john, mary] : [b]

Grouping items into subsets (power set)
