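How do I sort a list of full names by how common their ancestors are?

Time: 2018-09-19 09:57:40

Tags: algorithm

Suppose each letter below stands for a name. What is the best way to sort the full names by how common their ancestors are?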

A B C D
E F G H
I J K L
M N C D
O P C D
Q R C D
S T G H
U V G H
W J K L
X J K L

The result should be:

I J K L # Three shared names are more important than two shared names
W J K L
X J K L
A B C D # C D is repeated more than G H
M N C D
O P C D
Q R C D
E F G H
S T G H
U V G H
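
To make the two comments in the expected result concrete, here is a minimal sketch that counts how often each trailing run of names occurs in the sample input. It assumes that "ancestors" means the suffix of a full name (everything after the first word), which is inferred from the expected output rather than stated in the question; the helper name `suffix_counts` is hypothetical.

```python
from collections import Counter

def suffix_counts(names):
    """Count every proper suffix (trailing run of words) across all names."""
    counts = Counter()
    for name in names:
        parts = name.split()
        # Start at 1: the full name itself is not a shared ancestor chain.
        for i in range(1, len(parts)):
            counts[tuple(parts[i:])] += 1
    return counts

names = ["A B C D", "E F G H", "I J K L", "M N C D", "O P C D",
         "Q R C D", "S T G H", "U V G H", "W J K L", "X J K L"]
counts = suffix_counts(names)

print(counts[("J", "K", "L")])  # 3: a three-name chain shared by three people
print(counts[("C", "D")])       # 4: a two-name chain shared by four people
print(counts[("G", "H")])       # 3: a two-name chain shared by three people
```

The J K L group comes first because it is the only shared chain of length three; among the two-name chains, C D (4 occurrences) beats G H (3 occurrences).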

Edit:

Names may contain spaces (double names).

Consider the following example, where each letter stands for a word:

A B C D M
E F G H M
I J K L M
M N C D M
O P C D
Q R C D
S T G H
U V G H
W J K L
X J K L

The output should be:

A B C D M
M N C D M
I J K L M
E F G H M
W J K L
X J K L
O P C D
Q R C D
S T G H
U V G H

2 answers:

Answer 0 (score: 1)

First count how often each chain of ancestors occurs. Then rank each name by those counts. Try this:

from collections import defaultdict

words = """A B C D
E F G H
I J K L
M N C D
O P C D
Q R C D
S T G H
U V G H
W J K L
X J K L"""

words = words.split('\n')

# Count ancestors
counters = defaultdict(lambda: defaultdict(lambda: 0))
for word in words:
    parts = word.split()
    while parts:
        counters[len(parts)][tuple(parts)] += 1
        parts.pop(0)

# Calculate tuple of ranks, used for sorting
ranks = {}
for word in words:
    rank = []
    parts = word.split()
    while parts:
        rank.append(counters[len(parts)][tuple(parts)])
        parts.pop(0)
    ranks[word] = tuple(rank)

# Sort by ancestor count, longest chain comes first
words.sort(key=lambda word: ranks[word], reverse=True)
print(words)
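
The same idea can be packaged as a reusable function, which makes the rank tuples easier to reason about. This is only a sketch: the name `sort_by_ancestors` is hypothetical, `collections.Counter` is swapped in for the nested `defaultdict`, and it assumes every name has the same number of words. With mixed lengths the positions in the rank tuples would no longer refer to chains of the same length.

```python
from collections import Counter

def sort_by_ancestors(names):
    """Return names ordered so that longer, more frequent shared suffixes come first."""
    split = [tuple(name.split()) for name in names]
    # Count every suffix of every name, including the full name itself.
    counts = Counter(parts[i:] for parts in split for i in range(len(parts)))

    def rank(name):
        parts = tuple(name.split())
        # One count per suffix, longest suffix first.
        return tuple(counts[parts[i:]] for i in range(len(parts)))

    # sorted() stays stable with reverse=True, so ties keep their input order.
    return sorted(names, key=rank, reverse=True)

names = ["A B C D", "E F G H", "I J K L", "M N C D", "O P C D",
         "Q R C D", "S T G H", "U V G H", "W J K L", "X J K L"]
for name in sort_by_ancestors(names):
    print(name)
```

For the sample input, "I J K L" gets the rank (1, 3, 3, 3), "A B C D" gets (1, 1, 4, 4) and "E F G H" gets (1, 1, 3, 3), so a descending tuple comparison yields the order requested in the question.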

Answer 1 (score: 0)

Here is how you could do it in Java. It is essentially the same as @fafl's solution:

// Needs: import java.util.*; (both snippets are members of an enclosing class)
static List<Name> sortNames(String[] input)
{
  List<Name> names = new ArrayList<>();
  for (String name : input)
    names.add(new Name(name));

  Map<String, Integer> partCount = new HashMap<>();
  for (Name name : names)
    for (String part : name.parts)
      partCount.merge(part, 1, Integer::sum);

  for (Name name : names)
    for (String part : name.parts)
      name.counts.add(partCount.get(part));

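  // Sort by suffix counts, highest first, starting with the longest suffix.
  Collections.sort(names, new Comparator<Name>()
  {
    @Override
    public int compare(Name n1, Name n2)
    {
      // Stop at the shorter list so that names with different word counts
      // do not cause an IndexOutOfBoundsException.
      int n = Math.min(n1.counts.size(), n2.counts.size());
      for (int c, i = 0; i < n; i++)
        if ((c = Integer.compare(n2.counts.get(i), n1.counts.get(i))) != 0)
          return c;
      // Assumption: on a tie over the shared positions, the longer name goes first.
      return Integer.compare(n2.counts.size(), n1.counts.size());
    }
  });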
  return names;
}

static class Name
{
  List<String> parts = new ArrayList<>();
  List<Integer> counts = new ArrayList<>();

  Name(String name)
  {
    List<String> s = Arrays.asList(name.split("\\s+"));
    for (int i = 0; i < s.size(); i++)
      parts.add(String.join(" ", s.subList(i, s.size())));
  }
}

Test:

public static void main(String[] args)
{
  String[] input = { 
      "A B C D", 
      "W J K L", 
      "E F G H", 
      "I J K L", 
      "M N C D", 
      "O P C D", 
      "Q R C D", 
      "S T G H",
      "U V G H", 
      "X J K L" };

  for (Name name : sortNames(input))
    System.out.println(name.parts.get(0));
}

Output:

I J K L
W J K L
X J K L
A B C D
M N C D
O P C D
Q R C D
E F G H
S T G H
U V G H