Take an array of string pairs and return an array of all connected items

Time: 2018-05-21 18:35:38

Tags: arrays ruby algorithm time-complexity

I have an array of name pairs, like:

[
  ['alison', 'jason'],
  ['alison', 'chris'],
  ['john', 'bill'],
  ['bill', 'alex'],
  ['alex', 'jack']
]

I'm trying to write a method that takes this array and returns

[
  ['alison', 'jason', 'chris'],
  ['john', 'bill', 'alex', 'jack']
]

in better than O(N^2) time. My attempt looks like this:

def teams(arr)
  pair_hash = {}
  arr.each do |pair|
    if pair_hash[pair[0]].nil?
      pair_hash[pair[0]] = [pair[1]]
    else
      pair_hash[pair[0]].push(pair[1])
    end
  end
  teams = []
  pair_hash.map do |leader, team|
    teams.push(find_teammates(pair_hash, leader, team))
  end
  teams
end

def find_teammates(hash, leader, team)
  result = [leader]
  team.each do |member|
    if hash[member].nil?
      result += [member]
    else
      result += find_teammates(hash, member, hash[member])
    end
  end
  result
end

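But this solution has extra teams in the result, and every fix I can think of involves really bad time complexity. If you have any ideas on how to solve this without brute-forcing all the pairs, I'd love to hear them.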

1 Answer:

Answer 0: (score: 4)

You're in luck, disjoint-set is my favorite data structure. Here's a quick and dirty implementation:

pairs = [['alison', 'jason'], ['alison', 'chris'], ['john', 'bill'], ['bill', 'alex'], ['alex', 'jack'], ['steve', 'alex']]

parents = {}

pairs.each do |x, y|
  # each person starts as their own set, and their own representative
  parents[x] ||= x
  parents[y] ||= y

  # find representative of x set
  x_parent = parents[x]
  loop do
    break if parents[x_parent] == x_parent
    x_parent = parents[x_parent]
  end

  # find representative of y set
  y_parent = parents[y]
  loop do
    break if parents[y_parent] == y_parent
    y_parent = parents[y_parent]
  end

  # union by changing y's representative
  parents[y_parent] = x_parent
  # path compression to speed up later unions
  parents[x] = x_parent
  parents[y] = x_parent
end

# group by set representative (some paths might not be compressed)
groups = parents.each_key.group_by do |person|
  parent = parents[person]
  loop do
    break if parents[parent] == parent
    parent = parents[parent]
  end
  parent
end

p groups.values
[["alison", "jason", "chris"], ["john", "bill", "alex", "jack", "steve"]]

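This is roughly O(N + M), where N is the number of people and M is the number of pairs. Note the repetition, e.g. the loop that finds a set's representative appears three times. The algorithm would look cleaner if you defined a proper class to handle the lookups.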
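As a rough sketch of what such a class might look like: the `DisjointSet` name and its `add`, `find`, `union` and `groups` methods are my own choices for illustration, not anything from a library. It does the same unions as the code above, just with the representative lookup written once.

```ruby
# Hypothetical DisjointSet class; name and method names are illustrative only.
class DisjointSet
  def initialize
    @parents = {}
  end

  # Register an item as its own one-element set if it has not been seen yet.
  def add(item)
    @parents[item] ||= item
  end

  # Follow parent links until reaching the set's representative.
  def find(item)
    item = @parents[item] until @parents[item] == item
    item
  end

  # Merge the sets containing x and y by re-pointing y's representative.
  def union(x, y)
    add(x)
    add(y)
    @parents[find(y)] = find(x)
  end

  # Group every known item by its set representative.
  def groups
    @parents.keys.group_by { |item| find(item) }.values
  end
end

def teams(pairs)
  set = DisjointSet.new
  pairs.each { |x, y| set.union(x, y) }
  set.groups
end
```

Calling `teams` with the five pairs from the question should give the two groups the question asks for. This version does no path compression at all, so it trades some speed for readability.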

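Also, my path compression isn't ideal. You can speed things up by doing the compression in the representative lookup rather than in the union.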
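Something along these lines is what I mean by compressing during the lookup. `find_root` is a made-up helper name, and it assumes a `parents` hash shaped like the one above, where every representative maps to itself. Treat it as a sketch: to get the usual near-linear bound you would normally also pair it with union by rank or size, which is not shown here.

```ruby
# Hypothetical helper: returns the representative of `item` and compresses
# the path by pointing every node visited on the way directly at it.
def find_root(parents, item)
  # First pass: walk up to the representative.
  root = item
  root = parents[root] until parents[root] == root

  # Second pass: re-point each node on the path at the representative.
  until parents[item] == root
    next_item = parents[item]
    parents[item] = root
    item = next_item
  end

  root
end

# Example: a chain a -> b -> c -> d, where d is the representative.
parents = { 'a' => 'b', 'b' => 'c', 'c' => 'd', 'd' => 'd' }
root = find_root(parents, 'a')
```

After the call, `root` is `'d'` and `'a'`, `'b'` and `'c'` all point straight at `'d'`, so any later lookup on them takes a single step.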