Question

I have the following table:

relations

[id,user_id,status]
1,2,sent_reply
1,2,sent_mention
1,3,sent_mention
1,4,sent_reply
1,4,sent_mention

I'm looking for a way to remove the duplicates so that only the following rows remain:

1,2,sent_reply
1,3,sent_mention
1,4,sent_reply

(preferably using Rails)

Answer 1

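I know this is very late, but I found a good way to do it with Rails 3. There may well be better approaches, and I don't know how this performs with 100,000+ rows, but it should put you on the right track.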

# Get a hash of all id/user_id pairs and how many records of each pair
counts = ModelName.group([:id, :user_id]).count
# => {[1, 2]=>2, [1, 3]=>1, [1, 4]=>2}

# Keep only those pairs that have more than one record
dupes = counts.select{|attrs, count| count > 1}
# => {[1, 2]=>2, [1, 4]=>2}

# Map objects by the attributes we have
object_groups = dupes.map do |attrs, count|
  ModelName.where(:id => attrs[0], :user_id => attrs[1])
end

# Take each group and #destroy the records you want.
# Or call #delete instead to save time if you don't need ActiveRecord callbacks
# Here I'm just keeping the first one I find.
object_groups.each do |group|
  group.each_with_index do |object, index|
    object.destroy unless index == 0
  end
end
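To make the grouping logic above easier to check without a database, here is a minimal plain-Ruby sketch of the same idea: group rows by the id/user_id pair, keep the first row of each group, and mark the rest for removal. `Row`, `ROWS` and `split_duplicates` are hypothetical names used only for illustration; they are not part of Rails or of the original answer.

```ruby
# Hypothetical in-memory stand-in for the table rows (not an ActiveRecord model).
Row = Struct.new(:id, :user_id, :status)

# Sample data copied from the question.
ROWS = [
  Row.new(1, 2, "sent_reply"),
  Row.new(1, 2, "sent_mention"),
  Row.new(1, 3, "sent_mention"),
  Row.new(1, 4, "sent_reply"),
  Row.new(1, 4, "sent_mention"),
]

# Split rows into [keep, remove]: the first row of each id/user_id pair is kept,
# every later row with the same pair is marked for removal.
def split_duplicates(rows)
  groups = rows.group_by { |r| [r.id, r.user_id] }
  keep = groups.values.map(&:first)
  remove = groups.values.flat_map { |group| group.drop(1) }
  [keep, remove]
end

keep, remove = split_duplicates(ROWS)
keep.each { |r| puts [r.id, r.user_id, r.status].join(",") }
```

On the sample data this keeps the three rows the question asks for. Which row counts as "first" depends on the order the rows arrive in, so with a real query you would probably want an explicit ordering.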

Answer 2

This is best done in SQL. But if you prefer to use Rails:

(Relation.all - Relation.all.uniq_by{|r| [r.user_id, r.status]}).each{ |d| d.destroy }

or

 ids = Relation.all.uniq_by{|r| [r.user_id, r.status]}.map(&:id)
 Relation.where("id NOT IN (?)", ids).destroy_all # or delete_all, which is faster

But I don't like this solution :D
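A note on `uniq_by`: it came from ActiveSupport and, as far as I know, was deprecated in Rails 3.2 and dropped in Rails 4 in favor of Ruby's own `Array#uniq` with a block. The sketch below shows the same "everything minus the unique rows" idea in plain Ruby. `Rec`, `RECS`, `duplicate_pks` and the `pk` column are hypothetical: `pk` assumes a unique primary key, which the sample table in the question does not show.

```ruby
# Hypothetical in-memory record; pk is an assumed unique primary key.
Rec = Struct.new(:pk, :user_id, :status)

RECS = [
  Rec.new(10, 2, "sent_reply"),
  Rec.new(11, 2, "sent_mention"),
  Rec.new(12, 3, "sent_mention"),
  Rec.new(13, 4, "sent_reply"),
  Rec.new(14, 4, "sent_mention"),
]

# Return the pks of every record that is NOT the first one seen for its key.
# The block decides what counts as a duplicate.
def duplicate_pks(records, &key)
  keepers = records.uniq(&key)
  (records - keepers).map(&:pk)
end

# Keyed on user_id alone: the later row of each user is a duplicate.
p duplicate_pks(RECS) { |r| r.user_id }

# Keyed on [user_id, status]: every pair in the sample data is distinct,
# so nothing is reported as a duplicate.
p duplicate_pks(RECS) { |r| [r.user_id, r.status] }
```

The second call suggests that, for the sample data in the question, keying on `[user_id, status]` would delete nothing, so the key probably needs to be `user_id` (or `[id, user_id]`) to get the rows the question asks for.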

How do I remove duplicates in MySQL using Rails?
