Most efficient way to find the overlap between two tables

Time: 2019-03-29 19:57:35

Tags: sql ruby-on-rails ruby postgresql

Given two tables, where the "title" column is neither sorted nor unique:

Book
|id|title |
|1 |book_1|
|2 |book_2|
|3 |book_3|
|4 |book_4|
|5 |book_5|
|6 |book_5|
|7 |book_5|
|8 |book_6|
|9 |book_7|

UserBook
|user_id|book_id|state        |title  |
|1      |2      |"in progress"|book_2 |
|1      |4      |"completed"  |book_4 |
|1      |6      |"completed"  |book_5 |
|2      |3      |"completed"  |book_3 |
|2      |6      |"completed"  |book_5 |
|3      |1      |"completed"  |book_1 |
|3      |2      |"completed"  |book_2 |
|3      |4      |"completed"  |book_4 |
|3      |7      |"in progress"|book_5 |
|3      |8      |"completed"  |book_6 |
|3      |9      |"completed"  |book_7 |

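I want to create a binary matrix of users (rows) against distinct book titles (columns), where a cell is 1 if the user has that title with the state "completed":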

[0, 0, 0, 1, 1, 0, 0]
[0, 0, 1, 0, 1, 0, 0]
[1, 1, 0, 1, 0, 1, 1]

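The code below gets me the result I want, but its algorithmic complexity is high. I am hoping to get the result via SQL instead.

Would it be much simpler if state were a boolean and the titles were unique?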

matrix = []
User.all.each do |user|
  # Distinct titles in ascending order; these are the matrix columns.
  books = Book.distinct.order(title: :asc).pluck(:title).uniq
  # Titles this user has completed.
  user_books = UserBook.where(user: user, state: "completed").order(title: :asc).pluck(:title)
  matrix << books.map { |v| user_books.include?(v) ? 1 : 0 }
end

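4 answers:

Answer 0: (score: 2)

SQL is not very good with matrices, but you can store the values as (x, y) pairs. You want the 0 values as well as the 1s, so the idea is to generate every book/user row with a cross join and then bring in the existing data with a left join: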

select b.id as book_id, u.id as user_id,
       (case when ub.user_id is not null then 1 else 0 end) as is_completed
from books b cross join
     users u left join
     user_books ub
     on ub.user_id = u.id and
        ub.book_id = b.id and
        ub.state = 'completed';
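The (x, y) pairs still have to be pivoted if an actual matrix is wanted. As a minimal sketch in plain Ruby, assuming the sample data from the question is held in memory (the constants `BOOKS`, `USER_IDS`, `COMPLETED` and the helpers `completion_pairs` and `pivot_by_title` are hypothetical names, not part of the answer), the cross join can be mimicked and its rows folded into one row per user and one column per distinct title:

```ruby
require "set"

# Sample data copied from the question's tables.
BOOKS = { 1 => "book_1", 2 => "book_2", 3 => "book_3", 4 => "book_4",
          5 => "book_5", 6 => "book_5", 7 => "book_5", 8 => "book_6",
          9 => "book_7" }.freeze
USER_IDS = [1, 2, 3].freeze
# [user_id, book_id] pairs whose state is "completed".
COMPLETED = Set[[1, 4], [1, 6], [2, 3], [2, 6],
                [3, 1], [3, 2], [3, 4], [3, 8], [3, 9]].freeze

# Mimics the cross join plus left join: one [book_id, user_id, flag] row
# for every book/user combination.
def completion_pairs
  BOOKS.keys.product(USER_IDS).map do |book_id, user_id|
    [book_id, user_id, COMPLETED.include?([user_id, book_id]) ? 1 : 0]
  end
end

# Pivots the long-format rows into one matrix row per user, with one
# column per distinct title (a title counts if any of its ids is flagged).
def pivot_by_title(pairs)
  titles = BOOKS.values.uniq.sort
  done = Hash.new { |hash, user_id| hash[user_id] = Set.new }
  pairs.each { |book_id, user_id, flag| done[user_id] << BOOKS[book_id] if flag == 1 }
  USER_IDS.map { |user_id| titles.map { |title| done[user_id].include?(title) ? 1 : 0 } }
end

matrix = pivot_by_title(completion_pairs)
```

With the question's data this yields the three rows shown in the question. The pivot is a single pass over the pairs, so the cost is dominated by the size of the cross join itself.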

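Answer 1: (score: 1)

You can group UserBook by user_id and use an aggregate function to select the list of book titles in each group. The whole snippet is as follows: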

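# Distinct titles in ascending order; these are the matrix columns.
books = Book.order(title: :asc).pluck(:title).uniq
matrix = []
# One result row per user, carrying that user's completed titles as a
# comma-separated string. Ordering by user_id keeps the matrix rows stable.
UserBook.where(state: "completed")
        .select("string_agg(title, ',') as grouped_name")
        .group(:user_id)
        .order(:user_id)
        .each do |group|
  user_books = group.grouped_name.split(',')
  matrix << books.map { |title| user_books.include?(title) ? 1 : 0 }
end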

In MySQL, you need to replace string_agg(title, ',') with GROUP_CONCAT(title).
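One caveat with aggregating into a comma-separated string and splitting it again: any title that itself contains a comma is split into pieces. The plain-Ruby sketch below illustrates this; the title "Eats, Shoots" and the choice of the ASCII unit separator as delimiter are assumptions made for the illustration, not something from the answer.

```ruby
# Hypothetical titles; one of them contains a comma.
titles = ["book_1", "Eats, Shoots", "book_3"]

# What string_agg(title, ',') followed by split(',') effectively does.
naive = titles.join(",").split(",")

# Joining on a character assumed never to occur in a title keeps them intact.
SEPARATOR = "\u001F" # ASCII unit separator, an assumption for this sketch
safe = titles.join(SEPARATOR).split(SEPARATOR)

p naive.size # 4 pieces instead of 3 titles
p safe == titles
```

In Postgres, string_agg accepts any delimiter string, and array_agg(title) is another option that avoids a delimiter altogether.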

Answer 2: (score: 1)

If you would consider producing the desired array in Ruby rather than SQL, first read the data from the table Book into an array book:

book = [
  [1, "book_1"], [2, "book_2"], [3, "book_3"], [4, "book_4"],
  [5, "book_5"], [6, "book_5"], [7, "book_5"], [8, "book_6"],
  [9, "book_7"]
] 

and the data from the table UserBook into an array user_book:

user_book = [
  [1, 2, :in_progress], [1, 4, :completed], [1, 6, :completed],
  [2, 3, :completed],   [2, 6, :completed],
  [3, 1, :completed],   [3, 2, :completed], [3, 4, :completed], [3, 7, :in_progress],
  [3, 8, :completed],   [3, 9, :completed]
] 

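Note that the first element of each element of book is the integer book_id, and the first two elements of each element of user_book are the integers user_id and book_id, respectively.

You can then construct the desired array as follows: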

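A minimal sketch of one way to build the matrix from these two arrays, not necessarily the answerer's own code. The arrays are repeated so the block runs on its own, and the helper name `completion_matrix` is hypothetical.

```ruby
require "set"

book = [
  [1, "book_1"], [2, "book_2"], [3, "book_3"], [4, "book_4"],
  [5, "book_5"], [6, "book_5"], [7, "book_5"], [8, "book_6"],
  [9, "book_7"]
]

user_book = [
  [1, 2, :in_progress], [1, 4, :completed], [1, 6, :completed],
  [2, 3, :completed],   [2, 6, :completed],
  [3, 1, :completed],   [3, 2, :completed], [3, 4, :completed], [3, 7, :in_progress],
  [3, 8, :completed],   [3, 9, :completed]
]

# Builds one row per user (ordered by user_id) and one column per distinct
# title (sorted); a cell is 1 when the user completed any book with that title.
def completion_matrix(book, user_book)
  title_by_id = book.to_h
  titles = title_by_id.values.uniq.sort
  completed = Hash.new { |hash, user_id| hash[user_id] = Set.new }
  user_book.each do |user_id, book_id, state|
    completed[user_id] # touch the key so users with nothing completed still get a row
    completed[user_id] << title_by_id[book_id] if state == :completed
  end
  completed.keys.sort.map do |user_id|
    titles.map { |title| completed[user_id].include?(title) ? 1 : 0 }
  end
end

matrix = completion_matrix(book, user_book)
matrix.each { |row| p row }
# [0, 0, 0, 1, 1, 0, 0]
# [0, 0, 1, 0, 1, 0, 0]
# [1, 1, 0, 1, 0, 1, 1]
```

Each array is traversed once and membership is checked against a Set, so the work beyond filling the users-by-titles grid is linear in the input size.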

Answer 3: (score: 0)

Straight SQL:

select * from books join user_books on (books.id = user_books.book_id)
where user_books.state = 'completed';

In Ruby ActiveRecord:

Book.joins(:user_books).where(user_books: { state: 'completed' })