Query to find topics by tag

Asked: 2010-07-30 06:43:54

Tags: sql ruby-on-rails mysql

I want to search data in my application, laid out as below

topic_id   tag
1          cricket
1          football
2          football
2          basketball
3          cricket
3          basketball
4          chess
4          basketball

Now, when I search for the terms cricket AND football, the output should be

 topic_id
    1

When I search for the terms cricket OR football, the output should be

 topic_id
    1
    2
    3

I tried something like the following

FOR AND

  select topic_id from table_name where tag like "%cricket%" and topic_id in (select topic_id from table_name where tag like "%football%")

FOR OR

 select topic_id from table_name where tag like "%cricket%" OR tag like "%football%"

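My problem is that when the user searches for cricket AND football AND basketball AND chess, my query becomes very unwieldy.

Is there a simple solution? I also tried GROUP_CONCAT, but in vain.

4 answers:

Answer 0 (score: 4)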

 SELECT topic_id
 FROM table_name
 WHERE tag IN ('cricket', 'football', 'basketball', 'chess')
 GROUP BY topic_id
 HAVING COUNT(*) = 4

4 is the magic number: it is the length of your AND list. This relies on each topic carrying a given tag at most once.

For cricket AND football, it will be 2:

 SELECT topic_id
 FROM table_name
 WHERE tag IN ('cricket', 'football')
 GROUP BY topic_id
 HAVING COUNT(*) = 2
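
The count in HAVING can be derived from the tag list instead of being typed by hand. The following is a minimal sketch, not part of the original answer: `and_query` is a hypothetical helper, the names `table_name`, `topic_id`, and `tag` are taken from the question, and `COUNT(DISTINCT tag)` is swapped in for `COUNT(*)` so that duplicate rows would not inflate the count.

```ruby
# Hypothetical helper: builds the GROUP BY / HAVING query for any AND list.
# Returns the SQL text with "?" placeholders plus the values to bind to them.
def and_query(tags)
  tags = tags.uniq # a repeated search term must not raise the expected count
  raise ArgumentError, "need at least one tag" if tags.empty?

  placeholders = (["?"] * tags.size).join(", ")
  sql = "SELECT topic_id FROM table_name " \
        "WHERE tag IN (#{placeholders}) " \
        "GROUP BY topic_id " \
        "HAVING COUNT(DISTINCT tag) = #{tags.size}"
  [sql, tags]
end

sql, binds = and_query(%w[cricket football])
puts sql
puts binds.inspect
```

Because the tags travel as bind values rather than being interpolated into the string, the same helper works for any number of search terms.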

If you want to use LIKE:

 SELECT topic_id
 FROM table_name
 WHERE tag IN (SELECT DISTINCT tag FROM table_name WHERE tag LIKE '...'
                OR tag LIKE '...'
                OR tag LIKE '...'
                OR tag LIKE '...'
              )
 GROUP BY topic_id
 HAVING COUNT(*) = (SELECT COUNT(DISTINCT tag) FROM table_name
                    WHERE tag LIKE '...'
                       OR tag LIKE '...'
                       OR tag LIKE '...'
                       OR tag LIKE '...'
                   )

Update

This task is easily solved with an RDBMS that supports all the set operations: UNION, INTERSECT, and EXCEPT (or MINUS).

Then any condition such as:

  1. (Tag1 AND Tag2) OR Tag3 NOT Tag4
  2. Tag1 OR Tag2
  3. Tag1 AND Tag2 AND Tag3
  4. (Tag1 AND Tag2) OR (Tag3 AND Tag4)

can easily be translated into:

    1. (Select * ... Where Tag = Tag1
        INTERSECT
        Select * ... Where Tag = Tag2
       )
       UNION
       (Select * ... Where Tag = Tag3)
       EXCEPT
       (Select * ... Where Tag = Tag4)
    
    2. Select * ... Where Tag = Tag1
       UNION
       Select * ... Where Tag = Tag2
    
    3. Select * ... Where Tag = Tag1
       INTERSECT
       Select * ... Where Tag = Tag2
       INTERSECT
       Select * ... Where Tag = Tag3
    
    4. (Select * ... Where Tag = Tag1
        INTERSECT
        Select * ... Where Tag = Tag2
       )
       UNION
       (Select * ... Where Tag = Tag3
        INTERSECT
        Select * ... Where Tag = Tag4
       )
    

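The real problem is that MySQL does not support INTERSECT, so it has to be emulated as shown above. The second problem is respecting parentheses and operator precedence.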
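
To see how the set operators combine, they can be evaluated in memory on the question's data. This is only an illustrative sketch: `rows` and `topics_for` are hypothetical names, and Ruby's `Set` operators `&`, `|`, and `-` stand in for INTERSECT, UNION, and EXCEPT.

```ruby
require "set"

# The question's data as [topic_id, tag] pairs.
rows = [
  [1, "cricket"],  [1, "football"],
  [2, "football"], [2, "basketball"],
  [3, "cricket"],  [3, "basketball"],
  [4, "chess"],    [4, "basketball"]
]

# Hypothetical helper: the set of topic ids that carry one tag,
# i.e. what "SELECT topic_id ... WHERE tag = <tag>" would return.
def topics_for(rows, tag)
  rows.select { |_id, t| t == tag }.map(&:first).to_set
end

cricket    = topics_for(rows, "cricket")
football   = topics_for(rows, "football")
basketball = topics_for(rows, "basketball")
chess      = topics_for(rows, "chess")

both   = cricket & football                           # INTERSECT
either = cricket | football                           # UNION
mixed  = ((cricket & football) | chess) - basketball  # (.. INTERSECT ..) UNION .. EXCEPT ..

puts both.to_a.sort.inspect    # [1]
puts either.to_a.sort.inspect  # [1, 2, 3]
puts mixed.to_a.sort.inspect   # [1]
```

The explicit parentheses in `mixed` mirror the grouping of condition 1, which is exactly the part that has to be preserved when the expression is turned into SQL.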

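A possible solution when the expression contains no parentheses:

1. Collect all the tags joined by AND and build the query as in the first example in this answer.

2. Add all the tags joined by OR (IN or UNION can be used) and combine the results with UNION.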

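3. Another approach is possible only when there are no more than 64 tags. Each tag then gets its own bit (you need to add a bigint field 'tags' to the topics table, holding the topic's tags in binary form), and the query is built with MySQL bit operations.

The big drawback of this solution is that it is limited to 64 tags.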
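
The bit approach can be sketched in memory as follows. Everything here is an assumption made for illustration: the tag-to-bit assignment in `tag_bits`, the helper names `mask_for`, `match_all`, and `match_any`, and the `topics` hash standing in for the bigint column.

```ruby
# Hypothetical assignment of one bit position per tag (at most 64 with a bigint).
tag_bits = { "cricket" => 0, "football" => 1, "basketball" => 2, "chess" => 3 }

# Combine a list of tags into one integer mask.
def mask_for(tag_bits, tags)
  tags.reduce(0) { |mask, tag| mask | (1 << tag_bits.fetch(tag)) }
end

# AND search: every wanted bit must be set. In SQL: WHERE (tags & ?) = ?
def match_all(topics, wanted)
  topics.select { |_id, mask| (mask & wanted) == wanted }.keys
end

# OR search: at least one wanted bit must be set. In SQL: WHERE (tags & ?) <> 0
def match_any(topics, wanted)
  topics.select { |_id, mask| (mask & wanted) != 0 }.keys
end

# topic_id => mask, mirroring the question's data.
topics = {
  1 => mask_for(tag_bits, %w[cricket football]),
  2 => mask_for(tag_bits, %w[football basketball]),
  3 => mask_for(tag_bits, %w[cricket basketball]),
  4 => mask_for(tag_bits, %w[chess basketball])
}

wanted = mask_for(tag_bits, %w[cricket football])
puts match_all(topics, wanted).inspect  # [1]
puts match_any(topics, wanted).inspect  # [1, 2, 3]
```

Both searches reduce to a single comparison per row, however many tags are in the search, which is the attraction of this layout despite the 64-tag ceiling.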

Answer 1 (score: 1)

You need a self-join

select distinct t1.topic_id from
table_name as t1
join
table_name as t2
on
t1.topic_id = t2.topic_id
and
t1.tag = 'cricket'
and
t2.tag = 'football'

Answer 2 (score: 0)

a AND b AND c AND d:

SELECT t1.topic_id
FROM tags_table AS t1
INNER JOIN tags_table AS t2
ON t2.topic_id = t1.topic_id AND t2.tag = 'b'
INNER JOIN tags_table AS t3
ON t3.topic_id = t1.topic_id AND t3.tag = 'c'
INNER JOIN tags_table AS t4
ON t4.topic_id = t1.topic_id AND t4.tag = 'd'
WHERE t1.tag = 'a'

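Unfortunately, the OR condition is harder. A full outer join would be handy, but MySQL lacks this feature.

I suggest you make sure there is no OR inside parentheses (not (a OR b) AND c, but (a AND c) OR (b AND c)) and run a query like this:

a OR b OR c OR (some AND clause, such as d AND e):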

SELECT DISTINCT topic_id FROM (
  SELECT topic_id FROM tags_table where tag = 'a'
  UNION ALL
  SELECT topic_id FROM tags_table where tag = 'b'
  UNION ALL
  SELECT topic_id FROM tags_table where tag = 'c'
  UNION ALL
  query_like_the_previous_one_representing_some_AND_clause
) as union_table
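
Once the expression is flattened so that no OR sits inside parentheses, the query can be assembled mechanically. A minimal sketch under stated assumptions: `dnf_query` is a hypothetical helper, the expression is passed as an array of AND groups, and each group is expressed with a GROUP BY/HAVING count, swapped in for the self-joins above to keep the example short. Plain UNION already removes duplicate ids, so no outer DISTINCT is needed.

```ruby
# Hypothetical helper. dnf is an array of AND groups, for example
# [["a"], ["b"], ["d", "e"]] for: a OR b OR (d AND e).
# Returns the SQL text with "?" placeholders plus the values to bind.
def dnf_query(dnf, table = "tags_table")
  binds = []
  parts = dnf.map do |group|
    group = group.uniq
    binds.concat(group)
    marks = (["?"] * group.size).join(", ")
    # A topic satisfies the group when it carries every tag in it.
    "SELECT topic_id FROM #{table} WHERE tag IN (#{marks}) " \
      "GROUP BY topic_id HAVING COUNT(DISTINCT tag) = #{group.size}"
  end
  [parts.join(" UNION "), binds]
end

sql, binds = dnf_query([["a"], ["b"], ["d", "e"]])
puts sql
puts binds.inspect
```

The rewrite from (a OR b) AND c to (a AND c) OR (b AND c) still has to happen before this step; the helper only handles the already flattened form.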

In database software other than MySQL, you could probably use a query like the following (I cannot test it right now):

SELECT COALESCE(t1.topic_id, t2.topic_id, t3.topic_id, ...)
FROM tags_table AS t1
INNER JOIN tags_table AS t2
ON t2.topic_id = t1.topic_id AND t2.tag = 'b'
FULL OUTER JOIN tags_table AS t3
ON t3.topic_id = t1.topic_id AND t3.tag = 'c'
INNER JOIN tags_table AS t4
ON t4.topic_id = t1.topic_id AND t4.tag = 'd'
WHERE t1.tag = 'a'

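I think this should represent (a AND b) OR (c AND d). Note the COALESCE: because of the full outer join, t1.topic_id may be null.

Answer 3 (score: 0)

Here is a Rails solution that builds a self-referencing join for the AND case and a simple SQL IN clause for the OR case. The solution assumes a model named TopicTag and therefore a table named topic_tags.

The class method search takes 2 arguments: an array of tags and a string containing either "and" or "or".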

class TopicTag < ActiveRecord::Base

  def self.search(tags, andor)

    # Ensure tags are unique or you will get duplicate table names in the SQL
    tags.uniq!

    if andor.downcase == "and"
      first = true
      sql = ""

      tags.each do |tag|
        if first
          sql = "SELECT DISTINCT topic_tags.topic_id FROM topic_tags "
          first = false
        else
          sql += " JOIN topic_tags as tag_#{tag} ON tag_#{tag}.topic_id = \
                   topic_tags.topic_id AND tag_#{tag}.tag = '#{tag}'"
        end
      end
      sql += " WHERE topic_tags.tag = '#{tags[0]}'"
      TopicTag.find_by_sql(sql)

    else
      TopicTag.find(:all, :select => 'DISTINCT topic_id', 
          :conditions => { :tag => tags})
    end
  end

end
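
The AND branch above interpolates each tag into both a table alias and a string literal, so a tag containing a space or a quote would break the SQL. A hedged alternative sketch, not the author's code: `and_join_sql` is a hypothetical helper that numbers the aliases (`t1`, `t2`, ...) and leaves `?` placeholders for the tag values.

```ruby
# Hypothetical helper: builds the self-join query with numbered aliases
# and "?" placeholders. Returns [sql, binds] with binds in placeholder order.
def and_join_sql(tags)
  tags = tags.uniq # non-mutating, so the caller's array is left alone
  raise ArgumentError, "need at least one tag" if tags.empty?

  # One join per tag after the first; the first tag is matched in WHERE.
  joins = (1...tags.size).map do |i|
    " JOIN topic_tags AS t#{i} ON t#{i}.topic_id = topic_tags.topic_id AND t#{i}.tag = ?"
  end
  sql = "SELECT DISTINCT topic_tags.topic_id FROM topic_tags" +
        joins.join + " WHERE topic_tags.tag = ?"
  # The join placeholders appear before the WHERE placeholder in the SQL text.
  [sql, tags.drop(1) + tags.take(1)]
end

sql, binds = and_join_sql(["chess", "table tennis"])
puts sql
puts binds.inspect
```

In a Rails model the pair could then be handed over as `TopicTag.find_by_sql([sql, *binds])`, which lets ActiveRecord quote the values.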

To get more test coverage, the data was extended with extra records for chess. The database was seeded with the following code
[1,2].each   {|i| TopicTag.create(:topic_id => i, :tag => 'football')}
[1,3].each   {|i| TopicTag.create(:topic_id => i, :tag => 'cricket')}
[2,3,4].each {|i| TopicTag.create(:topic_id => i, :tag => 'basketball')}
[4,5].each   {|i| TopicTag.create(:topic_id => i, :tag => 'chess')}

The following test code produced the results shown

tests = [
  %w[football cricket],
  %w[chess],
  %w[chess cricket basketball]
]

tests.each do |test|
  %w[and or].each do |op|
    puts test.join(" #{op} ") + " = " + 
      (TopicTag.search(test, op).map(&:topic_id)).join(', ')
  end
end
football and cricket = 1
football or cricket = 1, 2, 3
chess = 4, 5
chess = 4, 5
chess and cricket and basketball = 
chess or cricket or basketball = 1, 2, 3, 4, 5

Tested on Rails 2.3.8 using SQLite.

Edit

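If you wish to use LIKE, then the OR case also becomes slightly more complicated. You should also be aware that using LIKE with a leading '%' can have a significant performance impact if the table you are searching is of non-trivial size.

The following version of the model uses LIKE in both cases.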

class TopicTag < ActiveRecord::Base

  def self.search(tags, andor)

    tags.uniq!

    if andor.downcase == "and"
      first = true
      first_name = ""
      sql = ""

      tags.each do |tag|
        if first
          sql = "SELECT DISTINCT topic_tags.topic_id FROM topic_tags "
          first = false
        else
          sql += " JOIN topic_tags as tag_#{tag} ON tag_#{tag}.topic_id = \    
                  topic_tags.topic_id AND tag_#{tag}.tag like '%#{tag}%'"
        end
      end
      sql += " WHERE topic_tags.tag like '%#{tags[0]}%'"
      TopicTag.find_by_sql(sql)

    else
      first = true
      tag_sql = ""
      tags.each do |tag| 
        if first
          tag_sql = " tag like '%#{tag}%'" 
          first = false
        else
          tag_sql += " OR tag like '%#{tag}%'" 
        end
      end
      TopicTag.find(:all, :select => 'DISTINCT topic_id', 
            :conditions => tag_sql)
    end
  end

end
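
The OR branch above can also be written with placeholders instead of interpolating each tag into the SQL. This is a minimal sketch rather than the author's code: `like_or_conditions` is a hypothetical helper that returns the array form accepted by the `:conditions` option.

```ruby
# Hypothetical helper: builds "tag LIKE ? OR tag LIKE ? ..." followed by
# one "%tag%" bind value per search term.
def like_or_conditions(tags)
  tags = tags.uniq
  raise ArgumentError, "need at least one tag" if tags.empty?

  clause = (["tag LIKE ?"] * tags.size).join(" OR ")
  [clause, *tags.map { |tag| "%#{tag}%" }]
end

puts like_or_conditions(%w[chess ll]).inspect
```

The result could be passed as `:conditions => like_or_conditions(tags)` in the `find` call. Any `%` or `_` typed by the user still acts as a LIKE wildcard here; escaping those is left out of the sketch.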

tests = [
  %w[football cricket],
  %w[chess],
  %w[chess cricket basketball],
  %w[chess ll],
  %w[ll]
]

tests.each do |test|
  %w[and or].each do |op|
    result = TopicTag.search(test, op).map(&:topic_id)
    puts ( test.size == 1 ? "#{test}(#{op})" : test.join(" #{op} ") ) + 
         " = " + result.join(', ')
  end
end
football and cricket = 1
football or cricket = 1, 2, 3
chess(and) = 4, 5
chess(or) = 4, 5
chess and cricket and basketball = 
chess or cricket or basketball = 1, 2, 3, 4, 5
chess and ll = 4
chess or ll = 1, 2, 3, 4, 5
ll(and) = 1, 2, 3, 4
ll(or) = 1, 2, 3, 4