Thinking Sphinx - complex queries with intersections and subqueries

Time: 2014-07-22 09:59:54

Tags: sql ruby-on-rails ruby thinking-sphinx

I have models like these:

class Machine < ActiveRecord::Base
  has_many :property_values, dependent: :destroy
end

class PropertyValue < ActiveRecord::Base
  belongs_to :property, touch: true
  belongs_to :machine, touch: true
end

The schema of the PropertyValue model:

  create_table "property_values", force: true do |t|
    t.integer  "property_id"
    t.integer  "machine_id"
    t.datetime "created_at",  null: false
    t.datetime "updated_at",  null: false
    t.float    "value_int"
    t.string   "value_str"
  end

In my Machine model, I have this method:

class Machine < ActiveRecord::Base
  def filter_search(params, order_query = 'machines.updated_at desc')
    if params[:properties].present?
      query = []
      params[:properties].each_pair do |p|
        property_id = p.first.split('_').last
        p.last.each do |type, value|
          if value.present?
            select_join_query = "(SELECT machine_id FROM property_values AS pv LEFT OUTER JOIN machines ON machines.id = pv.machine_id
                    WHERE pv.property_id = #{property_id}"
            if type == 'str'
              query << select_join_query + " AND pv.value_str ILIKE '%#{value.mb_chars}%')"
            elsif type == 'gt' || type == 'lt'
              query << select_join_query + " AND pv.value_int #{type == 'gt' ? '>=' : '<='} #{value})"
            elsif type == 'float'
              floated_val = value.to_f # because input value may be integer.
              query << select_join_query + " AND pv.value_int = #{floated_val})"
            end
          end
        end
      end

      machines = machines.where("machines.id IN (#{query.join('intersect')})").references(:property_value) if query.present?
    end
  end
end

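The idea of this method is to run a separate query for each property condition and then find the machines that satisfy all of those conditions by intersecting the results. So, if I have input properties like this: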

{"properties"=>{"property_7"=>{"str"=>"some data"}, "property_6"=>{"gt"=>"1", "lt"=>"1000"}, "property_5"=>{"gt"=>"3", "lt"=>"800"}}}

then it generates a query like this:

(machines.id IN ((SELECT machine_id FROM property_values AS pv LEFT OUTER JOIN machines ON machines.id = pv.machine_id
                        WHERE pv.property_id = 7 AND pv.value_str ILIKE '%some data%')intersect(SELECT machine_id FROM property_values AS pv LEFT OUTER JOIN machines ON machines.id = pv.machine_id
                        WHERE pv.property_id = 6 AND pv.value_int >= 1)intersect(SELECT machine_id FROM property_values AS pv LEFT OUTER JOIN machines ON machines.id = pv.machine_id
                        WHERE pv.property_id = 6 AND pv.value_int <= 1000)intersect(SELECT machine_id FROM property_values AS pv LEFT OUTER JOIN machines ON machines.id = pv.machine_id
                        WHERE pv.property_id = 5 AND pv.value_int >= 3)intersect(SELECT machine_id FROM property_values AS pv LEFT OUTER JOIN machines ON machines.id = pv.machine_id
                        WHERE pv.property_id = 5 AND pv.value_int <= 800)))
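For readers who want to try the query-building step outside of Rails, here is a minimal standalone sketch. The helper name `intersect_sql` and the `OPERATORS` table are made up for illustration and are not part of the code above. The sketch also makes two assumptions: it drops the LEFT OUTER JOIN onto machines, since the subquery only reads machine_id from property_values, and it coerces ids and numbers with `Integer()` and `Float()` so that malformed input raises instead of ending up in the SQL. The string escaping is deliberately naive; in a real application Rails' own quoting helpers would be the safer choice.

```ruby
# Hypothetical helper (not from the original post): builds the INTERSECT
# subqueries from the properties hash, coercing ids and numeric values.
OPERATORS = { 'gt' => '>=', 'lt' => '<=', 'float' => '=' }.freeze

def intersect_sql(properties)
  subqueries = []
  properties.each do |key, conditions|
    property_id = Integer(key.split('_').last) # raises ArgumentError on a non-numeric id
    conditions.each do |type, value|
      next if value.to_s.empty?
      base = "SELECT machine_id FROM property_values AS pv WHERE pv.property_id = #{property_id}"
      if type == 'str'
        escaped = value.to_s.gsub("'", "''") # naive quoting, for illustration only
        subqueries << "(#{base} AND pv.value_str ILIKE '%#{escaped}%')"
      elsif OPERATORS.key?(type)
        subqueries << "(#{base} AND pv.value_int #{OPERATORS[type]} #{Float(value)})"
      end
    end
  end
  subqueries.join(' INTERSECT ')
end

params = {
  'property_7' => { 'str' => 'some data' },
  'property_6' => { 'gt' => '1', 'lt' => '1000' }
}
puts intersect_sql(params)
```

The resulting string could then be placed inside the `machines.id IN (...)` condition in the same way as the hand-built query.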

Is it possible to somehow write an equivalent with Thinking Sphinx that would perform such a complex query? I have no idea how to do it.

1 Answer:

Answer 0: (score: 0)

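You're going to need to split this into two parts with Sphinx as well. The first part is a search on PropertyValue, because within a Sphinx index for Machine the property IDs and the values cannot be linked together (Sphinx has no hash data type).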

In the search on PropertyValue, you'll need to write a custom SELECT clause...

value_ids = PropertyValue.search_for_ids(
  :select => "*, ((property_id = 6 AND value_int > 1) OR (...)) AS matching",
  :with => {:matching => true}
)
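As a rough sketch of how the expression for the `:select` option might be assembled from the question's properties hash, the plain-Ruby helper below builds one branch per property and joins the branches with OR. The names `matching_clause` and `SPHINX_OPERATORS` are hypothetical. The sketch assumes that `property_id` and `value_int` are available as attributes in the PropertyValue index, that `gt` and `lt` are inclusive as in the question's SQL, and that `str` filters are handled separately (for example through the full-text query), so they are skipped here.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): builds the boolean
# expression used in the custom SELECT clause from the numeric filters.
SPHINX_OPERATORS = { 'gt' => '>=', 'lt' => '<=', 'float' => '=' }.freeze

def matching_clause(properties)
  branches = properties.map do |key, conditions|
    property_id = Integer(key.split('_').last) # raises on a non-numeric id
    tests = conditions.map do |type, value|
      operator = SPHINX_OPERATORS[type]
      next if operator.nil? || value.to_s.empty? # skips 'str' and blank values
      "value_int #{operator} #{Float(value)}"
    end.compact
    next if tests.empty?
    "(property_id = #{property_id} AND #{tests.join(' AND ')})"
  end.compact
  return nil if branches.empty?
  "(#{branches.join(' OR ')})"
end

clause = matching_clause({
  'property_6' => { 'gt' => '1', 'lt' => '1000' },
  'property_5' => { 'gt' => '3', 'lt' => '800' }
})
puts "*, #{clause} AS matching"
```

Note that this puts both bounds for a property on the same value, which is slightly stricter than intersecting two separate subqueries when a machine has several values for one property.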

You can then use those value_ids to query for the machines you want. If a machine needs to have all of the values, you can use the :with_all option:

Machine.search :with_all => {:property_value_ids => value_ids}

This ensures that machines matching one (or more) but not all of the IDs are not returned.

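This may need some serious tweaking, as I don't quite understand the intersect thing you're doing, but hopefully it's enough to get you on the right path.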
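If the matched values turn out to belong to several machines, one possible way to mimic the question's intersection is to do the last step in plain Ruby: group the matched values by machine and keep only the machines that have a match for every requested property. The sketch below is only an illustration under that assumption. `MatchedValue`, `machines_matching_all`, and the sample rows are all made up, and in practice the rows would come from the PropertyValue search results.

```ruby
# Hypothetical post-processing step. Each row stands for one matched
# PropertyValue, reduced to the two columns that matter here.
MatchedValue = Struct.new(:machine_id, :property_id)

def machines_matching_all(rows, required_property_ids)
  rows.group_by(&:machine_id).select do |_machine_id, values|
    # keep the machine only if no required property is left unmatched
    (required_property_ids - values.map(&:property_id)).empty?
  end.keys
end

rows = [
  MatchedValue.new(1, 6), MatchedValue.new(1, 5), # machine 1 matches both properties
  MatchedValue.new(2, 6)                          # machine 2 matches only property 6
]
p machines_matching_all(rows, [5, 6]) # => [1]
```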