Elasticsearch Ruby ActiveRecord persistence model: term search on a URL

Time: 2015-04-18 11:05:23

Tags: ruby ruby-on-rails-4 elasticsearch elasticsearch-rails

I am trying to run an Elasticsearch term query against a field that contains a URL. I am using the elasticsearch-rails ActiveRecord Persistence Pattern. This is how I tried to do it.

total_views = UserAction.search :query=> {
    :filtered=> {
        :filter=> {
            :term=> { action_path: "http://0.0.0.0:3000/tshirt/test" }
        }
    }
}

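The search works when the value contains no '/' or ':' characters, for example when action_path is just 'tshirt'. The other not_analyzed fields also work as long as their values contain no characters such as '/' or ':'. So Elasticsearch is clearly analyzing the field, but it should not be, because the mapping is already in place.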
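The symptom described above can be reproduced without a cluster. The sketch below is a toy model, not Elasticsearch itself: `crude_analyze` is a hypothetical, deliberately crude stand-in for an analyzer (the real standard analyzer tokenizes differently in detail), and the inverted index is a plain Hash. It only illustrates why an exact term lookup finds 'tshirt' but not a full URL once the field has been analyzed.

```ruby
# Crude stand-in for an analyzer (assumption, not the real one):
# lowercase the text, then split on every non-alphanumeric character.
def crude_analyze(text)
  text.downcase.split(/[^a-z0-9]+/).reject(&:empty?)
end

# Toy inverted index: token => list of document ids.
# With analyzed: false the whole value is stored as a single token,
# which is roughly what not_analyzed asks for.
def build_index(docs, analyzed:)
  index = Hash.new { |hash, key| hash[key] = [] }
  docs.each do |id, value|
    tokens = analyzed ? crude_analyze(value) : [value]
    tokens.each { |token| index[token] << id }
  end
  index
end

# A term lookup compares the query string with the stored tokens verbatim;
# the query string itself is never analyzed.
def term_lookup(index, term)
  index.fetch(term, [])
end

docs = {
  1 => "http://0.0.0.0:3000/tshirt/test",
  2 => "tshirt"
}

analyzed_index = build_index(docs, analyzed: true)
exact_index    = build_index(docs, analyzed: false)

p term_lookup(analyzed_index, "http://0.0.0.0:3000/tshirt/test")
p term_lookup(analyzed_index, "tshirt")
p term_lookup(exact_index, "http://0.0.0.0:3000/tshirt/test")
```

In this toy model the first lookup comes back empty because the URL was broken into pieces at index time, while the same lookup against the unanalyzed index finds document 1.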

This is my UserAction class

class UserAction
  include Elasticsearch::Persistence::Model
  extend Calculations
  include Styles

  attribute :user_id, Integer
  attribute :user_referrer, String, mapping: { index: 'not_analyzed' }
  attribute :user_ip, String, mapping: { index: 'not_analyzed' }
  attribute :user_country, String, mapping: { index: 'not_analyzed' }
  attribute :user_city, String, mapping: { index: 'not_analyzed' }
  attribute :user_device, String, mapping: { index: 'not_analyzed' }
  attribute :user_agent, String, mapping: { index: 'not_analyzed' }
  attribute :user_platform
  attribute :user_visitid, Integer
  attribute :action_type, String, mapping: { index: 'not_analyzed' }
  attribute :action_css, String, mapping: { index: 'not_analyzed' }
  attribute :action_text, String, mapping: { index: 'not_analyzed' }
  attribute :action_path, String, mapping: { index: 'not_analyzed' }
  attribute :share_url, String, mapping: { index: 'not_analyzed' }
  attribute :tag
  attribute :date
end

I also tried using 'mapping' to add the index and then 'create_index!', but the result is the same. It does create the mapping, because the mapping exists.

This is my Gemfile

gem "elasticsearch-model", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/model"
gem "elasticsearch-persistence", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/persistence/model"
gem "elasticsearch-rails"

When I run the search, I can also see that those fields are not analyzed.

       :reload_on_failure=>false,
         :randomize_hosts=>false,
         :transport_options=>{}},
       @protocol="http",
       @reload_after=10000,
       @resurrect_after=60,
       @serializer=
        #<Elasticsearch::Transport::Transport::Serializer::MultiJson:0x007fc4bf9e0e18
         @transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
       @sniffer=
        #<Elasticsearch::Transport::Transport::Sniffer:0x007fc4bf9e0dc8
         @timeout=1,
         @transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
       @tracer=nil>>,
   @document_type="user_action",
   @index_name="useraction",
   @klass=UserAction,
   @mapping=
    #<Elasticsearch::Model::Indexing::Mappings:0x007fc4bfab18d8
     @mapping=
      {:created_at=>{:type=>"date"},
       :updated_at=>{:type=>"date"},
       :user_id=>{:type=>"integer"},
       :user_referrer=>{:type=>"string"},
       :user_ip=>{:type=>"string"},
       :user_country=>{:type=>"string", :index=>"not_analyzed"},
       :user_city=>{:type=>"string", :index=>"not_analyzed"},
       :user_device=>{:type=>"string", :index=>"not_analyzed"},
       :user_agent=>{:type=>"string", :index=>"not_analyzed"},
       :user_platform=>{:type=>"string"},
       :user_visitid=>{:type=>"integer"},
       :action_type=>{:type=>"string", :index=>"not_analyzed"},
       :action_css=>{:type=>"string", :index=>"not_analyzed"},
       :action_text=>{:type=>"string", :index=>"not_analyzed"},
       :action_path=>{:type=>"string", :index=>"not_analyzed"}},
     @options={},
     @type="user_action">,
   @options={:host=>UserAction}>,
 @response={"took"=>1, "timed_out"=>false, "_shards"=>{"total"=>4, "successful"=>4, "failed"=>0}, "hits"=>{"total"=>0, "max_score"=>nil, "hits"=>[]}}>

The initializer file only contains the elastichq connection URL.

The data is in Elasticsearch, so I should get results, but I get none.

    user_action 1   AUzH9xKDueQ8OtBQuyQC    http://example.org/api/analytics/track
user_actions    user_action 1   AUzIAUsvueQ8OtBQuyQg    http://0.0.0.0:3000/tshirt/funnel_test2
user_actions    user_action 1   AUzH7ay5ueQ8OtBQuyP2    http://example.org/api/analytics/track
user_actions    user_action 1   AUzH-HAdueQ8OtBQuyQU    http://0.0.0.0:3000/tshirt/test
user_actions    user_action 1   AUzIJbCGueQ8OtBQuyQ4    http://example.org/api/analytics/track
user_actions    user_action 1   AUzIJbCjueQ8OtBQuyQ5    http://example.org/api/analytics/track

curl result from elastichq

curl -XGET "https://YYYYY:XXXXX@xxxx.qbox.io/user_actions/_mapping"
{
  "user_actions": {
    "mappings": {
      "user_action": {
        "properties": {
          "action_css": { "type": "string" },
          "action_path": { "type": "string" },
          "action_text": { "type": "string" },
          "action_type": { "type": "string" },
          "created_at": { "format": "dateOptionalTime", "type": "date" },
          "date": { "type": "string" },
          "share_url": { "type": "string" },
          "tag": { "type": "string" },
          "updated_at": { "format": "dateOptionalTime", "type": "date" },
          "user_agent": { "type": "string" },
          "user_city": { "type": "string" },
          "user_country": { "type": "string" },
          "user_device": { "type": "string" },
          "user_id": { "type": "long" },
          "user_ip": { "type": "string" },
          "user_referrer": { "type": "string" },
          "user_visitid": { "type": "long" }
        }
      }
    }
  }
}

Can someone help me get the URL search working?

4 answers:

Answer 0: (score: 2)

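Judging by the final Elasticsearch curl output, your fields appear to be analyzed (there is no not_analyzed flag). Maybe try rebuilding the index with the mapping you want.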
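The check this answer performs by eye can be sketched in Ruby. The helper name `analyzed_string_fields` is hypothetical, and the JSON below is an abridged copy of the `_mapping` output from the question; the sketch assumes the Elasticsearch 1.x response layout shown there and only inspects the parsed response, it does not talk to a cluster.

```ruby
require "json"

# Returns the names of string fields that lack "index": "not_analyzed"
# in a parsed _mapping response (layout as shown in the question).
def analyzed_string_fields(mapping, index, type)
  properties = mapping.fetch(index).fetch("mappings").fetch(type).fetch("properties")
  properties
    .select { |_name, opts| opts["type"] == "string" && opts["index"] != "not_analyzed" }
    .keys
    .sort
end

# Abridged copy of the curl output from the question.
response = JSON.parse(<<~BODY)
  {
    "user_actions": {
      "mappings": {
        "user_action": {
          "properties": {
            "action_path": { "type": "string" },
            "share_url": { "type": "string" },
            "user_id": { "type": "long" }
          }
        }
      }
    }
  }
BODY

p analyzed_string_fields(response, "user_actions", "user_action")
```

Any field name this returns is still analyzed on the server, whatever the model class declares locally.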

Answer 1: (score: 1)

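I did what I did not want to do: I created the index and its mapping manually with the POST request below, so that elasticsearch-rails cannot create it incorrectly. Now everything works.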

curl -XPOST https://xxxxxx.qbox.io/user_actions -d '{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "user_action" : {
            "_source" : { "enabled" : false },
            "properties" : {
                "action_path" : { "type" : "string", "index" : "not_analyzed" }
            }
        }
    }
}'
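For more than one field, the request body above can be generated rather than typed by hand. This is a minimal sketch under a few assumptions: `index_body` is a hypothetical helper, the field list is only an example, and the sketch leaves `_source` at its default instead of disabling it as the curl call does. It only builds and prints the JSON; sending it to the cluster is left out.

```ruby
require "json"

# Builds an index-creation body that maps each listed field as an
# unanalyzed string (Elasticsearch 1.x mapping syntax, as in the curl call).
def index_body(type, fields, shards: 1)
  properties = fields.each_with_object({}) do |field, props|
    props[field] = { "type" => "string", "index" => "not_analyzed" }
  end

  {
    "settings" => { "number_of_shards" => shards },
    "mappings" => { type => { "properties" => properties } }
  }
end

# Example field list (assumption): extend it with any other exact-match fields.
body = index_body("user_action", %w[action_path share_url user_referrer])
puts JSON.pretty_generate(body)
```

The printed JSON can be passed to the same POST request in place of the hand-written body.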

Answer 2: (score: 0)

As a rule of thumb, if you want to search on something, you should not leave it not_analyzed.

In this case in particular, you should definitely try the Keyword Analyzer by setting the relevant field to keyword.

As long as you search for the complete string, i.e. "http://0.0.0.0:3000/tshirt/test", there is a good chance the Keyword Analyzer will do the job.

Answer 3: (score: 0)

Try a query on the raw field:

total_views = UserAction.search :query=> {
    :filtered=> {
        :filter=> {
            :term=> { "action_path.raw" => "http://0.0.0.0:3000/tshirt/test" } 
        }
    }
}
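A query on "action_path.raw" can only match if the mapping defines a raw sub-field, and the mapping shown in the question does not. The sketch below assumes such a multi-field mapping in Elasticsearch 1.x syntax; both helper names are hypothetical, and the code only builds the hashes without contacting a cluster.

```ruby
# Hypothetical multi-field mapping for one attribute: the main field stays
# analyzed, while the "raw" sub-field keeps the value as one unanalyzed term.
def multi_field_mapping
  {
    type: "string",
    fields: {
      raw: { type: "string", index: "not_analyzed" }
    }
  }
end

# Builds the filtered term query from this answer for any field
# that has a "raw" sub-field.
def raw_term_query(field, value)
  {
    query: {
      filtered: {
        filter: {
          term: { "#{field}.raw" => value }
        }
      }
    }
  }
end

p multi_field_mapping
p raw_term_query("action_path", "http://0.0.0.0:3000/tshirt/test")
```

Documents indexed before the sub-field was added would need to be reindexed before the raw query can find them.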