Vega visualization for Kibana - aggregating and accessing document fields

Date: 2018-04-04 12:58:56

Tags: javascript data-visualization kibana vega vega-lite

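I am new to both Vega and Kibana. I am trying to create a scatter plot that shows hashtags and their average polarity, but I am stuck on two things: first the average-polarity aggregation, and then accessing the hashtag text field from the documents.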

The code I tried for getting the average polarity (for now it is just plotted against a time scale):

{
  $schema: https://vega.github.io/schema/vega-lite/v2.json
  data: {
    # URL object is a context-aware query to Elasticsearch
    url: {
      # The %-enclosed keys are handled by Kibana to modify the query
      # before it gets sent to Elasticsearch. Context is the search
      # filter as shown above the dashboard. Timefield uses the value 
      # of the time picker from the upper right corner.
      %context%: true
      %timefield%: timestamp
      index: tw
      body: {
        size: 10000
        _source: ["timestamp", "user_lang", "country", "polarity", "lang", "sentiment"]
      }
    }
    # We only need the content of hits.hits array
    format: {property: "hits.hits"}
  }
  # Parse timestamp into a javascript date value
  transform: [
    {calculate: "toDate(datum._source['timestamp'])", as: "time"}
  ]
  # Draw a line, with x being the time field and y the mean polarity
  mark: line
  encoding: {
    x: {field: "time", type: "temporal"}
    y: {aggregate: "mean", field: "_source.polarity", type: "quantitative"}
  }
}

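This gives me the error "Cannot read property 'polarity' of undefined". Once I remove the aggregate it works, but I want to show the mean rather than every data point.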

Also, I don't know how to access the hashtag text field because it is nested. I tried _source.hashtags.text, but that did not work:

Sample document:

{
        "_index": "tw",
        "_type": "tweet",
        "_id": "_HHWSGIBbYt8wc5TlB8B",
        "_score": 1,
        "_source": {
          "lang": "en",
          "favorited": false,
          "sentiment": "positive",
          "user_lang": "en",
          "user_screenname": "BrideWiltshire",
          "timestamp": "2018-03-21T13:54:04.928556",
          "user_follow_count": 147,
          "hashtags": [
            {
              "indices": [
                8,
                12
              ],
              "text": "WIN"
            }
          ],
          "user_stat_count": 3377,
          "user_fav_count": 11,
          "coordinates": null,
          "source": """<a href="https://panel.socialpilot.co/" rel="nofollow">SocialPilot.co</a>""",
          "subjectivity": 0.3333333333333333,
          "user_friends_count": 62,
          "polarity": 0.5333333333333333,
          "text": "Want to #WIN ‘His and Hers’ luggage labels from @DavidHampton, worth more than £100? Enter our competition now",
          "message": "Want to #WIN ‘His and Hers’ luggage labels from @DavidHampton, worth more than £100? Enter our competition now",
          "country": null,
          "user_name": "Wiltshire Bride",
          "favorite_count": 0
        }
      },

Mapping:

{
  "tw": {
    "mappings": {
      "tweet": {
        "properties": {
          "coordinates": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country": {
            "type": "keyword"
          },
          "favorite_count": {
            "type": "long"
          },
          "favorited": {
            "type": "boolean"
          },
          "hashtags": {
            "properties": {
              "indices": {
                "type": "long"
              },
              "text": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "lang": {
            "type": "text"
          },
          "location": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "message": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "polarity": {
            "type": "float"
          },
          "sentiment": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "source": {
            "type": "text"
          },
          "subjectivity": {
            "type": "float"
          },
          "text": {
            "type": "text"
          },
          "time_zone": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "timestamp": {
            "type": "date"
          },
          "user": {
            "properties": {
              "favourites_count": {
                "type": "long"
              },
              "followers_count": {
                "type": "long"
              },
              "friends_count": {
                "type": "long"
              },
              "lang": {
                "type": "text"
              },
              "name": {
                "type": "text"
              },
              "screen_name": {
                "type": "text"
              },
              "statuses_count": {
                "type": "long"
              }
            }
          },
          "user_fav_count": {
            "type": "long"
          },
          "user_follow_count": {
            "type": "long"
          },
          "user_friends_count": {
            "type": "long"
          },
          "user_lang": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "user_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "user_screenname": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "user_stat_count": {
            "type": "long"
          }
        }
      }
    }
  }
}

1 Answer:

Answer 0 (score: 2)

If your hashtags field is of the nested type and hashtags.text is a keyword field (or has a hashtags.text.keyword subfield), you can use the following spec for the scatter plot:

{
  $schema: https://vega.github.io/schema/vega-lite/v2.json
  title: hashtags vs avg_polarity
  data: {
    url: {
      index: tw
      body: {
        size: 0
        query: {
          match_all: {}
        }
        aggs: {
          HashTags: {
            nested: {path: "hashtags"}
            aggs: {
              HashTags_Text: {
                terms: {field: "hashtags.text"}
                aggs: {
                  Tweet_Polarity: {
                    reverse_nested: {}
                    aggs: {
                      Tweet_Polarity_avg: {
                        avg: {field: "polarity"}
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    format: {property: "aggregations.HashTags.HashTags_Text.buckets"}
  }
  mark: {type: "point"}
  encoding: {
    x: {
      field: key
      type: nominal
      axis: {title: "HashTags"}
    }
    y: {
      field: Tweet_Polarity.Tweet_Polarity_avg.value
      type: quantitative
      axis: {title: "polarity"}
    }
  }
}
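
To see why the spec reads `aggregations.HashTags.HashTags_Text.buckets` and then `Tweet_Polarity.Tweet_Polarity_avg.value`, it can help to walk those two dotted paths by hand. The following is only a sketch in Python: the `response` dictionary is a made-up, trimmed example of what Elasticsearch might return for this query (the bucket keys and values are illustrative, not real results), and `get_path` is a hypothetical helper, not part of Vega or Kibana.

```python
from functools import reduce


def get_path(obj, dotted):
    """Follow a dotted path such as 'a.b.c' through nested dicts."""
    return reduce(lambda node, key: node[key], dotted.split("."), obj)


# Hypothetical response, trimmed to the parts the spec reads.
response = {
    "aggregations": {
        "HashTags": {
            "doc_count": 3,
            "HashTags_Text": {
                "buckets": [
                    {
                        "key": "WIN",
                        "doc_count": 2,
                        "Tweet_Polarity": {
                            "doc_count": 2,
                            "Tweet_Polarity_avg": {"value": 0.5},
                        },
                    },
                    {
                        "key": "news",
                        "doc_count": 1,
                        "Tweet_Polarity": {
                            "doc_count": 1,
                            "Tweet_Polarity_avg": {"value": -0.25},
                        },
                    },
                ]
            },
        }
    }
}

# format.property selects the list of buckets; each bucket becomes one datum.
buckets = get_path(response, "aggregations.HashTags.HashTags_Text.buckets")

# The x encoding reads `key`; the y encoding reads the nested average value.
points = [
    (bucket["key"], get_path(bucket, "Tweet_Polarity.Tweet_Polarity_avg.value"))
    for bucket in buckets
]
print(points)
```

Each tuple corresponds to one mark in the chart: the hashtag on the x axis and its average polarity on the y axis.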

A fun little plot. [image]

Edit:

Before you start adding documents, you have to create the index with the mapping below:

PUT /tw
{
        "mappings": {
            "tweet": {
                "properties": {
                    "favorite_count": {
                        "type": "long"
                    },
                    "favorited": {
                        "type": "boolean"
                    },
                    "hashtags": {
                        "type": "nested",
                        "properties": {
                            "indices": {
                                "type": "long"
                            },
                            "text": {
                                "type": "keyword"
                            }
                        }
                    },
                    "lang": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "message": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "polarity": {
                        "type": "float"
                    },
                    "sentiment": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "source": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "subjectivity": {
                        "type": "float"
                    },
                    "text": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "timestamp": {
                        "type": "date"
                    },
                    "user_fav_count": {
                        "type": "long"
                    },
                    "user_follow_count": {
                        "type": "long"
                    },
                    "user_friends_count": {
                        "type": "long"
                    },
                    "user_lang": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "user_name": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "user_screenname": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "user_stat_count": {
                        "type": "long"
                    }
                }
            }
        }
}
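
To sanity-check the chart against a handful of documents, the nested / terms / reverse_nested / avg aggregation from the spec can be approximated locally. This is a rough sketch in Python, not what Elasticsearch runs internally: `avg_polarity_by_hashtag` is a hypothetical helper, the three sample tweets are invented, and it assumes each tweet is counted once per distinct hashtag text, which is how I understand reverse_nested to behave.

```python
from collections import defaultdict


def avg_polarity_by_hashtag(docs):
    """Average the tweet-level polarity for every hashtag text."""
    totals = defaultdict(lambda: [0.0, 0])  # text -> [polarity sum, tweet count]
    for doc in docs:
        # Use a set so a tweet repeating the same hashtag is counted once,
        # mirroring the step back from hashtag objects to the parent tweet.
        for text in {tag["text"] for tag in doc.get("hashtags", [])}:
            totals[text][0] += doc["polarity"]
            totals[text][1] += 1
    return {text: total / count for text, (total, count) in totals.items()}


# Invented sample tweets, shaped like the _source of the example document.
tweets = [
    {"hashtags": [{"text": "WIN"}], "polarity": 0.5},
    {"hashtags": [{"text": "WIN"}, {"text": "news"}], "polarity": 0.0},
    {"hashtags": [], "polarity": -1.0},  # no hashtags, so it lands in no bucket
]

averages = avg_polarity_by_hashtag(tweets)
print(averages)
```

Tweets without hashtags contribute to no bucket, so they do not appear in the chart at all.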