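I use Elasticsearch with one index per day, and I want my Ruby on Rails application to query the documents of a given period by specifying the smallest, most precise list of indices.
I can't work out the code that builds this list of indices. Let me explain: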
Consider dates formatted as YYYY-MM-DD.
You can use the wildcard * at the end of a date string. For example, 2016-07-2* describes all dates from 2016-07-20 to 2016-07-29.
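To make the wildcard semantics concrete, here is a minimal sketch that lists the dates a pattern covers. It uses Ruby's built-in `File.fnmatch` as a stand-in for the trailing `*`; the helper name `dates_matching` is made up for illustration, and Elasticsearch resolves index names on its own side, so treat this only as an approximation.

```ruby
require 'date'

# Lists the dates between two bounds whose YYYY-MM-DD form matches the pattern.
# File.fnmatch is Ruby's built-in shell-style glob matcher.
# dates_matching is a hypothetical helper, not part of any library.
def dates_matching(pattern, from_str, to_str)
  (Date.parse(from_str)..Date.parse(to_str))
    .map(&:iso8601)
    .select { |day| File.fnmatch(pattern, day) }
end

july_twenties = dates_matching('2016-07-2*', '2016-07-01', '2016-07-31')
puts july_twenties.first # 2016-07-20
puts july_twenties.last  # 2016-07-29
```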
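Now consider a period represented by a start date and an end date.
The code must return the smallest possible array of date strings that represents the period.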
Let's take an example. For the following period:
2014-11-29
2016-10-13
the code must return an array containing the following strings:
2014-11-29
2014-11-30
2014-12-*
2015-*
2016-0*
2016-10-0*
2016-10-10
2016-10-11
2016-10-12
2016-10-13
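One way to check a candidate list against this requirement is sketched below. It assumes the trailing `*` behaves like a shell glob (via `File.fnmatch`) and only examines days from one year before the period to one year after it. The name `covers_exactly?` is hypothetical.

```ruby
require 'date'

# True when the patterns match every day of the period and no day outside it.
# Assumption: only days from one year before to one year after the period
# are examined, which is enough for patterns that start with a full year.
def covers_exactly?(patterns, start_date_str, end_date_str)
  first = Date.parse(start_date_str)
  last = Date.parse(end_date_str)
  window = Date.new(first.year - 1, 1, 1)..Date.new(last.year + 1, 12, 31)
  window.all? do |date|
    matched = patterns.any? { |pattern| File.fnmatch(pattern, date.iso8601) }
    matched == (first..last).cover?(date)
  end
end

expected = %w[2014-11-29 2014-11-30 2014-12-* 2015-* 2016-0* 2016-10-0*
              2016-10-10 2016-10-11 2016-10-12 2016-10-13]
puts covers_exactly?(expected, '2014-11-29', '2016-10-13') # true
```

This checks correctness only; it says nothing about whether the list is the shortest one.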
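Ideally (though I'll still take unoptimized code over none), the result should use the coarsest wildcard available:
["2016-09-*"]
is better than ["2016-09-0*", "2016-09-1*", "2016-09-2*", "2016-09-30"]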
Any ideas?
Answer 0 (score: 0)
OK, after some more thought and help from a colleague, I may have a solution. It's probably not fully optimized, but still...
# Relies on ActiveSupport (Date#end_of_year, Time.days_in_month), which Rails loads.
def get_indices_from_period(start_date_str, end_date_str)
  dates = {}
  dates_strings = []
  start_date = Date.parse(start_date_str)
  end_date = Date.parse(end_date_str)

  # Build a hash with the days of each month of each year in the period:
  # {YYYY => {MM => [DD1, DD2, DD3...]}}
  (start_date..end_date).each do |date|
    year, month, day = date.year, date.month, date.day
    dates[year] ||= {}
    dates[year][month] ||= []
    dates[year][month] << day
  end

  dates.each do |year, days_in_year|
    start_of_year = Date.new(year, 1, 1)
    max_number_of_days_in_year = (start_of_year.end_of_year - start_of_year).to_i + 1
    number_of_days_in_year = days_in_year.values.flatten.size
    if max_number_of_days_in_year == number_of_days_in_year
      # Return index formatted as YYYY-* if full year
      dates_strings << "#{year}-*"
    else
      days_in_year.each do |month, days_in_month|
        formatted_month = format('%02d', month)
        if Time.days_in_month(month, year) == days_in_month.size
          # Return index formatted as YYYY-MM-* if full month
          dates_strings << "#{year}-#{formatted_month}-*"
        else
          # Group the days by their tens digit (0 => 01..09, 1 => 10..19, ...)
          decades_in_month = {}
          days_in_month.each do |day|
            decade = day / 10
            decades_in_month[decade] ||= []
            decades_in_month[decade] << day
          end
          decades_in_month.each do |decade, days_in_decade|
            if (decade == 0 && days_in_decade.size == 9) ||
               ((decade == 1 || decade == 2) && days_in_decade.size == 10)
              # Return index formatted as YYYY-MM-D* if full decade
              dates_strings << "#{year}-#{formatted_month}-#{decade}*"
            else
              # Return index formatted as YYYY-MM-DD
              dates_strings += days_in_decade.collect { |day| "#{year}-#{formatted_month}-#{format('%02d', day)}" }
            end
          end
        end
      end
    end
  end
  dates_strings
end
Test call:
get_indices_from_period('2014-11-29', '2016-10-13')
=> ["2014-11-29", "2014-11-30", "2014-12-*", "2015-*", "2016-01-*", "2016-02-*", "2016-03-*", "2016-04-*", "2016-05-*", "2016-06-*", "2016-07-*", "2016-08-*", "2016-09-*", "2016-10-0*", "2016-10-10", "2016-10-11", "2016-10-12", "2016-10-13"]
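The result above lists 2016-01-* through 2016-09-* separately, whereas the question hoped for 2016-0*. As a possible alternative, here is a sketch that works on string prefixes instead of year/month/day buckets: it tries "YYYY-", then "YYYY-M", "YYYY-MM-" and "YYYY-MM-D", and emits a wildcard as soon as every calendar day sharing a prefix lies inside the period. It needs only the standard `date` library. The names `minimal_date_patterns`, `collapse_dates` and `PREFIX_LENGTHS` are made up for this sketch, it assumes four-digit years, and it is minimal only with respect to these four prefix levels.

```ruby
require 'date'

# Prefix lengths tried from coarsest to finest:
# 5 => "YYYY-", 6 => "YYYY-M", 8 => "YYYY-MM-", 9 => "YYYY-MM-D"
PREFIX_LENGTHS = [5, 6, 8, 9].freeze

# calendar_days: every existing day (as "YYYY-MM-DD") under the current prefix.
# wanted: a Range of ISO date strings; ISO dates sort correctly as plain strings.
def collapse_dates(calendar_days, wanted, level = 0)
  groups = calendar_days.group_by { |day| day[0, PREFIX_LENGTHS[level]] }
  groups.flat_map do |prefix, group|
    selected = group.select { |day| wanted.cover?(day) }
    if selected.empty?
      []                    # nothing from the period under this prefix
    elsif selected.size == 1
      selected              # a single day needs no wildcard
    elsif selected.size == group.size
      ["#{prefix}*"]        # every existing day under the prefix is wanted
    elsif level == PREFIX_LENGTHS.size - 1
      selected              # finest level reached: list the days one by one
    else
      collapse_dates(group, wanted, level + 1)
    end
  end
end

def minimal_date_patterns(start_date_str, end_date_str)
  first = Date.parse(start_date_str)
  last = Date.parse(end_date_str)
  # All calendar days of every year the period touches.
  calendar = (Date.new(first.year, 1, 1)..Date.new(last.year, 12, 31)).map(&:iso8601)
  collapse_dates(calendar, first.iso8601..last.iso8601)
end

p minimal_date_patterns('2014-11-29', '2016-10-13')
```

For the example period this should give the ten strings listed in the question. To query, each pattern can be prefixed with the index name and the results joined with commas, for example `patterns.map { |p| "logs-#{p}" }.join(',')`, where `logs-` is a hypothetical index prefix.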