I want to split a query string such as:
"(first_name:zach AND last_name:woods) OR (first_name:thomas AND last_name:middleditch) OR (first_name:martin AND last_name:starr) OR "...
into substrings of no more than 5000 characters each, splitting on the pattern " OR ".
Any help would be appreciated.
Answer 0 (score: 1)
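If your query looks like the example, you can split it on " OR " and then iterate over the pieces, joining them back together until you reach 5000 characters.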
original_query = "(first_name:zach AND last_name:woods) OR ..."
separator = " OR "
clauses = original_query.split(separator) # Split on the exact " OR " pattern
result = []
pattern = ""
clauses.each do |query|
  # What the current substring would look like with this clause appended
  candidate = pattern.empty? ? query : pattern + separator + query
  if candidate.length > 5000 && !pattern.empty? # Appending would exceed the limit
    result.push(pattern) # Store the current substring
    pattern = query # Start a new substring
  else
    pattern = candidate # Append the clause to the current substring
  end
end
result.push(pattern) unless pattern.empty? # Store the final substring
puts result
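You then get an array, result, of substrings that are each no more than 5000 characters long. However, if your string is (possibly) an SQL query, whether the substrings are syntactically valid depends on your original query.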
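For reuse, the same greedy idea can be wrapped in a method. This is only a sketch: split_query and its limit and separator parameters are names made up for illustration, and it assumes the clauses are separated by exactly " OR ". A single clause longer than the limit is kept whole rather than cut.

```ruby
# Hypothetical helper (not from the question): greedily packs clauses
# separated by `separator` into chunks of at most `limit` characters.
def split_query(query, limit: 5000, separator: " OR ")
  chunks = []
  current = ""
  query.split(separator).each do |clause|
    # Build the chunk as it would look with this clause appended
    candidate = current.empty? ? clause : current + separator + clause
    if candidate.length <= limit
      current = candidate
    else
      chunks << current unless current.empty?
      current = clause # an oversized single clause is kept as its own chunk
    end
  end
  chunks << current unless current.empty?
  chunks
end
```

Calling split_query(original_query) returns the array of chunks, and joining that array with " OR " gives back the clauses in their original order, which is a handy sanity check.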
Answer 1 (score: 0)
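It would be better to enforce these constraints when building the query itself.
If you still want to go with this approach, one way is to scan for the conditions and join them in groups of whatever size you prefer.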
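require 'active_support/core_ext/array/grouping' # in_groups_of comes from ActiveSupport, not core Ruby
# Scan all matching conditions (str is the original query string)
conditions = str.scan(/first_name:[a-z]+ AND last_name:[a-z]+/)
# Final queries array
result = []
# Iterate over the conditions in batches and build one query per batch
# Assuming about 35 characters per condition plus 4 for each ' OR ', roughly 128 conditions fit in 5000 characters
# Passing false as the second argument stops in_groups_of from padding the last group with nil
conditions.in_groups_of(128, false) { |group| result << group.join(' OR ') }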
The result array will contain the query split up by size.
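A fixed batch size based on an average length can overshoot the limit when names are longer than expected. As a rough sketch, the batch size can instead be derived from the longest scanned condition, so that every batch is guaranteed to fit. Here batch_conditions and its parameters are hypothetical names, the regex assumes each condition is wrapped in parentheses as in the question, and each_slice from core Ruby stands in for in_groups_of.

```ruby
# Hypothetical helper (not from the answer): batches scanned conditions
# using a worst-case batch size instead of an average one.
def batch_conditions(str, limit: 5000, separator: " OR ")
  # Assumes conditions look like "(first_name:x AND last_name:y)"
  conditions = str.scan(/\(first_name:[a-z]+ AND last_name:[a-z]+\)/)
  return [] if conditions.empty?

  longest = conditions.map(&:length).max
  # n conditions need at most n * longest + (n - 1) * separator.length
  # characters, so solve that inequality for n (and never go below 1).
  per_batch = [(limit + separator.length) / (longest + separator.length), 1].max

  conditions.each_slice(per_batch).map { |group| group.join(separator) }
end
```

This is more conservative than greedy packing, since it sizes every batch for the longest condition, but it keeps the simple fixed-size grouping of the approach above.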